Hidden Markov Model


The previous article did not introduce this model systematically; this article describes it in detail.

1. Definition:

The Hidden Markov Model is a finite set of states, each of which is associated with a (generally multidimensional) probability distribution. Transitions among the states are governed by a set of probabilities called transition probabilities. In a particular state an outcome or observation can be generated, according to the associated probability distribution. Only the outcome, not the state, is visible to an external observer; the states are therefore "hidden" from the outside, hence the name Hidden Markov Model.

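In short, a hidden Markov model consists of a finite number of states. Transitions between states are governed by a table of transition probabilities, and each state emits an observation according to its own probability distribution. Only the observations are visible; the underlying Markov chain of states is not, which is why the model is called hidden. The model is specified by the following elements: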

  • The number of states of the model, N .
  • The number of observation symbols in the alphabet, M . If the observations are continuous then M is infinite.
  • A set of state transition probabilities $A = \{a_{ij}\}$ .

    $$a_{ij} = p\{q_{t+1} = j \mid q_t = i\}, \quad 1 \le i, j \le N$$

    where $q_t$ denotes the current state.
    Transition probabilities should satisfy the normal stochastic constraints,

    $$a_{ij} \ge 0, \quad 1 \le i, j \le N$$

    and

    $$\sum_{j=1}^{N} a_{ij} = 1, \quad 1 \le i \le N$$

  • A probability distribution in each of the states, $B = \{b_j(k)\}$ .

    $$b_j(k) = p\{o_t = v_k \mid q_t = j\}, \quad 1 \le j \le N, \; 1 \le k \le M$$

    where $v_k$ denotes the $k^{th}$ observation symbol in the alphabet, and $o_t$ the current parameter vector.
    The following stochastic constraints must be satisfied.

    $$b_j(k) \ge 0, \quad 1 \le j \le N, \; 1 \le k \le M$$

    and

    $$\sum_{k=1}^{M} b_j(k) = 1, \quad 1 \le j \le N$$

    If the observations are continuous then we have to use a continuous probability density function instead of a set of discrete probabilities. In this case we specify the parameters of the probability density function. Usually the probability density is approximated by a weighted sum of M Gaussian distributions $\mathcal{N}$ (here M counts the mixture components, not the alphabet symbols),

    $$b_j(o_t) = \sum_{m=1}^{M} c_{jm} \, \mathcal{N}(\mu_{jm}, \Sigma_{jm}, o_t)$$

    where,

    $c_{jm}$ = weighting coefficients
    $\mu_{jm}$ = mean vectors
    $\Sigma_{jm}$ = covariance matrices

    $c_{jm}$ should satisfy the stochastic constraints,

    $$c_{jm} \ge 0, \quad 1 \le j \le N, \; 1 \le m \le M$$

    and

    $$\sum_{m=1}^{M} c_{jm} = 1, \quad 1 \le j \le N$$

  • The initial state distribution, $\pi = \{\pi_i\}$ ,
    where,

    $$\pi_i = p\{q_1 = i\}, \quad 1 \le i \le N$$
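The Gaussian mixture density described in the list above can be sketched in a few lines. This is a minimal illustration that assumes scalar (one-dimensional) observations, so each covariance matrix reduces to a single variance; the function names and the example weights, means, and variances are hypothetical and not part of the original text.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance, at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def mixture_density(x, weights, means, variances):
    """Emission density of one state: a weighted sum of Gaussian densities."""
    # The mixture weights must obey the same stochastic constraints as above.
    if any(c < 0 for c in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must be non-negative and sum to 1")
    return sum(c * gaussian_pdf(x, mu, var)
               for c, mu, var in zip(weights, means, variances))

# Hypothetical two-component mixture for a single state (assumed values).
weights = [0.3, 0.7]
means = [0.0, 2.0]
variances = [1.0, 0.5]
density_at_one = mixture_density(1.0, weights, means, variances)
```

Because the weights sum to one and each component integrates to one, the mixture is itself a valid density.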

Therefore we can use the compact notation

$$\lambda = (A, B, \pi)$$

to denote an HMM with discrete probability distributions, and

$$\lambda = (A, c_{jm}, \mu_{jm}, \Sigma_{jm}, \pi)$$

to denote one with continuous densities.
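As a concrete illustration of the discrete case, the sketch below stores a model as plain Python lists, checks the stochastic constraints, and draws a sample from it. The names `A`, `B`, and `pi` stand for the transition probabilities, the per-state observation probabilities, and the initial state distribution; the two-state, three-symbol numbers are made up for the example.

```python
import random

def is_stochastic(row, tol=1e-9):
    """True if every entry is non-negative and the row sums to 1."""
    return all(p >= 0 for p in row) and abs(sum(row) - 1.0) <= tol

def validate_hmm(A, B, pi):
    """Check the shapes and stochastic constraints of a discrete HMM."""
    n = len(pi)
    if len(A) != n or len(B) != n or any(len(row) != n for row in A):
        raise ValueError("A must be N x N and B must have N rows")
    if not (is_stochastic(pi) and all(map(is_stochastic, A))
            and all(map(is_stochastic, B))):
        raise ValueError("pi and every row of A and B must sum to 1")

def sample(A, B, pi, length, rng):
    """Draw a (states, observations) pair of the given length."""
    states, observations = [], []
    state = rng.choices(range(len(pi)), weights=pi)[0]
    for _ in range(length):
        states.append(state)
        # Emit a symbol from the current state, then move to the next state.
        observations.append(rng.choices(range(len(B[state])), weights=B[state])[0])
        state = rng.choices(range(len(A[state])), weights=A[state])[0]
    return states, observations

# Hypothetical toy model: N = 2 states, M = 3 symbols (assumed values).
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
pi = [0.6, 0.4]
validate_hmm(A, B, pi)
states, observations = sample(A, B, pi, 5, random.Random(0))
```

An external observer would see only `observations`; `states` is the hidden part.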

2. Assumptions:

For the sake of mathematical and computational tractability, the following assumptions are made in the theory of HMMs.

(1) The Markov assumption
As given in the definition of HMMs, transition probabilities are defined as,

$$a_{ij} = p\{q_{t+1} = j \mid q_t = i\}$$

In other words, the next state is assumed to depend only on the current state. This is called the Markov assumption, and the resulting model is a first order HMM.
In general, however, the next state may depend on the past k states. Such a model, called a $k^{th}$ order HMM, is obtained by defining the transition probabilities as follows.

$$a_{i_1 i_2 \ldots i_k j} = p\{q_{t+1} = j \mid q_t = i_1, q_{t-1} = i_2, \ldots, q_{t-k+1} = i_k\}, \quad 1 \le i_1, i_2, \ldots, i_k, j \le N$$

A higher order HMM has a higher complexity, however. Although first order HMMs are the most common, some attempts have been made to use higher order HMMs too.

(2) The stationarity assumption
Here it is assumed that state transition probabilities are independent of the actual time at which the transitions take place. Mathematically,

$$p\{q_{t_1+1} = j \mid q_{t_1} = i\} = p\{q_{t_2+1} = j \mid q_{t_2} = i\}$$

for any $t_1$ and $t_2$ .

(3) The output independence assumption
This is the assumption that the current output (observation) is statistically independent of the previous outputs (observations) given the current state, so the probability of an observation sequence factors into one term per time step. We can formulate this assumption mathematically by considering a sequence of observations,

$$O = o_1, o_2, \ldots, o_T$$

Then, according to the assumption, for an HMM $\lambda$ ,

$$p\{O \mid q_1, q_2, \ldots, q_T, \lambda\} = \prod_{t=1}^{T} p(o_t \mid q_t, \lambda)$$

However, unlike the other two, this assumption has very limited validity. In some cases it is not a fair assumption, and it then becomes a severe weakness of HMMs.
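Taken together, the three assumptions let the joint probability of one particular state path and observation sequence be written as a simple product. The sketch below computes it for a discrete model; the function name and the toy parameters `A`, `B`, `pi` (transition, observation, and initial probabilities) are assumptions made for illustration.

```python
def joint_probability(A, B, pi, states, observations):
    """Probability of following `states` and emitting `observations`."""
    # Initial state probability times the first emission.
    prob = pi[states[0]] * B[states[0]][observations[0]]
    for t in range(1, len(states)):
        # Markov + stationarity: one fixed transition term per step.
        # Output independence: one emission term per step.
        prob *= A[states[t - 1]][states[t]] * B[states[t]][observations[t]]
    return prob

# Hypothetical toy model: N = 2 states, M = 3 symbols (assumed values).
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
pi = [0.6, 0.4]
joint = joint_probability(A, B, pi, states=[0, 0, 1], observations=[0, 1, 2])
```

For this path the product is 0.6 × 0.5 × 0.7 × 0.4 × 0.3 × 0.6 = 0.01512.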

3. The three problems to solve:

Once we have an HMM, there are three problems of interest.

(1) The Evaluation Problem
Given an HMM $\lambda$ and a sequence of observations $O = o_1, o_2, \ldots, o_T$ , what is the probability that the observations are generated by the model, $p\{O \mid \lambda\}$ ?
(2) The Decoding Problem
Given a model $\lambda$ and a sequence of observations $O = o_1, o_2, \ldots, o_T$ , what is the most likely state sequence in the model that produced the observations?
(3) The Learning Problem
Given a model $\lambda$ and a sequence of observations $O = o_1, o_2, \ldots, o_T$ , how should we adjust the model parameters $\{A, B, \pi\}$ in order to maximize $p\{O \mid \lambda\}$ ?

The evaluation problem can be used for isolated (word) recognition. The decoding problem is related to continuous recognition as well as to segmentation. The learning problem must be solved if we want to train an HMM for subsequent use in recognition tasks.

4. The evaluation problem: estimating the probability of an observation sequence:

We have a model $\lambda = (A, B, \pi)$ and a sequence of observations $O = o_1, o_2, \ldots, o_T$ , and $p\{O \mid \lambda\}$ must be found. We can calculate this quantity using simple probabilistic arguments, by summing over every possible state sequence, but this calculation involves a number of operations on the order of $N^T$ . This is very large even if the length of the sequence, T , is moderate. Therefore we have to look for another method. Fortunately there exists one with considerably lower complexity, which makes use of an auxiliary variable $\alpha_t(i)$ called the forward variable.
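To make the cost of the direct calculation concrete, here is a brute-force sketch that enumerates all state sequences of a discrete model and adds up their joint probabilities. It is only usable for tiny examples; the function name and the toy parameters are assumptions for illustration.

```python
from itertools import product

def evaluate_brute_force(A, B, pi, observations):
    """Sum the joint probability over every possible state sequence."""
    n = len(pi)
    total = 0.0
    # product(...) yields all n ** len(observations) state sequences.
    for states in product(range(n), repeat=len(observations)):
        prob = pi[states[0]] * B[states[0]][observations[0]]
        for t in range(1, len(states)):
            prob *= A[states[t - 1]][states[t]] * B[states[t]][observations[t]]
        total += prob
    return total

# Hypothetical toy model: N = 2 states, M = 3 symbols (assumed values).
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
pi = [0.6, 0.4]
likelihood = evaluate_brute_force(A, B, pi, [0, 1, 2])
```

With two states and three observations the loop visits only 8 paths, but the count doubles with every additional observation.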

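The forward variable is defined as the probability of the partial observation sequence $o_1, o_2, \ldots, o_t$ , when it terminates at the state i . Mathematically,

  $$\alpha_t(i) = p\{o_1, o_2, \ldots, o_t, q_t = i \mid \lambda\} \tag{1.1}$$

That is, the forward variable is the probability of observing $o_1, o_2, \ldots, o_t$ and being in state i at time t . It is advanced forward in t ; at t = T the whole observation sequence has been consumed, so summing the forward variables at time T over all states gives the probability of the observation sequence.

Then it is easy to see that the following recursive relationship holds.

  $$\alpha_{t+1}(j) = b_j(o_{t+1}) \sum_{i=1}^{N} \alpha_t(i) \, a_{ij}, \quad 1 \le j \le N, \; 1 \le t \le T-1 \tag{1.2}$$

where,

$$\alpha_1(j) = \pi_j \, b_j(o_1), \quad 1 \le j \le N$$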
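The forward recursion follows from the Markov and output independence assumptions of section 2. A short derivation, using the standard symbols $\alpha$ for the forward variable, $a_{ij}$ for the transition probabilities, and $b_j$ for the observation probabilities (sum over the state i occupied at time t , then factor the joint probability):

$$
\begin{aligned}
\alpha_{t+1}(j) &= p\{o_1, \ldots, o_{t+1}, q_{t+1} = j \mid \lambda\} \\
&= \sum_{i=1}^{N} p\{o_1, \ldots, o_t, q_t = i \mid \lambda\} \; p\{q_{t+1} = j \mid q_t = i\} \; p\{o_{t+1} \mid q_{t+1} = j\} \\
&= b_j(o_{t+1}) \sum_{i=1}^{N} \alpha_t(i) \, a_{ij}
\end{aligned}
$$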

Using this recursion we can calculate

$$\alpha_T(i), \quad 1 \le i \le N$$

and then the required probability is given by,

  $$p\{O \mid \lambda\} = \sum_{i=1}^{N} \alpha_T(i) \tag{1.3}$$

The complexity of this method, known as the forward algorithm, is proportional to $N^2 T$ , which is linear with respect to T , whereas the direct calculation mentioned earlier had exponential complexity.
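The forward algorithm is short enough to sketch directly. The version below assumes a discrete model stored as nested lists and returns the whole table of forward variables; the function name and toy parameters are hypothetical.

```python
def forward(A, B, pi, observations):
    """Return alpha, where alpha[t][j] is the forward variable (0-based t)."""
    n = len(pi)
    # Initialisation: initial probability times first emission, per state.
    alpha = [[pi[j] * B[j][observations[0]] for j in range(n)]]
    for o in observations[1:]:
        prev = alpha[-1]
        # Induction: sum over predecessor states, then emit the next symbol.
        alpha.append([B[j][o] * sum(prev[i] * A[i][j] for i in range(n))
                      for j in range(n)])
    return alpha

# Hypothetical toy model: N = 2 states, M = 3 symbols (assumed values).
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
pi = [0.6, 0.4]
alpha = forward(A, B, pi, [0, 1, 2])
likelihood = sum(alpha[-1])  # termination: sum over the final column
```

Each time step costs N × N multiplications, which is where the $N^2 T$ figure comes from. For long sequences the products become very small, so practical implementations usually rescale each step or work with logarithms; that refinement is left out here.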

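In a similar way we can define the backward variable $\beta_t(i)$ as the probability of the partial observation sequence $o_{t+1}, o_{t+2}, \ldots, o_T$ , given that the current state is i . Mathematically,

  $$\beta_t(i) = p\{o_{t+1}, o_{t+2}, \ldots, o_T \mid q_t = i, \lambda\} \tag{1.4}$$

The backward variable covers the observations after time t , whereas the forward variable covers the observations up to and including time t . Given the state at time t , the assumptions of section 2 make these two parts independent, so the product of the forward and backward variables for state i is the probability of the whole sequence together with being in state i at time t .

As in the case of $\alpha_t(i)$ there is a recursive relationship which can be used to calculate $\beta_t(i)$ efficiently.

  $$\beta_t(i) = \sum_{j=1}^{N} \beta_{t+1}(j) \, a_{ij} \, b_j(o_{t+1}), \quad 1 \le i \le N, \; 1 \le t \le T-1 \tag{1.5}$$

where,

$$\beta_T(i) = 1, \quad 1 \le i \le N$$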

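Further we can see that,

  $$\alpha_t(i) \, \beta_t(i) = p\{O, q_t = i \mid \lambda\}, \quad 1 \le i \le N, \; 1 \le t \le T \tag{1.6}$$

Therefore this gives another way to calculate $p\{O \mid \lambda\}$ , by using both forward and backward variables as given in eqn. 1.7 .

  $$p\{O \mid \lambda\} = \sum_{i=1}^{N} p\{O, q_t = i \mid \lambda\} = \sum_{i=1}^{N} \alpha_t(i) \, \beta_t(i) \tag{1.7}$$

Eqn. 1.7 is very useful, especially in deriving the formulas required for gradient based training.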
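A quick numerical check of eqn. 1.7 is to compute both tables and confirm that the sum of forward-times-backward products is the same at every time step. The sketch below does this for a discrete model; the function names and toy parameters are assumptions for illustration.

```python
def forward(A, B, pi, observations):
    """Forward table: alpha[t][j], with 0-based time index t."""
    n = len(pi)
    alpha = [[pi[j] * B[j][observations[0]] for j in range(n)]]
    for o in observations[1:]:
        prev = alpha[-1]
        alpha.append([B[j][o] * sum(prev[i] * A[i][j] for i in range(n))
                      for j in range(n)])
    return alpha

def backward(A, B, observations):
    """Backward table: beta[t][i], with 0-based time index t."""
    n = len(A)
    beta = [[1.0] * n]  # the last column is all ones by definition
    # Walk backwards, consuming observations o_T, o_{T-1}, ..., o_2.
    for o in reversed(observations[1:]):
        nxt = beta[0]
        beta.insert(0, [sum(nxt[j] * A[i][j] * B[j][o] for j in range(n))
                        for i in range(n)])
    return beta

# Hypothetical toy model: N = 2 states, M = 3 symbols (assumed values).
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
pi = [0.6, 0.4]
obs = [0, 1, 2]
alpha, beta = forward(A, B, pi, obs), backward(A, B, obs)
# Eqn. 1.7 evaluated at every t; all entries should agree.
likelihood_per_t = [sum(a * b for a, b in zip(alpha[t], beta[t]))
                    for t in range(len(obs))]
```

Every entry of `likelihood_per_t` is the same number, the probability of the full observation sequence.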

5. The decoding problem:

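In this case we want to find the most likely state sequence for a given sequence of observations, $O = o_1, o_2, \ldots, o_T$ , and a model, $\lambda = (A, B, \pi)$ .

The solution to this problem depends upon the way "most likely state sequence" is defined. One approach is to find the most likely state $q_t$ at each time t and to concatenate all such $q_t$ 's. But sometimes this method does not give a physically meaningful state sequence, because consecutive states chosen this way may be joined by a transition the model does not allow. Therefore we would go for another method which has no such problems.
In this method, commonly known as the Viterbi algorithm, the whole state sequence with the maximum likelihood is found. In order to facilitate the computation we define an auxiliary variable,

$$\delta_t(i) = \max_{q_1, q_2, \ldots, q_{t-1}} p\{q_1, q_2, \ldots, q_{t-1}, q_t = i, o_1, o_2, \ldots, o_t \mid \lambda\}$$

which gives the highest probability that the partial observation sequence and state sequence up to time t can have, when the current state is i .

In other words, $\delta_t(i)$ is the maximum probability of ending in state i at time t having observed $o_1, o_2, \ldots, o_t$ . The decoding problem therefore reduces to finding the most probable final state at time T and tracing back the path that led to it.
It is easy to observe that the following recursive relationship holds.

  $$\delta_{t+1}(j) = b_j(o_{t+1}) \left[ \max_{1 \le i \le N} \delta_t(i) \, a_{ij} \right], \quad 1 \le i \le N, \; 1 \le t \le T-1 \tag{1.8}$$

where,

$$\delta_1(j) = \pi_j \, b_j(o_1), \quad 1 \le j \le N$$

This recursion can be explained as follows.

By the Markov and output independence assumptions, the step from time t to time t + 1 does not depend on the observations already seen or on how the path reached its current state. It is therefore enough to keep only the best probability $\delta_t(i)$ for each state i . Multiplying it by $a_{ij}$ gives the probability of extending that best path to state j at time t + 1 ; the largest of these N candidates, multiplied by $b_j(o_{t+1})$ , is $\delta_{t+1}(j)$ .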

So the procedure to find the most likely state sequence starts from the calculation of $\delta_T(j), \; 1 \le j \le N$ using the recursion in 1.8 , while always keeping a pointer to the "winning state" in the maximum finding operation. Finally the state $j^*$ is found, where

$$j^* = \arg\max_{1 \le j \le N} \delta_T(j)$$

and starting from this state, the sequence of states is back-tracked as the pointer in each state indicates. This gives the required set of states.
This whole algorithm can be interpreted as a search in a graph whose nodes are formed by the states of the HMM at each of the time instants $t, \; 1 \le t \le T$ .
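The procedure just described (recursion, pointers to the winning state, back-tracking) can be sketched as follows for a discrete model. The function name and the toy parameters `A`, `B`, `pi` are assumptions for illustration, and ties in the maximum are broken in favour of the lowest state index.

```python
def viterbi(A, B, pi, observations):
    """Return (most likely state path, its joint probability)."""
    n = len(pi)
    # Initialisation: best score of each state after the first observation.
    delta = [pi[j] * B[j][observations[0]] for j in range(n)]
    backpointers = []
    for o in observations[1:]:
        # For every target state j, remember the winning predecessor i.
        best_prev = [max(range(n), key=lambda i: delta[i] * A[i][j])
                     for j in range(n)]
        delta = [B[j][o] * delta[best_prev[j]] * A[best_prev[j]][j]
                 for j in range(n)]
        backpointers.append(best_prev)
    # Termination: pick the best final state, then follow the pointers back.
    state = max(range(n), key=lambda i: delta[i])
    best_prob = delta[state]
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path, best_prob

# Hypothetical toy model: N = 2 states, M = 3 symbols (assumed values).
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
pi = [0.6, 0.4]
path, best_prob = viterbi(A, B, pi, [0, 1, 2])
```

For this toy model the best path stays in state 0 for the first two observations and moves to state 1 for the last one. As with the forward algorithm, long sequences are usually handled with log probabilities to avoid underflow, which this sketch omits.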

 

Source: http://blog.csdn.net/tianqio/archive/2009/06/17/4275895.aspx (thanks)
