Issue in training hidden Markov model and usage for classification

Question

I am having a tough time figuring out how to use Kevin Murphy's HMM Toolbox. It would be a great help if anyone with experience with it could clarify some conceptual questions. I somehow understand the theory behind HMMs, but it's confusing how to actually implement it and specify all the parameter settings, since I am not a programmer and hence have limited programming skills. So please bear with me.

There are 2 classes, so we need 2 HMMs.
Say the training vectors are: class1: O1 = {4 3 5 1 2} and class2: O2 = {1 4 3 2 4}.
Now, the system has to classify an unknown sequence O3 = {1 3 2 4 4} as either class1 or class2.

  1. What is going to go in obsmat0 and obsmat1?
  2. What is the syntax for specifying the transition probabilities transmat0 and transmat1?
  3. What is the variable data going to be in this case?
  4. Would the number of states be Q = 5, since there are five unique numbers/symbols used?
  5. Would the number of output symbols be 5?
  6. How do I set the transition probabilities transmat0 and transmat1?

Any other pointers would be immensely helpful.

Answer

Instead of answering each individual question, let me illustrate how to use the HMM toolbox with an example -- the weather example, which is commonly used when introducing hidden Markov models.

Basically, the states of the model are the three possible types of weather: sunny, rainy and foggy. On any given day, we assume the weather can be only one of these values. Thus the set of HMM states is:

S = {sunny, rainy, foggy}

However, in this example we can't observe the weather directly (apparently we are locked in the basement!). Instead, the only evidence we have is whether the person who checks on us every day is carrying an umbrella or not. In HMM terminology, these are the discrete observations:

x = {umbrella, no umbrella}

The HMM model is characterized by three things:


  • The prior probabilities: a vector of the probabilities of being in the first state of a sequence.

  • The transition probabilities: a matrix describing the probabilities of going from one weather state to another.
  • The emission probabilities: a matrix describing the probabilities of observing an output (umbrella or not) given a state (weather).

Next, we are either given these probabilities, or we have to learn them from a training set. Once that's done, we can do inference, like computing the likelihood of an observation sequence with respect to an HMM model (or a bunch of models, and pick the most likely one)...

Here is sample code that shows how to fill in the existing probabilities to build the model:

Q = 3;    %# number of states (sun,rain,fog)
O = 2;    %# number of discrete observations (umbrella, no umbrella)

%#  prior probabilities
prior = [1 0 0];

%# state transition matrix (1: sun, 2: rain, 3:fog)
A = [0.8 0.05 0.15; 0.2 0.6 0.2; 0.2 0.3 0.5];

%# observation emission matrix (1: umbrella, 2: no umbrella)
B = [0.1 0.9; 0.8 0.2; 0.3 0.7];
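
As a quick sanity check (my addition; plain MATLAB, not a toolbox call), we can verify that prior sums to 1 and that every row of A and B sums to 1, since they are stochastic matrices:

%# sanity check: prior and each row of A and B must sum to 1
assert(abs(sum(prior) - 1) < 1e-10);
assert(all(abs(sum(A,2) - 1) < 1e-10));
assert(all(abs(sum(B,2) - 1) < 1e-10));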

Then we can sample a bunch of sequences from this model:

num = 20;           %# 20 sequences
T = 10;             %# each of length 10 (days)
[seqs,states] = dhmm_sample(prior, A, B, num, T);

For example, the 5th sample is:

>> seqs(5,:)        %# observation sequence
ans =
     2     2     1     2     1     1     1     2     2     2

>> states(5,:)      %# hidden states sequence
ans =
     1     1     1     3     2     2     2     1     1     1

We can evaluate the log-likelihood of the sequence:

dhmm_logprob(seqs(5,:), prior, A, B)         %# log-likelihood of the observation sequence given the model

dhmm_logprob_path(prior, A, B, states(5,:))  %# log-probability along the given hidden state path

or compute the Viterbi path (most probable state sequence):

vPath = viterbi_path(prior, A, multinomial_prob(seqs(5,:),B))
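
As a quick check (my addition; just an element-wise comparison, not a toolbox function), we can see how often the decoded path agrees with the hidden states we actually sampled. The two won't match exactly in general, since the states are hidden:

mean(vPath == states(5,:))   %# fraction of days where the decoded state is correct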

Training is performed using the EM algorithm, and is best done with a set of observation sequences.

Continuing on the same example, we can use the generated data above to train a new model and compare it to the original:

%# we start with a randomly initialized model
prior_hat = normalise(rand(Q,1));
A_hat = mk_stochastic(rand(Q,Q));
B_hat = mk_stochastic(rand(Q,O));  

%# learn from data by performing many iterations of EM
[LL,prior_hat,A_hat,B_hat] = dhmm_em(seqs, prior_hat,A_hat,B_hat, 'max_iter',50);

%# plot learning curve
plot(LL), xlabel('iterations'), ylabel('log likelihood'), grid on

Keep in mind that the state order doesn't have to match. That's why we need to permute the states before comparing the two models. In this example, the trained model looks close to the original one:

>> p = [2 3 1];              %# states permutation

>> prior, prior_hat(p)
prior =
     1     0     0
ans =
      0.97401
  7.5499e-005
      0.02591

>> A, A_hat(p,p)
A =
          0.8         0.05         0.15
          0.2          0.6          0.2
          0.2          0.3          0.5
ans =
      0.75967      0.05898      0.18135
     0.037482      0.77118      0.19134
      0.22003      0.53381      0.24616

>> B, B_hat(p,[1 2])
B =
          0.1          0.9
          0.8          0.2
          0.3          0.7
ans =
      0.11237      0.88763
      0.72839      0.27161
      0.25889      0.74111
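
The permutation above was found by inspecting the matrices manually. If you want to automate that step, one rough heuristic (my own sketch, not a toolbox function) is to match each original state to the learned state whose emission row is closest:

%# match original state i to the learned state whose emission row is
%# nearest in L1 distance; greedy, so duplicate matches are possible
p = zeros(1,Q);
for i=1:Q
    [dummy,p(i)] = min(sum(abs(B_hat - repmat(B(i,:),Q,1)), 2));
end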

There are more things you can do with hidden Markov models, such as classification or pattern recognition. You would have different sets of observation sequences belonging to different classes. You start by training a model for each set. Then, given a new observation sequence, you classify it by computing its likelihood with respect to each model, and predict the class of the model with the highest log-likelihood:

argmax[ log P(X|model_i) ] over all model_i
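
To tie this back to the original question, here is a minimal sketch of the two-class setup. The names prior1/A1/B1 and prior2/A2/B2 are placeholders for illustration; you would obtain them by running dhmm_em separately on the training sequences of each class:

O3 = [1 3 2 4 4];                          %# unknown sequence to classify

%# log-likelihood of O3 under each trained model (placeholder names)
ll1 = dhmm_logprob(O3, prior1, A1, B1);    %# class1 model
ll2 = dhmm_logprob(O3, prior2, A2, B2);    %# class2 model

%# predict the class whose model assigns the highest log-likelihood
[dummy,predictedClass] = max([ll1 ll2])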
