Transition probabilities representation


Problem description


I would like to identify activity changes across time. Below is an example (from act1_1 to act1_16) of the matrix that I am using to calculate transition probabilities between activities.

head(Activities) returns a tibble: 6 x 145

  serial act1_1 act1_2 act1_3 act1_4 act1_5 act1_6 act1_7 act1_8 act1_9  act1_10
     1 1.22e7 110    110    110    110    110    110    110    110    110    110    
     2 1.43e7 110    110    110    110    110    110    110    110    110    110    
     3 2.00e7 110    110    110    110    110    110    110    110    110    110    
     4 2.71e7 110    110    110    110    110    110    110    110    110    110    
     5 1.61e7 110    110    110    110    110    110    110    110    110    110    
     6 1.60e7 110    110    110    110    110    110    110    110    110    110    

# ... with 134 more variables: act1_11 <dbl+lbl>, act1_12 <dbl+lbl>,

The "Activities" matrix has ncol = 144 and nrow = 16533; act1_1 ... act1_144 are time steps, and time is represented in 10-minute intervals (e.g. act1_1 = 4.10am; act1_2 = 4.20am, ...). Time starts at 4am (act1_1) and ends at act1_144 (4am). The columns are filled with activity codes, such as 110 = sleep, 111 = watching TV, 123 = eating, etc.
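The description above gives both 4am and 4.10am for act1_1, so the exact mapping from column index to clock time is not certain. As a minimal sketch, assuming that act1_k denotes the slot that starts (k - 1) * 10 minutes after 04:00, a hypothetical helper `slot_start()` can produce labels for a time axis:

```r
# Assumption (not confirmed by the question): act1_k is the 10-minute slot
# that starts (k - 1) * 10 minutes after 04:00.
slot_start <- function(k) {
  mins <- (4 * 60 + (k - 1) * 10) %% (24 * 60)   # minutes since midnight
  sprintf("%02d:%02d", mins %/% 60, mins %% 60)
}

slot_start(c(1, 2, 144))
# "04:00" "04:10" "03:50"
```

If the columns instead mark the end of each slot, shifting `k` by one gives the 4.10am / 4am reading quoted above.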

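Below is the function that I am using to calculate the transition probabilities:

transition.matrix <- function(X, prob = TRUE)
{
    # Cross-tabulate every value against the value in the next column
    tt <- table( c(X[,-ncol(X)]), c(X[,-1]) )
    # Convert counts to row-wise probabilities (assign back to tt so it is returned)
    if(prob) tt <- tt / rowSums(tt)
    tt
}

I call the function as:

transitionfunction <- transition.matrix(as.matrix(Activities))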

Using this function I managed to calculate the transition probabilities between activities (Activities matrix). Below is an example of this kind of matrix:

Using transitionfunction, I would like to plot time (10-minute intervals) on the x axis and probabilities on the y axis.

How can I do this? How can I identify the most frequent transition between activities?

This is the plot that I am aiming for:

Solution

Given one transition matrix m, you can find the most frequent n transitions as follows:

n <- 3 # or whatever
sorted <- sort(m, decreasing = TRUE)
which(m >= sorted[n], arr.ind = TRUE)

Ties may mean you'll get more than n results.

Given your data, you might want to ignore the diagonal. You can do that using

diag(m) <- 0

and then using the code above.
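To make the two steps concrete, here is a small sketch on a made-up 3 x 3 count matrix. The counts, and the helper name `top_transitions()`, are illustrative assumptions and not part of the question's data; the activity codes are borrowed from the question only as labels.

```r
# Toy transition-count matrix; rows are "from", columns are "to".
# The counts are invented for illustration.
codes <- c("110", "111", "123")
m <- matrix(c(50,  4,  6,
               3, 20,  2,
               5,  1, 10),
            nrow = 3, byrow = TRUE,
            dimnames = list(codes, codes))

# Hypothetical helper: the n largest entries of m, optionally ignoring
# the diagonal (staying in the same activity).
top_transitions <- function(m, n = 3, ignore_diag = TRUE) {
  if (ignore_diag) diag(m) <- 0
  sorted <- sort(m, decreasing = TRUE)
  idx <- which(m >= sorted[n], arr.ind = TRUE)   # ties can yield more than n rows
  out <- data.frame(from  = rownames(m)[idx[, 1]],
                    to    = colnames(m)[idx[, 2]],
                    value = m[idx],
                    stringsAsFactors = FALSE)
  out[order(-out$value), ]
}

top <- top_transitions(m, n = 3)
print(top)
```

With the diagonal zeroed, the three largest entries here are 110 to 123 (6), 123 to 110 (5) and 110 to 111 (4). The same code works on a probability matrix instead of counts.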

An issue is that you don't have separate transition matrices for each time. If you post some data in a usable form, you're likely to get help with that. (Not all 16533 rows, just enough to make it interesting.)
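One possible way to get a separate transition matrix for each time step is to cross-tabulate every pair of adjacent columns. The sketch below uses simulated data, since no usable data was posted; the function name `transitions_by_step()` and the simulated matrix `X` are assumptions. For the real data one would presumably drop the `serial` column and pass a plain numeric matrix.

```r
set.seed(1)
# Simulated stand-in for the activity data: 200 respondents, 12 time steps.
acts <- c(110, 111, 123)
X <- matrix(sample(acts, 200 * 12, replace = TRUE, prob = c(0.6, 0.25, 0.15)),
            nrow = 200,
            dimnames = list(NULL, paste0("act1_", 1:12)))

# One transition-probability matrix per pair of adjacent time steps.
transitions_by_step <- function(X) {
  lev <- sort(unique(as.vector(X)))   # shared levels so all matrices align
  lapply(seq_len(ncol(X) - 1), function(t) {
    counts <- table(from = factor(X[, t],     levels = lev),
                    to   = factor(X[, t + 1], levels = lev))
    prop.table(counts, margin = 1)    # rows sum to 1 (NaN where a row is empty)
  })
}

P <- transitions_by_step(X)

# Probability of one chosen transition (110 -> 111) at every step.
p_110_111 <- vapply(P, function(p) p["110", "111"], numeric(1))

plot(seq_along(p_110_111), p_110_111, type = "l",
     xlab = "Time step (10-minute intervals)",
     ylab = "P(110 -> 111)")
```

Each element of `P` can be passed to the top-n code above to find the most frequent transitions at that particular time step.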
