How to count the number of occurrences of one variable that are shared between levels of another variable

Question

My data looks like this:

id      camera
angelina    cam03
angelina    cam03
angelina    cam03
angelina    cam03
angelina    cam03
angelina    cam03
angelina    cam03
angelina    cam03
angelina    cam03
angelina    cam03
angelina    cam03
angelina    cam22
barry   cam03
barry   cam03
barry   cam03
barry   cam03
barry   cam03
barry   cam03
barry   cam03
barry   cam03
barry   cam03
barry   cam03
barry   cam22
barry   cam22
barry   cam22
barry   cam22
barry   cam22
barry   cam15
barry   cam25

Each individual is recorded once for every capture by a camera. I want to know how many individuals are seen in each pair of cameras: for cameras 1 and 2, for instance, how many individuals are recorded in both? In a small example where only individual A is seen in both cameras 1 and 2, and individuals B and E are each seen in both cameras 1 and 3, the desired result would be a table like this:

      0001  0002  0003
0001     -     1     2
0002     1     -     0
0003     2     0     -

I would greatly appreciate it if someone could show me the code for this in R.

Recommended answer

Use crossprod for this:

crossprod(table(mydf))
#       Camera
# Camera 0001 0002 0003
#   0001    4    1    2
#   0002    1    1    0
#   0003    2    0    3
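To see why this works, it can help to look at the intermediate table. The following is a minimal sketch using the answer's sample data; the names `inc`, `shared` and `manual` are only illustrative. `table(mydf)` gives an individual-by-camera matrix of counts (0 or 1 here, since no row is repeated), and `crossprod(x)` is equivalent to `t(x) %*% x`, so each off-diagonal cell sums, over individuals, the product of two camera columns. That sum is the number of individuals seen by both cameras, while the diagonal holds the number of individuals seen by each camera.

```r
# Sample data from the answer
mydf <- data.frame(
  Individual = c("A", "A", "B", "B", "C", "D", "E", "E"),
  Camera = c("0001", "0002", "0001", "0003", "0001", "0003", "0001", "0003"))

# Individual-by-camera table: 1 where the individual was seen by that camera
inc <- table(mydf)

# crossprod(inc) is the same computation as t(inc) %*% inc
shared <- crossprod(inc)
manual <- t(inc) %*% inc

print(inc)
print(shared)
print(all(shared == manual))
```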

diag can be used to set the diagonal to zero or NA if required. You can do it all in one go with:

`diag<-`(crossprod(table(mydf)), 0)
#       Camera
# Camera 0001 0002 0003
#   0001    0    1    2
#   0002    1    0    0
#   0003    2    0    0
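If the replacement-function call syntax feels opaque, the same result can be written in two steps. This is only a sketch of an equivalent form, here using NA instead of 0 on the diagonal; `shared` and `pair_counts` are illustrative names. Because the matrix is symmetric, the upper triangle alone lists each camera pair once.

```r
# Sample data from the answer
mydf <- data.frame(
  Individual = c("A", "A", "B", "B", "C", "D", "E", "E"),
  Camera = c("0001", "0002", "0001", "0003", "0001", "0003", "0001", "0003"))

shared <- crossprod(table(mydf))

# Two-step equivalent of `diag<-`(shared, NA)
diag(shared) <- NA
print(shared)

# Each unordered camera pair once, in column-major order:
# (0001, 0002), (0001, 0003), (0002, 0003)
pair_counts <- shared[upper.tri(shared)]
print(pair_counts)
```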

Sample data:

mydf <- data.frame(
    Individual = c("A", "A", "B", "B", "C", "D", "E", "E"),
    Camera = c("0001", "0002", "0001", "0003", "0001", "0003", "0001", "0003"))

Edit:

If the same individual appears more than once for the same camera, you can eliminate the duplicates prior to the crossprod call:

`diag<-`(crossprod(table(unique(mydf2))), 0)
#        camera
# camera  cam03 cam15 cam22 cam25
#   cam03     0     1     2     1
#   cam15     1     0     1     1
#   cam22     2     1     0     1
#   cam25     1     1     1     0
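The deduplication step matters because, with repeated rows, the table holds capture counts rather than 0/1 indicators, so the cross product sums products of counts instead of counting individuals. The sketch below rebuilds the same 29 rows compactly with `rep()` (an equivalent construction, not the original `structure()` call) and compares the raw result with the deduplicated one. It also shows one possible alternative, binarizing the table, which is an assumption of this sketch rather than part of the original answer.

```r
# Same 29 rows as the second data set, built compactly
mydf2 <- data.frame(
  id = c(rep("angelina", 12), rep("barry", 17)),
  camera = c(rep("cam03", 11), "cam22",
             rep("cam03", 10), rep("cam22", 5), "cam15", "cam25"))

# Without deduplication: cells are sums of products of capture counts
raw <- crossprod(table(mydf2))

# With deduplication: cells count individuals shared by each camera pair
dedup <- crossprod(table(unique(mydf2)))

# Alternative sketch: turn the counts into 0/1 indicators before crossprod
inc <- (unclass(table(mydf2)) > 0) * 1L
binarized <- crossprod(inc)

print(raw["cam03", "cam22"])    # inflated by repeated captures
print(dedup["cam03", "cam22"])  # number of individuals seen by both cameras
print(all(dedup == binarized))
```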

Second data set:

mydf2 <- structure(list(id = c("angelina", "angelina", "angelina", "angelina", 
"angelina", "angelina", "angelina", "angelina", "angelina", "angelina", 
"angelina", "angelina", "barry", "barry", "barry", "barry", "barry", 
"barry", "barry", "barry", "barry", "barry", "barry", "barry", 
"barry", "barry", "barry", "barry", "barry"), camera = c("cam03", 
"cam03", "cam03", "cam03", "cam03", "cam03", "cam03", "cam03", 
"cam03", "cam03", "cam03", "cam22", "cam03", "cam03", "cam03", 
"cam03", "cam03", "cam03", "cam03", "cam03", "cam03", "cam03", 
"cam22", "cam22", "cam22", "cam22", "cam22", "cam15", "cam25"
)), .Names = c("id", "camera"), class = "data.frame", row.names = c(NA, 
-29L))
