R: Convert contingency table to long data.frame

Question

Suppose you are given a summarized crosstable like this:

kdat <- data.frame(positive = c(8, 4), negative = c(3, 6),
                   row.names = c("positive", "negative"))
kdat
#>          positive negative
#> positive        8        3
#> negative        4        6

Now you want to compute Cohen's Kappa, a statistic that measures the agreement between two raters. Given data in this format, you can use psych::cohen.kappa:

psych::cohen.kappa(kdat)$kappa
#> Warning in any(abs(bounds)): coercing argument of type 'double' to logical
#> [1] 0.3287671
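
For reference, the same value can be reproduced by hand in base R. This is a minimal sketch, assuming the standard definition of Cohen's Kappa, (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance; the variable names are illustrative and do not come from psych:

```r
kdat <- data.frame(positive = c(8, 4), negative = c(3, 6),
                   row.names = c("positive", "negative"))

m <- as.matrix(kdat)
n <- sum(m)

# Observed agreement: share of cases on the diagonal
p_obs <- sum(diag(m)) / n

# Chance agreement: products of matching row and column marginals
p_exp <- sum(rowSums(m) * colSums(m)) / n^2

kappa_by_hand <- (p_obs - p_exp) / (1 - p_exp)
kappa_by_hand
#> [1] 0.3287671
```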

This irks me, because I prefer my data long and thin, which would let me use irr::kappa2, a similar function that I prefer for arbitrary reasons. So I assembled this function to reformat my data:

longify_xtab <- function(x) {
  nm <- names(x)
  # Convert to table
  x_tab <- as.table(as.matrix(x))
  # Just in case there are no rownames, which the conversion requires
  rownames(x_tab) <- nm
  # Use appropriate method to get a df
  x_df <- as.data.frame(x_tab)

  # Restructure df in a painful and unsightly way
  data.frame(lapply(x_df[seq_len(ncol(x_df) - 1)], function(col) {
    rep(col, x_df$Freq)
  }))
}

The function returns the following format:

longify_xtab(kdat)
#>        Var1     Var2
#> 1  positive positive
#> 2  positive positive
#> 3  positive positive
#> 4  positive positive
#> 5  positive positive
#> 6  positive positive
#> 7  positive positive
#> 8  positive positive
#> 9  negative positive
#> 10 negative positive
#> 11 negative positive
#> 12 negative positive
#> 13 positive negative
#> 14 positive negative
#> 15 positive negative
#> 16 negative negative
#> 17 negative negative
#> 18 negative negative
#> 19 negative negative
#> 20 negative negative
#> 21 negative negative

...which lets us calculate Kappa via irr::kappa2:

irr::kappa2(longify_xtab(kdat))$value
#> [1] 0.3287671

My question is:

Is there a better way to do this (in base R or with a package)? It strikes me as a relatively simple problem, but in trying to solve it I realized that it's oddly tricky, at least in my head.

Recommended answer

Here is some public domain code from http://www.cookbook-r.com/Manipulating_data/Converting_between_data_frames_and_contingency_tables/ that I have used to do exactly what you are asking:

# Convert from data frame of counts to data frame of cases.
# `countcol` is the name of the column containing the counts
countsToCases <- function(x, countcol = "Freq") {
    # Get the row indices to pull from x
    idx <- rep.int(seq_len(nrow(x)), x[[countcol]])

    # Drop count column
    x[[countcol]] <- NULL

    # Get the rows from x
    x[idx, ]
}
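
A sketch of how this could be applied to the kdat example from the question. countsToCases expects a data frame with one row per cell and a count column, so the crosstable is first passed through as.table() and as.data.frame(); that conversion step, and the names counts and cases, are my own assumptions rather than part of the cookbook code. The function is repeated in compact form only so the snippet runs on its own:

```r
kdat <- data.frame(positive = c(8, 4), negative = c(3, 6),
                   row.names = c("positive", "negative"))

# Repeated from above so this snippet is self-contained
countsToCases <- function(x, countcol = "Freq") {
  idx <- rep.int(seq_len(nrow(x)), x[[countcol]])
  x[[countcol]] <- NULL
  x[idx, ]
}

# as.table() needs a matrix; as.data.frame() on a table
# yields one row per cell with columns Var1, Var2 and Freq
counts <- as.data.frame(as.table(as.matrix(kdat)))

cases <- countsToCases(counts)
rownames(cases) <- NULL  # drop the duplicated-row labels left by x[idx, ]

nrow(cases)   # one row per rated case
table(cases)  # cross-tabulating again recovers the original counts
```

The result has the same Var1/Var2 layout as the output of longify_xtab(kdat), so it should be usable with irr::kappa2 in the same way.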
