Wide tally dataframe to tall indicator dataframe


Problem description


This is not a complicated problem, and I have a solution, but I can't shake the feeling that there is a better way:

I have a data.frame with a tally of successes and opportunities by category, like this:

testFrame <- data.frame(successes = c(100, 150, 18),
                        opportunities = c(215, 194, 40),
                        category = LETTERS[1:3])
testFrame$category <- as.character(testFrame$category)

I want to convert this to a "tall" data.frame, with one column of 1s and 0s indicating success/failure and a second with category labels. I can do this with the following code:

tallFrame <- lapply(1:nrow(testFrame), function(rr) {
  successes <- testFrame[rr, "successes"]
  failures  <- testFrame[rr, "opportunities"] - successes
  # Column 1: 1 = success, 0 = failure; column 2: the category label
  cbind(rep(c(1, 0), c(successes, failures)), testFrame[rr, "category"])
})
tallFrame <- data.frame(do.call(rbind, tallFrame))

The rbind step produces a matrix, which I can then convert to a data.frame without any issues, but this seems like a lot of code for a simple task. Surely there is a more concise way to do this, perhaps with plyr or reshape, or maybe I'm just looking for some code golf.

Thanks in advance.

Solution

One does wonder why you need to do this, but regardless...

A base solution using rep

# Per category: `successes` ones followed by the failures as zeros (1 = success)
with(testFrame, data.frame(category = rep(category, opportunities),
    indicator = unlist(mapply(function(s, f) rep(c(1, 0), c(s, f)),
      successes, opportunities - successes, SIMPLIFY = FALSE))))
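
As an aside, the per-category call can be avoided altogether by interleaving the success and failure counts and letting a single rep expand them. This is only a sketch and not part of the original answer; the names counts and tall are illustrative, and it assumes that opportunities is never smaller than successes.

```r
# Same tally as in the question
testFrame <- data.frame(successes = c(100, 150, 18),
                        opportunities = c(215, 194, 40),
                        category = LETTERS[1:3],
                        stringsAsFactors = FALSE)

# Interleave the counts category by category: s1, f1, s2, f2, ...
# rbind() stacks the two vectors as rows; as.vector() reads them column-major.
counts <- with(testFrame,
               as.vector(rbind(successes, opportunities - successes)))

tall <- data.frame(
  category  = rep(testFrame$category, testFrame$opportunities),
  # The pattern 1, 0, 1, 0, ... is expanded element-wise by `counts`
  indicator = rep(rep(c(1, 0), times = nrow(testFrame)), times = counts),
  stringsAsFactors = FALSE
)

# Cross-tabulate to eyeball the failures (0) and successes (1) per category
table(tall$category, tall$indicator)
```

Because rep accepts a vector for times when it has the same length as x, the whole expansion happens in one vectorized call.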

A data.table solution (coding elegance, perhaps a code golf competitor)

library(data.table)
DT <- data.table(testFrame)
# Within each category: 1 = success, 0 = failure
DT[, list(indicator = rep(c(1, 0), c(successes, opportunities - successes))), by = category]
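
Whichever approach is used, one way to gain confidence in the result is to collapse the tall frame back into tallies and compare them with the input. The following is a minimal base R sketch of that round trip; the names tall and recovered are hypothetical, and the tall frame is rebuilt here only so that the snippet runs on its own.

```r
testFrame <- data.frame(successes = c(100, 150, 18),
                        opportunities = c(215, 194, 40),
                        category = LETTERS[1:3],
                        stringsAsFactors = FALSE)

# Build a tall frame in plain base R (any of the approaches would do here)
tall <- with(testFrame, data.frame(
  category  = rep(category, opportunities),
  indicator = unlist(Map(function(s, f) rep(c(1, 0), c(s, f)),
                         successes, opportunities - successes)),
  stringsAsFactors = FALSE
))

# Collapse back to one row per category:
# the sum of the indicator gives successes, the row count gives opportunities
recovered <- data.frame(
  successes     = as.numeric(tapply(tall$indicator, tall$category, sum)),
  opportunities = as.numeric(tapply(tall$indicator, tall$category, length)),
  category      = sort(unique(tall$category)),
  stringsAsFactors = FALSE
)

# Compare the recovered tallies with the original input
all.equal(recovered, testFrame)
```

Note that tapply orders its output by the sorted category labels, so the comparison assumes the input rows are already in that order, as they are in this example.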
