Pivot Table-like Output in R?


Question

I am writing a report that requires the generation of a number of pivot tables in Excel. I would like to think there is a way to do this in R so that I can avoid Excel. I would like output like the screenshot below (teacher names redacted). As far as I can tell, I could use the reshape package to calculate the aggregate values, but I'd need to do that a number of times and somehow get all of the data in the correct order. At that point, I should just be doing it in Excel. Does anyone have any suggestions or package recommendations? Thank you!

(EDIT) The data starts as a list of students, their teacher, school, and growth. This data is then aggregated to get a list of teachers with their average class growth. Please note the teachers are then grouped by school. The largest problem I foresee doing this with R as of now is how do you get the subtotal and total rows (BSA1 Total, Grand Total, etc) in there since they are not the same type of observation as the others? Do you just manually have to calculate them and try to get them in the correct order so they appear at the bottom of that group?


(Screenshot source: imgh.us)

Recommended answer

Here's the calculation bit:

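library(reshape2)  # provides melt() and dcast()

# Simulate 100 students, each assigned a school, a teacher, and a growth score
set.seed(1)
school  <- sample(c("BSA1", "BSA2", "HSA1"), 100, replace = TRUE)
teacher <- sample(c("Tom", "Dick", "Harry"), 100, replace = TRUE)
growth  <- rnorm(100, 5, 3)

myDf <- data.frame(school, teacher, growth)

# Mean growth per school/teacher pair, without subtotals
aggregate(growth ~ school + teacher, data = myDf, FUN = mean)

# Long format, then cast with margins to add subtotal and grand-total rows
myDf.melt <- melt(myDf, id.vars = c("school", "teacher"), measure.vars = "growth")
dcast(myDf.melt, school + teacher ~ ., fun.aggregate = mean, margins = c("school", "teacher"))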

I've not addressed output formatting, only calculation. The resulting data frame should look like this:

   school teacher       NA
1    BSA1    Dick 4.663140
2    BSA1   Harry 4.310802
3    BSA1     Tom 5.505247
4    BSA1   (all) 4.670451
5    BSA2    Dick 6.110988
6    BSA2   Harry 5.007221
7    BSA2     Tom 4.337063
8    BSA2   (all) 5.196018
9    HSA1    Dick 4.508610
10   HSA1   Harry 4.890741
11   HSA1     Tom 4.721124
12   HSA1   (all) 4.717335
13  (all)   (all) 4.886576

That example uses the reshape2 package to handle the subtotals.
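On the question of whether the subtotal rows have to be calculated by hand and sorted into place: that is possible too. Below is a rough base-R sketch of that approach, not part of the original answer. The "<school> Total" and "Grand Total" labels and the helper column isTotal are my own naming, chosen to mimic the Excel layout, and the data is simulated the same way as above. Each subtotal is the mean over all students in the school, not the mean of the teacher means.

```r
# Simulated data in the same shape as the example above (values are made up)
set.seed(1)
myDf <- data.frame(
  school  = sample(c("BSA1", "BSA2", "HSA1"), 100, replace = TRUE),
  teacher = sample(c("Tom", "Dick", "Harry"), 100, replace = TRUE),
  growth  = rnorm(100, 5, 3),
  stringsAsFactors = FALSE
)

# Teacher-level means within each school
byTeacher <- aggregate(growth ~ school + teacher, data = myDf, FUN = mean)
byTeacher[c("school", "teacher")] <- lapply(byTeacher[c("school", "teacher")], as.character)

# School subtotals: mean over every student in the school
bySchool <- aggregate(growth ~ school, data = myDf, FUN = mean)
bySchool$school  <- as.character(bySchool$school)
bySchool$teacher <- paste(bySchool$school, "Total")

# Grand total over every student
grand <- data.frame(school = "Grand Total", teacher = "Grand Total",
                    growth = mean(myDf$growth), stringsAsFactors = FALSE)

# Flag subtotal rows (isTotal is a hypothetical helper column) so that
# sorting places each subtotal directly after its school's teachers
byTeacher$isTotal <- FALSE
bySchool$isTotal  <- TRUE
detail <- rbind(byTeacher, bySchool[, names(byTeacher)])
detail <- detail[order(detail$school, detail$isTotal, detail$teacher), ]

# Drop the helper column and append the grand total as the last row
pivot <- rbind(detail[, c("school", "teacher", "growth")], grand)
rownames(pivot) <- NULL
print(pivot)
```

This is more typing than the dcast() call with margins, which is why I'd reach for reshape2 first, but it shows there is nothing special about the total rows: they are ordinary rows that get bound on and sorted.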

I think R is the right tool for the job here. I can totally understand not being sure how to get started on this analysis. I came to R from Excel a few years ago and it can be tough to grok at first. Let me point out four pro tips to help you get better answers on Stack Overflow:

1) Provide data, even if simulated: you can see I simulated some data at the beginning of my answer. If you had provided that simulation it would have a) saved me time, b) gotten you an answer that used your own data structure, not one I dreamed up, and c) meant other people would have answered. I often skip questions with no data because I've grown tired of guessing about the data and then being told my answer sucked because I guessed wrong.

2) Ask one clear question. "How do I do my work" is not a single clear question. "How do I take this example data and create subtotals in the aggregation like this example output" is a single specific question.

3) Keep asking! We all get better with practice. You're trying to do more in R and less in Excel so you're clearly of above average intelligence. Keep using R and keep asking questions. It will all get easier in time.

4) Be careful with your words when you describe things. You say in your edited question you have a "list" of things. A list in R is a specific data structure. I'm suspicious you actually have a data frame and are using the term "list" in a generic sense. This can make for some confusion. It also illustrates why you want to provide your own data.
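To make tips 1 and 4 concrete, here is a small sketch, assuming a made-up three-row data frame (the values are hypothetical, not from the question). dput() prints R code that recreates an object, which is a common way to paste sample data into a question, and class() shows whether you actually have a list or a data frame.

```r
# Hypothetical example data; replace with a small slice of your own
myDf <- data.frame(
  school  = c("BSA1", "BSA1", "HSA1"),
  teacher = c("Tom", "Dick", "Harry"),
  growth  = c(4.2, 5.1, 3.9),
  stringsAsFactors = FALSE
)

# dput() prints R code that rebuilds the object; paste that text into the question
dput(head(myDf))

# Round trip: parse the dput() text back into an object and compare with the original
recreated <- eval(parse(text = paste(capture.output(dput(myDf)), collapse = "\n")))
all.equal(recreated, myDf)

# The same fields stored as a plain list are a different structure
myList <- as.list(myDf)
class(myDf)    # "data.frame"
class(myList)  # "list"
```

Running str() on your object before posting is another quick way to check which structure you really have.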
