R data subset restructuring
Question
I am fairly new to R/RStudio and am still learning how to do certain operations. I have a data set with the following columns: Operating Region, Element type (CA, OBU), Sub-Element and Net Revenue (NR). The data is quite big (50 000 rows), and I want to get a summary of Operating Region by Element, Sub-Element and NR. Example:
Operating Region   Element   Sub-Element   NR
Asia               CA        CA123         50 000
America            OBU       EFK456        35 000
Could someone please guide me on how to accomplish this? Any relevant readings/examples would be much appreciated.
Recommended answer
With the sample data frame "data" defined below, you can use the dplyr package to summarise the results in many different ways. Here is one example:
# Sample data with the column names from the question
data <- data.frame(OperatingRegion = c("Asia", "America"), Element = c("CA", "OBU"),
                   SubElement = c("CA123", "EFK456"), NR = c(50000, 35000))
library(dplyr)
# Total net revenue per operating region
results <- data %>%
  group_by(OperatingRegion) %>%
  summarise(TotalNR = sum(NR, na.rm = TRUE))
results
This returns one row per operating region (the header printed above the table depends on your dplyr version):
  OperatingRegion TotalNR
1 America           35000
2 Asia              50000
After loading the package, you pass the data frame to dplyr and then, using the pipe operator %>% (older dplyr releases also offered %.%, which was deprecated and later removed), call group_by on whichever variable or variables you want to group by. Then call summarise to compute sums, medians, means or any other per-group statistic you need.
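Since the question asks for a breakdown by Element and Sub-Element as well as by region, here is a minimal sketch of grouping on several variables at once. The "sales" data frame and the summary column names (TotalNR, MeanNR, Rows) are made up for illustration, and the .groups argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical sample data; column names follow the table in the question
sales <- data.frame(
  OperatingRegion = c("Asia", "Asia", "Asia", "America", "America"),
  Element         = c("CA", "CA", "OBU", "OBU", "OBU"),
  SubElement      = c("CA123", "CA123", "EFK456", "EFK456", "EFK456"),
  NR              = c(50000, 10000, 20000, 35000, NA)
)

# One row per region / element / sub-element combination
summary_tbl <- sales %>%
  group_by(OperatingRegion, Element, SubElement) %>%
  summarise(
    TotalNR = sum(NR, na.rm = TRUE),   # missing values are ignored
    MeanNR  = mean(NR, na.rm = TRUE),
    Rows    = n(),                     # number of source rows in the group
    .groups = "drop"                   # return an ungrouped result
  )

print(summary_tbl)
```

Dropping a variable from group_by gives a coarser summary; for example, grouping only by OperatingRegion and Element would roll all sub-elements into their parent element.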