R data subset restructuring
I am fairly new to R/RStudio and am still learning how to do certain operations. I have the following data set. Its columns are Operating Region, type of element (CA, OBU), sub-element, and Net Revenue (NR). The data is currently quite big (50 000 rows), and I want to summarise it by operating region, element, sub-element and NR. Example:
Operating Region   Element   Sub-Element   NR
Asia               CA        CA123         50 000
America            OBU       EFK456        35 000
Could someone please guide me on how to accomplish this? Any relevant readings/examples would be much appreciated.
With the data frame object "data" built as shown below, you can use the dplyr package to organize the results in many different ways. Here is one example:
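# Build a small example data frame (column names as used in this answer)
data <- data.frame(OperatingRegion = c("Asia", "America"),
                   Region = c("CA", "OBU"),
                   Element = c("CA123", "EFK456"),
                   SubElement = c(50000, 35000))
library(dplyr)
# Sum SubElement within each operating region; %>% replaces the deprecated %.%
results <- data %>%
  group_by(OperatingRegion) %>%
  summarise(SubE = sum(SubElement, na.rm = TRUE))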
Source: local data frame [2 x 2]
OperatingRegion SubE
1 America 35000
2 Asia 50000
After loading the package, you pass the data frame to dplyr and then, using the pipe operator %>% (older dplyr releases used %.%, which has since been deprecated), group_by whichever single or multiple variables you want. Then call summarise to create sums, medians, averages or whatever other computation you need.
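Since the question asks for a breakdown by region, element and sub-element, a minimal sketch of grouping on several variables at once might look like the following. The data frame "revenue" and its column names (OperatingRegion, Element, SubElement, NR) are assumptions modelled on the table in the question, not the asker's real data, and the .groups argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical sample data; column names are assumed from the question's table.
revenue <- data.frame(
  OperatingRegion = c("Asia", "Asia", "America", "America"),
  Element         = c("CA", "CA", "OBU", "OBU"),
  SubElement      = c("CA123", "CA123", "EFK456", "EFK456"),
  NR              = c(50000, 10000, 35000, NA)
)

# Group by all three descriptive columns, then compute several summaries of NR.
summary_by_group <- revenue %>%
  group_by(OperatingRegion, Element, SubElement) %>%
  summarise(
    TotalNR  = sum(NR, na.rm = TRUE),     # total net revenue, ignoring NA
    MeanNR   = mean(NR, na.rm = TRUE),    # average net revenue, ignoring NA
    MedianNR = median(NR, na.rm = TRUE),  # median net revenue, ignoring NA
    Rows     = n(),                       # number of rows in the group, NA included
    .groups  = "drop"                     # return an ungrouped result
  )

print(summary_by_group)
```

Each distinct combination of the three grouping columns yields one row in the result, so on the real 50 000-row data the output would have as many rows as there are region/element/sub-element combinations.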