Merge 4 data objects with different columns (variables) in R


Question

So initially I had the following object:

> head(gs)
  year disturbance lek_id  complex tot_male
1 2006           N     3T  Diamond        3
2 2007           N     3T  Diamond       17
3 1981           N   bare 3corners        4
4 1982           N   bare 3corners        7
5 1983           N   bare 3corners        2
6 1985           N   bare 3corners        5

With that, I computed summary statistics (min, max, mean, and sd) of tot_male for each year within each complex. I used R's data-splitting functions (aggregate()), assigned sensible column names where appropriate, and ended up with a separate object for each statistic.

> tyc_min = aggregate(gs$tot_male, by=list(gs$year, gs$complex), FUN=min)
> names(tyc_min) = c("year", "complex", "tot_male_min")
> tyc_max = aggregate(gs$tot_male, by=list(gs$year, gs$complex), FUN=max)
> names(tyc_max) = c("year", "complex", "tot_male_max")
> tyc_mean = aggregate(gs$tot_male, by=list(gs$year, gs$complex), FUN=mean)
> names(tyc_mean) = c("year", "complex", "tot_male_mean")
> tyc_sd = aggregate(gs$tot_male, by=list(gs$year, gs$complex), FUN=sd)
> names(tyc_sd) = c("year", "complex", "tot_male_sd")

Example output (2nd object, tyc_max):

  year  complex tot_male_max
1 2003                     0
2 1970 3corners           26
3 1971 3corners           22
4 1972 3corners           26
5 1973 3corners           32
6 1974 3corners           18

Now I also need to add the number of samples per year/complex combination. Then I need to merge these into a single data object and export it as a .csv file.

I know I need to use the merge() function along with all.y, but I have no idea how to handle this error:

Error in fix.by(by.x, x) : 
  'by' must specify one or more columns as numbers, names or logical

Or, alternatively, how do I add the number of samples per year and complex? Any suggestions?

Recommended answer

This might work (but it's hard to check without a reproducible example):

gsnew <- Reduce(function(...) merge(..., all = TRUE, by = c("year","complex")), 
                list(tyc_min, tyc_max, tyc_mean, tyc_sd))
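To make the Reduce() call less opaque, here is a small sketch showing that it is just a chain of pairwise merges. The toy tables and their values are made up for illustration; only the column names follow the question. The last two lines are a guess at where the error in the question comes from, since the failing call isn't shown: merge() takes exactly two data frames, so a third one passed positionally ends up in the `by` argument.

```r
# Toy stand-ins for the per-statistic tables (values are made up for illustration)
tyc_min  <- data.frame(year = c(1970, 1971), complex = "3corners", tot_male_min  = c(2, 3))
tyc_max  <- data.frame(year = c(1970, 1971), complex = "3corners", tot_male_max  = c(26, 22))
tyc_mean <- data.frame(year = c(1970, 1971), complex = "3corners", tot_male_mean = c(11.5, 9))

keys <- c("year", "complex")

# Reduce() merges the first two tables, then merges that result with the third
folded <- Reduce(function(x, y) merge(x, y, by = keys, all = TRUE),
                 list(tyc_min, tyc_max, tyc_mean))

# The same thing written out by hand as nested merge() calls
nested <- merge(merge(tyc_min, tyc_max, by = keys, all = TRUE),
                tyc_mean, by = keys, all = TRUE)

same_result <- identical(folded, nested)

# Assumption about the question's error: a third data frame lands in `by`,
# which merge() rejects because it is not a number, name or logical
res <- try(merge(tyc_min, tyc_max, tyc_mean), silent = TRUE)
failed <- inherits(res, "try-error")
```

Using all = TRUE keeps year/complex combinations that are missing from one of the tables, which is probably what the all.y attempt in the question was after.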

But instead of aggregating each statistic separately and then merging, you can also aggregate everything at once into a new data frame / data.table with, for example, data.table, dplyr, or base R. Then you don't have to merge afterwards (for a base R solution, see the other answer):

library(data.table)
# compute all statistics per year/complex group in a single pass
gsnew <- setDT(gs)[, .(male_min = min(tot_male),
                       male_max = max(tot_male),
                       male_mean = mean(tot_male),
                       male_sd = sd(tot_male)), by = .(year, complex)]

library(dplyr)
gsnew <- gs %>% group_by(year, complex) %>%
  summarise(male_min = min(tot_male),
            male_max = max(tot_male),
            male_mean = mean(tot_male),
            male_sd = sd(tot_male))
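The question also asks for the number of samples per year/complex combination and for a .csv export, so here is a rough base R sketch of both. It assumes gs looks like the head(gs) output in the question; the toy values and the output file name (gs_summary.csv, written to a temporary directory) are made up for illustration.

```r
# Toy data shaped like head(gs) in the question (values are made up)
gs <- data.frame(
  year     = c(2006, 2006, 2007, 1981, 1981, 1981),
  complex  = c("Diamond", "Diamond", "Diamond", "3corners", "3corners", "3corners"),
  tot_male = c(3, 5, 17, 4, 7, 2)
)

# FUN returns a named vector, so aggregate() stores the statistics as one
# matrix column; length(x) gives the number of samples in each group
gsnew <- aggregate(tot_male ~ year + complex, data = gs,
                   FUN = function(x) c(min = min(x), max = max(x),
                                       mean = mean(x), sd = sd(x), n = length(x)))

# Flatten the matrix column into ordinary columns (tot_male.min, tot_male.n, ...)
gsnew <- do.call(data.frame, gsnew)

# Hypothetical output path; replace with wherever the file should go
out_file <- file.path(tempdir(), "gs_summary.csv")
write.csv(gsnew, out_file, row.names = FALSE)
```

Groups with a single sample get NA for sd. In the data.table and dplyr versions above, the count can be added as one more summary column with .N and n() respectively, and the result written out with the same write.csv() call.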
