How to match, replace and sum header rows from another dataset in R?


Question

I have two datasets:

a. A data frame that looks like this:

        SpeciesA  SpeciesB  SpeciesC  SpeciesD  SpeciesE  SpeciesF
Site1     1          0        4        6          2        5
Site2     1          0        4        6          2        5
Site3     1          0        4        6          2        5
Site4     1          0        4        6          2        5

(Note: The row values are NOT identical. They are shown this way only for illustration.)

b. Another dataset that looks like this:

Family          Species
Family1         SpeciesA
Family1         SpeciesB
Family1         SpeciesC
Family2         SpeciesD
Family3         SpeciesE
Family4         SpeciesF

I want to match the Family column in dataset (b) to the corresponding Species columns in data frame (a), and add up the values under the same Family when it contains multiple species. I know I can use the merge function, but I don't know how to apply it to the header row and then sum everything.

Intermediate output (species headers replaced by family names)

         Family1    Family1   Family1  Family2  Family3  Family4
Site1     1          0        4        6          2        5
Site2     1          0        4        6          2        5
Site3     1          0        4        6          2        5
Site4     1          0        4        6          2        5

Final output

         Family1      Family2    Family3   Family4
Site1     5             6          2        5           
Site2     5             6          2        5             
Site3     5             6          2        5             
Site4     5             6          2        5     

Answer

If I understand correctly, you can reshape your first data.frame from "wide" to "long" format, merge it with the second data.frame, and recast the result to wide format, using sum as the aggregation function:

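# Keep the site names as a regular column so they survive reshaping
dfa$id <- row.names(dfa)
# Wide -> long: one row per site/species pair
mdfa <- reshape2::melt(dfa, id.vars = "id", variable.name = "Species")

# Attach Family to each row, then cast back to wide, summing within each family
reshape2::dcast(
    merge(dfb, mdfa, by = "Species"), 
    id ~ Family, 
    value.var = "value",
    fun.aggregate = sum
)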
#      id Family1 Family2 Family3 Family4
# 1 Site1       5       6       2       5
# 2 Site2       5       6       2       5
# 3 Site3       5       6       2       5
# 4 Site4       5       6       2       5
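
If reshape2 is not available, the same wide -> long -> merge -> wide steps can be sketched in base R. This is only an illustration under a few assumptions: the object names site_counts and species_family are made up for this example, and the data is the question's sample data rebuilt by hand so the block runs on its own.

```r
# Hypothetical stand-ins for the question's two datasets (same sample values).
site_counts <- data.frame(
  SpeciesA = rep(1, 4), SpeciesB = rep(0, 4), SpeciesC = rep(4, 4),
  SpeciesD = rep(6, 4), SpeciesE = rep(2, 4), SpeciesF = rep(5, 4),
  row.names = paste0("Site", 1:4)
)
species_family <- data.frame(
  Family  = c("Family1", "Family1", "Family1", "Family2", "Family3", "Family4"),
  Species = paste0("Species", LETTERS[1:6]),
  stringsAsFactors = FALSE
)

# Wide -> long: stack() returns `values` plus `ind` (the former column name),
# going through the columns one after another.
long <- stack(site_counts)
long$Species <- as.character(long$ind)
long$ind <- NULL
long$id <- rep(row.names(site_counts), times = ncol(site_counts))

# Attach the family of each species.
long <- merge(long, species_family, by = "Species")

# Long -> wide: xtabs() sums `values` within each site/family combination.
by_family <- xtabs(values ~ id + Family, data = long)
by_family
```

The result is a contingency table rather than a data.frame, with sites as rows and families as columns. For the sample data, the Family1 column should hold 5 for every site, matching the dcast output above.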


Data:

dfa <- read.table(text = "SpeciesA  SpeciesB  SpeciesC  SpeciesD  SpeciesE  SpeciesF
Site1     1          0        4        6          2        5
Site2     1          0        4        6          2        5
Site3     1          0        4        6          2        5
Site4     1          0        4        6          2        5",
header = TRUE, stringsAsFactors = FALSE)

dfb <- read.table(text = "Family          Species
Family1         SpeciesA
Family1         SpeciesB
Family1         SpeciesC
Family2         SpeciesD
Family3         SpeciesE
Family4         SpeciesF",
header = TRUE, stringsAsFactors = FALSE)
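
As a possible shortcut that follows the question's own "match, replace, sum" wording more literally, the family of each column can be looked up with match() and the columns summed by group with rowsum(). This is a sketch, not part of the original answer; site_counts, species_family, family_of_column and the other object names are hypothetical, and the data is again rebuilt from the question's sample so the block is self-contained.

```r
# Hypothetical stand-ins for the question's two datasets (same sample values).
site_counts <- data.frame(
  SpeciesA = rep(1, 4), SpeciesB = rep(0, 4), SpeciesC = rep(4, 4),
  SpeciesD = rep(6, 4), SpeciesE = rep(2, 4), SpeciesF = rep(5, 4),
  row.names = paste0("Site", 1:4)
)
species_family <- data.frame(
  Family  = c("Family1", "Family1", "Family1", "Family2", "Family3", "Family4"),
  Species = paste0("Species", LETTERS[1:6]),
  stringsAsFactors = FALSE
)

# Match: find the family of every species column, in column order.
family_of_column <- species_family$Family[
  match(names(site_counts), species_family$Species)
]

# Replace: the intermediate table with family names as (duplicated) headers.
renamed <- site_counts
names(renamed) <- family_of_column

# Sum: rowsum() adds up rows by group, so transpose to put species on rows,
# sum within each family, then transpose back to sites x families.
family_totals <- t(rowsum(t(as.matrix(site_counts)), group = family_of_column))
family_totals
```

Note that match() returns NA for any species missing from the lookup table, so it is worth checking anyNA(family_of_column) on real data before summing.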
