How to match, replace and sum header rows from another dataset in R?
Question
I have two data sets:
a. A data frame that looks like this:
SpeciesA SpeciesB SpeciesC SpeciesD SpeciesE SpeciesF
Site1 1 0 4 6 2 5
Site2 1 0 4 6 2 5
Site3 1 0 4 6 2 5
Site4 1 0 4 6 2 5
(Note: The row values are NOT identical. This is just for the purpose of representation here)
b. Another data-set that looks like this:
Family Species
Family1 SpeciesA
Family1 SpeciesB
Family1 SpeciesC
Family2 SpeciesD
Family3 SpeciesE
Family4 SpeciesF
I want to match the Family column in data set (b) to the corresponding Species columns in data frame (a), and add up the values under the same Family when it contains multiple species. I know I can use the merge function, but I don't know how to apply it to the header row and then sum the result.
Output after matching and replacing the headers:
Family1 Family1 Family1 Family2 Family3 Family4
Site1 1 0 4 6 2 5
Site2 1 0 4 6 2 5
Site3 1 0 4 6 2 5
Site4 1 0 4 6 2 5
Final output:
Family1 Family2 Family3 Family4
Site1 5 6 2 5
Site2 5 6 2 5
Site3 5 6 2 5
Site4 5 6 2 5
Recommended answer
If I understand correctly, you can reshape your first data.frame from "wide" to "long" format, merge it with the second data.frame, and recast the result to wide format, using appropriate aggregation:
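# Keep the site names as a regular column so they survive the reshape
dfa$id <- row.names(dfa)
# Wide to long: one row per (site, species) pair
mdfa <- reshape2::melt(dfa, id.vars = "id", variable.name = "Species")
# Attach the family to each species, then cast back to wide, summing per family
reshape2::dcast(
  merge(dfb, mdfa, by = "Species"),
  id ~ Family,
  value.var = "value",
  fun.aggregate = sum
)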
# id Family1 Family2 Family3 Family4
# 1 Site1 5 6 2 5
# 2 Site2 5 6 2 5
# 3 Site3 5 6 2 5
# 4 Site4 5 6 2 5
Data:
dfa <- read.table(text = "SpeciesA SpeciesB SpeciesC SpeciesD SpeciesE SpeciesF
Site1 1 0 4 6 2 5
Site2 1 0 4 6 2 5
Site3 1 0 4 6 2 5
Site4 1 0 4 6 2 5",
header = TRUE, stringsAsFactors = FALSE)
dfb <- read.table(text = "Family Species
Family1 SpeciesA
Family1 SpeciesB
Family1 SpeciesC
Family2 SpeciesD
Family3 SpeciesE
Family4 SpeciesF",
header = TRUE, stringsAsFactors = FALSE)
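For comparison, the same match-and-sum can be sketched in base R without reshaping. This is not the answer's method but an alternative that uses match() to look up each column's family and rowsum() to add the columns up per family. The names family and family_totals are made up for illustration, and the sketch assumes every column of dfa has exactly one entry in dfb$Species.

```r
# Example data, copied from the "Data" section above.
dfa <- read.table(text = "SpeciesA SpeciesB SpeciesC SpeciesD SpeciesE SpeciesF
Site1 1 0 4 6 2 5
Site2 1 0 4 6 2 5
Site3 1 0 4 6 2 5
Site4 1 0 4 6 2 5",
header = TRUE, stringsAsFactors = FALSE)
dfb <- read.table(text = "Family Species
Family1 SpeciesA
Family1 SpeciesB
Family1 SpeciesC
Family2 SpeciesD
Family3 SpeciesE
Family4 SpeciesF",
header = TRUE, stringsAsFactors = FALSE)

# "Match and replace": look up the family of each species column.
# Assumption: every column name of dfa appears in dfb$Species.
family <- dfb$Family[match(names(dfa), dfb$Species)]

# rowsum() sums the rows of a matrix by group, so transpose first
# (species become rows), sum within each family, then transpose back.
family_totals <- t(rowsum(t(as.matrix(dfa)), group = family))
family_totals
```

The result is a numeric matrix with sites as row names and one column per family. Wrap it in as.data.frame() if a data frame is needed.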