R:按类别计数唯一值 [英] R: Count unique values by category
问题描述
我在R中有以下数据:
Cnty Yr Plt Spp DBH Ht Age
1 185 1999 20001 Bitternut 8.0 54 47
2 185 1999 20001 Bitternut 7.2 55 50
3 31 1999 20001 Pignut 7.4 71 60
4 31 1999 20001 Pignut 11.4 85 114
5 189 1999 20001 WO 14.5 80 82
6 189 1999 20001 WO 12.1 72 79
我想知道每个县(Cnty)的独特物种数量(Spp)。 unique(dfname $ Spp)给出了在数据框中的唯一物种的总数,但我想由县。
I would like to know the quantity of unique species (Spp) in each county (Cnty). "unique(dfname$Spp)" gives me a total count of unique species in the data frame, but I would like it by county.
任何帮助是赞赏!对于古怪的格式,这是我对SO的第一个问题。
Any help is appreciated! Sorry for the weird formatting, this is my first ever question on SO.
谢谢。
推荐答案
我试图让你的示例数据有点更有趣。您的示例数据目前每个Cnty只有一个唯一的Spp。
I've tried to make your sample data a little bit more interesting. Your sample data presently has just one unique "Spp" per "Cnty".
set.seed(1)
mydf <- data.frame(
Cnty = rep(c("185", "31", "189"), times = c(5, 3, 2)),
Yr = c(rep(c("1999", "2000"), times = c(3, 2)),
"1999", "1999", "2000", "2000", "2000"),
Plt = "20001",
Spp = sample(c("Bitternut", "Pignut", "WO"), 10, replace = TRUE),
DBH = runif(10, 0, 15)
)
mydf
# Cnty Yr Plt Spp DBH
# 1 185 1999 20001 Bitternut 3.089619
# 2 185 1999 20001 Pignut 2.648351
# 3 185 1999 20001 Pignut 10.305343
# 4 185 2000 20001 WO 5.761556
# 5 185 2000 20001 Bitternut 11.547621
# 6 31 1999 20001 WO 7.465489
# 7 31 1999 20001 WO 10.764278
# 8 31 2000 20001 Pignut 14.878591
# 9 189 2000 20001 Pignut 5.700528
# 10 189 2000 20001 Bitternut 11.661678
接下来,根据建议, tapply
是一个很好的候选人。合并 unique
和 length
以获取您要查找的数据。
Next, as suggested, tapply
is a good candidate here. Combine unique
and length
to get the data you are looking for.
with(mydf, tapply(Spp, Cnty, FUN = function(x) length(unique(x))))
# 185 189 31
# 3 2 2
with(mydf, tapply(Spp, list(Cnty, Yr), FUN = function(x) length(unique(x))))
# 1999 2000
# 185 2 2
# 189 NA 2
# 31 1 1
'感兴趣的简单列表(不是唯一值),那么你可以探索表
和 ftable
:
If you're interested in simple tabulation (not of unique values), then you can explore table
and ftable
:
with(mydf, table(Spp, Cnty))
# Cnty
# Spp 185 189 31
# Bitternut 2 1 0
# Pignut 2 1 1
# WO 1 0 2
ftable(mydf, row.vars="Spp", col.vars=c("Cnty", "Yr"))
# Cnty 185 189 31
# Yr 1999 2000 1999 2000 1999 2000
# Spp
# Bitternut 1 1 0 1 0 0
# Pignut 2 0 0 1 0 1
# WO 0 1 0 0 2 0
这篇关于R:按类别计数唯一值的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!