R: Count unique values by category
Question
I have data in R that looks like this:
  Cnty   Yr   Plt       Spp  DBH Ht Age
1  185 1999 20001 Bitternut  8.0 54  47
2  185 1999 20001 Bitternut  7.2 55  50
3   31 1999 20001    Pignut  7.4 71  60
4   31 1999 20001    Pignut 11.4 85 114
5  189 1999 20001        WO 14.5 80  82
6  189 1999 20001        WO 12.1 72  79
I would like to know the number of unique species (Spp) in each county (Cnty). "unique(dfname$Spp)" gives me the unique species for the whole data frame, but I would like the count broken down by county.
Any help is appreciated! Sorry for the weird formatting; this is my first ever question on SO.
Thanks.
Recommended answer
I've tried to make your sample data a little more interesting. As posted, it has just one unique "Spp" per "Cnty".
set.seed(1)
mydf <- data.frame(
  Cnty = rep(c("185", "31", "189"), times = c(5, 3, 2)),
  Yr = c(rep(c("1999", "2000"), times = c(3, 2)),
         "1999", "1999", "2000", "2000", "2000"),
  Plt = "20001",
  Spp = sample(c("Bitternut", "Pignut", "WO"), 10, replace = TRUE),
  DBH = runif(10, 0, 15)
)
mydf
#    Cnty   Yr   Plt       Spp       DBH
# 1   185 1999 20001 Bitternut  3.089619
# 2   185 1999 20001    Pignut  2.648351
# 3   185 1999 20001    Pignut 10.305343
# 4   185 2000 20001        WO  5.761556
# 5   185 2000 20001 Bitternut 11.547621
# 6    31 1999 20001        WO  7.465489
# 7    31 1999 20001        WO 10.764278
# 8    31 2000 20001    Pignut 14.878591
# 9   189 2000 20001    Pignut  5.700528
# 10  189 2000 20001 Bitternut 11.661678
Next, as suggested, tapply is a good candidate here. Combine unique and length to get the counts you are looking for.
with(mydf, tapply(Spp, Cnty, FUN = function(x) length(unique(x))))
# 185 189  31
#   3   2   2
with(mydf, tapply(Spp, list(Cnty, Yr), FUN = function(x) length(unique(x))))
#     1999 2000
# 185    2    2
# 189   NA    2
# 31     1    1
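If a data frame is easier to work with than the named vector or matrix that tapply returns, base R's aggregate can apply the same length(unique(x)) idea. The following is only a sketch: the data frame name trees and the result names by_cnty and by_cnty_yr are made up for illustration, and the rows are typed in by hand from the printed output above rather than regenerated with set.seed(1), since sample() changed its default algorithm in R 3.6.0 and the random draws may not match on newer versions.

```r
# Hand-typed copy of the Cnty, Yr and Spp columns printed above
# ("trees" is a hypothetical name, not from the original answer)
trees <- data.frame(
  Cnty = rep(c("185", "31", "189"), times = c(5, 3, 2)),
  Yr   = c("1999", "1999", "1999", "2000", "2000",
           "1999", "1999", "2000", "2000", "2000"),
  Spp  = c("Bitternut", "Pignut", "Pignut", "WO", "Bitternut",
           "WO", "WO", "Pignut", "Pignut", "Bitternut")
)

# Number of unique species per county, returned as a data frame
by_cnty <- aggregate(Spp ~ Cnty, data = trees,
                     FUN = function(x) length(unique(x)))
by_cnty

# Same count per county/year combination; combinations that never
# occur (such as county 189 in 1999) are simply absent from the result
by_cnty_yr <- aggregate(Spp ~ Cnty + Yr, data = trees,
                        FUN = function(x) length(unique(x)))
by_cnty_yr
```

The counts should agree with the tapply results above; the main difference is the long, one-row-per-group layout, which tends to be more convenient for merging or plotting.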
If you're interested in simple tabulation (not of unique values), then you can explore table and ftable:
with(mydf, table(Spp, Cnty))
#            Cnty
# Spp         185 189 31
#   Bitternut   2   1  0
#   Pignut      2   1  1
#   WO          1   0  2
ftable(mydf, row.vars="Spp", col.vars=c("Cnty", "Yr"))
#           Cnty  185       189        31
#           Yr   1999 2000 1999 2000 1999 2000
# Spp
# Bitternut         1    1    0    1    0    0
# Pignut            2    0    0    1    0    1
# WO                0    1    0    0    2    0
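The tabulation and the unique counts are closely related: a species occurs in a county exactly when its cell in the Spp-by-Cnty table is nonzero, so counting the nonzero cells in each column gives the number of unique species per county. A minimal sketch of that idea, again using a hand-typed copy of the data (the names trees, tab and unique_per_cnty are assumptions for illustration, not part of the original answer):

```r
# Hand-typed copy of the Cnty and Spp columns from the sample data
# ("trees" is a hypothetical name)
trees <- data.frame(
  Cnty = rep(c("185", "31", "189"), times = c(5, 3, 2)),
  Spp  = c("Bitternut", "Pignut", "Pignut", "WO", "Bitternut",
           "WO", "WO", "Pignut", "Pignut", "Bitternut")
)

# Cross-tabulate species (rows) against county (columns)
tab <- with(trees, table(Spp, Cnty))

# A cell > 0 means the species was seen in that county, so the
# column sums of the logical matrix count unique species per county
unique_per_cnty <- colSums(tab > 0)
unique_per_cnty
# 185: 3, 189: 2, 31: 2
```

This route is mostly useful when the table has already been built for another purpose; otherwise the tapply approach is more direct.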