Comparing 2 datasets in R
Problem description
I have two data sets extracted from a data set called babies2009 (three columns: count, name, gender).
One is girls2009, containing all the girls, and the other is boys2009, containing all the boys.
I want to find out which names are shared between boys and girls.
I tried this:
common.names = (boys2009$name %in% girls2009$name)
When I try
babies2009[common.names, ][1:10, ]
all I get is the girl names, not the common names.
I have confirmed that both data sets do contain boys and girls respectively by looking at the first 10 rows of each:
boys2009[1:10, ]
girls2009[1:10, ]
How else can I compare the two data sets and determine which values they share?
Thanks,
Solution
common.names = (boys2009$name %in% girls2009$name) gives you a logical vector of length length(boys2009$name), with one entry per row of boys2009. So when you use it to select from the much longer data.frame, as in babies2009[common.names, ][1:10, ], you wind up with nonsense.
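The length mismatch is easy to check directly. The following is a minimal sketch on made-up toy data (the names and the row order of babies2009 are assumptions, not the asker's real data); it only shows that the index has one entry per row of boys2009 and therefore cannot line up with the rows of babies2009.

```r
# Hypothetical toy data; the real babies2009 is assumed to hold both groups.
boys2009 <- data.frame(name = c("Billy", "Bob"), gender = "M",
                       stringsAsFactors = FALSE)
girls2009 <- data.frame(name = c("Billy", "Mae", "Sue"), gender = "F",
                        stringsAsFactors = FALSE)
babies2009 <- rbind(girls2009, boys2009)

# TRUE where a boy's name also appears among the girls' names.
common.names <- boys2009$name %in% girls2009$name

length(common.names)  # one entry per row of boys2009
nrow(babies2009)      # more rows than the index has entries

# The index is aligned with the rows of boys2009, not babies2009, so this
# selects rows by position only and says nothing about shared names.
mismatched <- babies2009[common.names, ]
```

Comparing length(common.names) with nrow() of the data.frame being indexed is a quick sanity check before subsetting.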
Solution: use that logical vector on the proper data.frame!
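# Small example data; the column is called name, as in the question
boys2009 <- data.frame(name = c("Billy", "Bob"), data = runif(2), gender = "M", stringsAsFactors = FALSE)
girls2009 <- data.frame(name = c("Billy", "Mae", "Sue"), data = runif(3), gender = "F", stringsAsFactors = FALSE)
babies2009 <- rbind(boys2009, girls2009)
# TRUE for each row of boys2009 whose name also occurs in girls2009
common.names <- (boys2009$name %in% girls2009$name)
# Index the data.frame the logical vector was built from
boys2009[common.names, ]$name
# [1] "Billy"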
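If only the shared names themselves are needed, base R's intersect() returns them directly, and merge() can line up the matching rows from both data sets. This is a sketch of an alternative to the %in% approach above, not part of the original answer; the count values are invented for illustration.

```r
# Toy data mirroring the example above; the count values are made up.
boys2009 <- data.frame(name = c("Billy", "Bob"), count = c(10, 20),
                       gender = "M", stringsAsFactors = FALSE)
girls2009 <- data.frame(name = c("Billy", "Mae", "Sue"), count = c(3, 30, 40),
                        gender = "F", stringsAsFactors = FALSE)

# Names that occur in both data sets, as a plain character vector.
shared <- intersect(boys2009$name, girls2009$name)

# Rows of each data set restricted to the shared names.
boys_shared <- boys2009[boys2009$name %in% shared, ]
girls_shared <- girls2009[girls2009$name %in% shared, ]

# Inner join on name: one row per shared name, with the boys' and girls'
# columns side by side (the suffixes tell the two sources apart).
both <- merge(boys2009, girls2009, by = "name",
              suffixes = c(".boys", ".girls"))
```

intersect() drops duplicates, so it suits a list of distinct names; merge() keeps the per-row data, which helps when the counts for each gender need to be compared.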