Refer to a range of columns by name in R
Problem description
I need help with something that might be fairly simple in R. I want to refer to a range of columns in a data frame (e.g., to extract a few select variables), but I don't know their column numbers. Normally, if I wanted to extract columns 4-10, I would write mydata[, 4:10].
Since I don't know the column numbers, I would like to refer to the columns by name. Is there an easy way to do this? In SAS or SPSS it is fairly easy to refer to a range of variables by name. Alternatively, is there an easy way to figure out which column number corresponds to a variable name in R?
Use %in% in combination with names(). It's useful for grabbing a group of columns from a data frame. You can negate the expression when you want to keep just a subset and drop the rest. Type ?"%in%" at the R Console prompt for more details.
set.seed(1234)
mydf <- data.frame(A = runif(5, 1, 2),
                   B = runif(5, 3, 4),
                   C = runif(5, 5, 6),
                   D = runif(5, 7, 8),
                   E = runif(5, 9, 10))
mydf
keep.cols <- c('A','D','E')
mydf[, names(mydf) %in% keep.cols]
drop.cols <- c('A','B','C')
mydf[, !names(mydf) %in% drop.cols]
The data frame:
> mydf
         A        B        C        D        E
1 1.113703 3.640311 5.693591 7.837296 9.316612
2 1.622299 3.009496 5.544975 7.286223 9.302693
3 1.609275 3.232551 5.282734 7.266821 9.159046
4 1.623379 3.666084 5.923433 7.186723 9.039996
5 1.860915 3.514251 5.292316 7.232226 9.218800
A subset of columns:
> mydf[, names(mydf) %in% keep.cols]
         A        D        E
1 1.113703 7.837296 9.316612
2 1.622299 7.286223 9.302693
3 1.609275 7.266821 9.159046
4 1.623379 7.186723 9.039996
5 1.860915 7.232226 9.218800
Keeping a subset of columns and dropping the rest:
> mydf[, !names(mydf) %in% drop.cols]
         D        E
1 7.837296 9.316612
2 7.286223 9.302693
3 7.266821 9.159046
4 7.186723 9.039996
5 7.232226 9.218800
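The question also asks how to find the column number that belongs to a name, and how to take a contiguous range of columns by name. The %in% approach selects a set of named columns rather than a range, so here is a minimal sketch of the other two, using base R only. It rebuilds the same mydf so it runs by itself. The variable names first.col, last.col, range.by.pos and range.by.name are illustrative and not part of the original answer.

```r
set.seed(1234)
mydf <- data.frame(A = runif(5, 1, 2),
                   B = runif(5, 3, 4),
                   C = runif(5, 5, 6),
                   D = runif(5, 7, 8),
                   E = runif(5, 9, 10))

# Look up the column numbers that correspond to two column names.
first.col <- match("B", names(mydf))   # position of column B
last.col  <- match("D", names(mydf))   # position of column D

# Use those positions to extract the contiguous range B through D.
range.by.pos <- mydf[, first.col:last.col]

# subset() accepts a name range directly in its select argument.
range.by.name <- subset(mydf, select = B:D)
```

Both results contain columns B, C and D. Note that match() returns NA for a name that is not present, so it may be worth checking the positions before building the range.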