How to drop columns by name in a data frame
Question
I have a large data set and I would like to read specific columns or drop all the others.
library(foreign)  # provides read.dta() for Stata files
data <- read.dta("file.dta")
I select the columns that I'm not interested in:
var.out <- names(data)[!names(data) %in% c("iden", "name", "x_serv", "m_serv")]
and then I'd like to do something like:
for(i in 1:length(var.out)) {
paste("data$", var.out[i], sep="") <- NULL
}
to drop all the unwanted columns. Is this the optimal solution?
Recommended answer
You should use either indexing or the subset function. For example:
R> df <- data.frame(x=1:5, y=2:6, z=3:7, u=4:8)
R> df
x y z u
1 1 2 3 4
2 2 3 4 5
3 3 4 5 6
4 4 5 6 7
5 5 6 7 8
Then you can use the which function and the - operator in column indexing:
R> df[ , -which(names(df) %in% c("z","u"))]
x y
1 1 2
2 2 3
3 3 4
4 4 5
5 5 6
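One caveat with the -which() form: if none of the names match, which() returns an empty vector, and the negative index then selects zero columns instead of keeping them all. The sketch below shows that case and a logical-indexing variant that avoids it. The column name "w" is made up for illustration, and drop = FALSE is added so the result stays a data frame even when a single column remains.

```r
df <- data.frame(x = 1:5, y = 2:6, z = 3:7, u = 4:8)

# No column is called "w", so which() returns integer(0) and
# -integer(0) is still integer(0): every column is lost.
ncol(df[ , -which(names(df) %in% c("w"))])   # 0

# Logical negation keeps all columns when nothing matches.
keep <- !(names(df) %in% c("w"))
ncol(df[ , keep, drop = FALSE])              # 4

# The same form drops z and u, as in the example above.
res <- df[ , !(names(df) %in% c("z", "u")), drop = FALSE]
names(res)                                   # "x" "y"
```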
Or, much simpler, use the select argument of the subset function: you can then use the - operator directly on a vector of column names, and you can even omit the quotes around the names!
R> subset(df, select=-c(z,u))
x y
1 1 2
2 2 3
3 3 4
4 4 5
5 5 6
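The unquoted names work because subset() evaluates select in a non-standard way, which is convenient interactively but awkward when the names to drop are stored in a character vector. A minimal sketch of one alternative for that case uses setdiff() with ordinary indexing; drop_cols is a hypothetical variable name, not part of the original answer.

```r
df <- data.frame(x = 1:5, y = 2:6, z = 3:7, u = 4:8)

# drop_cols is a hypothetical variable holding the names to remove.
drop_cols <- c("z", "u")

# setdiff() returns the names of df that are not in drop_cols,
# in their original order.
res <- df[ , setdiff(names(df), drop_cols), drop = FALSE]
names(res)   # "x" "y"
```

Unlike the -which() form, this also keeps every column when drop_cols matches nothing.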
Note that you can also select the columns you want instead of dropping the others:
R> df[ , c("x","y")]
x y
1 1 2
2 2 3
3 3 4
4 4 5
5 5 6
R> subset(df, select=c(x,y))
x y
1 1 2
2 2 3
3 3 4
4 4 5
5 5 6
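Applied to the question, the loop cannot work because paste() only builds a character string, which is not something a value can be assigned to. A possible sketch of a loop-free version assigns NULL to all unwanted columns at once. The data frame here is a made-up stand-in for the contents of "file.dta", and the column "extra" is hypothetical.

```r
# Stand-in for the data read from "file.dta"; the column "extra" and
# all values are invented for illustration.
data <- data.frame(iden = 1:3, name = c("a", "b", "c"),
                   x_serv = c(0.1, 0.2, 0.3), m_serv = c(10, 20, 30),
                   extra = c(TRUE, FALSE, TRUE))

# Same selection as in the question: every column except the four named ones.
var.out <- names(data)[!names(data) %in% c("iden", "name", "x_serv", "m_serv")]

# Assigning NULL to several columns at once removes them; no loop is needed.
data[var.out] <- NULL
names(data)   # "iden" "name" "x_serv" "m_serv"
```

Since the goal is to keep four known columns, selecting them directly with data[ , c("iden", "name", "x_serv", "m_serv")] would give the same result.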