Merge two factor columns in R
Question
Hi, I am having trouble with something in R. I'm trying to merge (combine?) two (factor) columns in a data frame. For each row, there is a value in only one of the columns, and I want to combine them so that every row has a value. As a simplified example, suppose I've run the following code:
df <- data.frame(x=c("a", "b", " ", " "), y=c(" ", " ", "q", " "), z=c(" ", " ", " ", "p"))
I get a data frame like the following (only the x and y columns and the first three rows are shown):
  x y
1 a
2 b
3   q
After the x and y columns are merged, the result would be:
  x y merged
1 a        a
2 b        b
3   q      q
I have tried using
df$merged = ifelse(df$x == " ", df$y, df$x)
but it gives me these numbers. Any idea what they mean?
  x y merged
1 a        2
2 b        3
3   q      2
All the other helpful information I have come across works well with numbers, but not with characters. Am I on the right track with what I have tried so far?
It seems like such a simple problem, but I have not been able to find a solution. Any help would be appreciated.
Thanks.
Recommended answer
In your example dataset, there are three columns. The approach below can be used when there are multiple columns. (Here, I assume that each row has only a single "value".)
df$merged <- df[cbind(1:nrow(df), max.col(df != ' ', 'first'))]
df
#  x y z merged
#1 a          a
#2 b          b
#3   q        q
#4     p      p
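To see why the one-liner works, it may help to look at the intermediate pieces. The sketch below rebuilds the example data frame and splits the index into steps; the names nonblank, col_idx and idx are only illustrative, and stringsAsFactors = FALSE is my assumption so that the result is the same on any R version.

```r
# Example data, kept as character columns regardless of R version
df <- data.frame(x = c("a", "b", " ", " "),
                 y = c(" ", " ", "q", " "),
                 z = c(" ", " ", " ", "p"),
                 stringsAsFactors = FALSE)

# Logical matrix: TRUE where a cell holds something other than a single space
nonblank <- df != " "

# For each row, the column position of the first TRUE
col_idx <- max.col(nonblank, ties.method = "first")
col_idx
# 1 1 2 3

# Two-column (row, column) matrix; indexing with it picks one cell per row
idx <- cbind(seq_len(nrow(df)), col_idx)
merged <- df[idx]
merged
# "a" "b" "q" "p"
```

Note that a row with no non-blank cell would get column 1 from max.col, so its merged value would simply be the blank from x.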
Alternatively, a looping approach (applied to the original x, y and z columns) would be:
apply(df, 1, function(x) x[x!=' '])
#[1] "a" "b" "q" "p"
If there is more than one "value" per row, you can paste the values together. toString is a wrapper for paste(., collapse=", ").
apply(df,1, function(x) toString(x[x!=' ']))
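No output is shown for this case, so here is a small illustration. The data frame df2 is hypothetical (it is not from the question) and has two values in its first row:

```r
# Hypothetical data: row 1 has values in both x and y
df2 <- data.frame(x = c("a", " ", " "),
                  y = c("b", "q", " "),
                  z = c(" ", " ", "p"),
                  stringsAsFactors = FALSE)

# Keep the non-blank cells of each row and join them with ", "
merged2 <- apply(df2, 1, function(x) toString(x[x != " "]))
merged2
# "a, b" "q" "p"
```

Rows with a single value come out unchanged, so the same call should cover both situations.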
Or you could melt the dataset and then use aggregate to paste the values:
library(reshape2)
aggregate(value~Var1, subset(melt(as.matrix(df)), value!= ' '),
toString)$value
Data
df <- data.frame(x=c("a", "b", " ", " "), y=c(" ", " ", "q", " "),
z=c(" ", " ", " ", "p"))
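As for the numbers from the ifelse attempt in the question: when x and y are factors (the default for data.frame() before R 4.0.0), ifelse() drops the factor attributes and returns the underlying integer level codes rather than the labels. Below is a minimal sketch of one possible fix, converting to character first. The two-column data frame df_f is my assumption, modeled on the tables in the question, with factors forced on.

```r
# Assumed two-column version of the question's data, stored as factors
df_f <- data.frame(x = c("a", "b", " "),
                   y = c(" ", " ", "q"),
                   stringsAsFactors = TRUE)

# ifelse() on factors yields the integer codes of the levels
codes <- ifelse(df_f$x == " ", df_f$y, df_f$x)
codes
# 2 3 2

# Converting to character first gives the intended labels
df_f$merged <- ifelse(df_f$x == " ",
                      as.character(df_f$y),
                      as.character(df_f$x))
df_f$merged
# "a" "b" "q"
```

So the ifelse idea was on the right track; it only needed the values as character rather than factor.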