Merge two factor columns in R


Question

Hi, I am having trouble with something in R. I'm trying to merge (combine?) two (factor) columns in a data frame. For each row, there is a value in only one of the columns, and I want to combine them so that every row has a value. As a simplified example, suppose I've run the following code: df <- data.frame(x=c("a", "b", " ", " "), y=c(" ", " ", "q", " "), z=c(" ", " ", " ", "p")). I get the following data frame:

    x   y
1   a   
2   b        
3       q

After the x and y columns are merged, the result would be:

  x y merged
1 a        a
2 b        b
3   q      q

I have tried using df$merged = ifelse(df$x == " ", df$y, df$x), but it gives me these numbers. Any idea what they mean?

  x y merged
1 a        2
2 b        3
3   q      2

All the other helpful information I have come across works well with numbers, but not with characters. Am I on the right track with what I have tried so far?

It seems like such a simple problem, but I have not been able to find a solution. Any help would be appreciated.

Thanks.

Recommended answer

In your example dataset, there were three columns. The approach below can be used when there are multiple columns. (Here, I assume that you have only a single "value" in each row.)

df$merged <- df[cbind(1:nrow(df),max.col(df!=' ', 'first'))]
df
#  x y z merged
#1 a          a
#2 b          b
#3   q        q
#4     p      p
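To make the one-liner above easier to follow, here is a minimal sketch that splits it into steps. The intermediate names (non_blank, col_idx, idx) are only illustrative and not part of the original answer, and the sketch assumes every row has at least one non-blank cell; for an all-blank row, max.col with ties.method = "first" would simply point at the first column.

```r
df <- data.frame(x = c("a", "b", " ", " "), y = c(" ", " ", "q", " "),
                 z = c(" ", " ", " ", "p"))

# Logical matrix: TRUE where a cell is something other than a single space.
non_blank <- df != " "

# For each row, the position of the first column that holds TRUE.
col_idx <- max.col(non_blank, ties.method = "first")

# A two-column (row, column) matrix selects exactly one cell per row.
idx <- cbind(seq_len(nrow(df)), col_idx)
merged <- as.matrix(df)[idx]
merged
```

Going through as.matrix(df) gives character values whether the columns are stored as factors or as character vectors.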

Or a looping approach would be:

apply(df, 1, function(x) x[x!=' '])
#[1] "a" "b" "q" "p"

If there is more than one "value" per row, you can paste the values together. toString is a wrapper for paste(., collapse=", ").

apply(df,1, function(x) toString(x[x!=' ']))
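As a small illustration of the multi-value case, the sketch below uses a made-up data frame (df_multi is hypothetical, not from the question) in which the first row holds two values:

```r
# Hypothetical data: row 1 has two non-blank cells, row 2 has one.
df_multi <- data.frame(x = c("a", " "), y = c("b", " "), z = c(" ", "p"))

# Drop the blank cells in each row and join what is left with ", ".
res <- apply(df_multi, 1, function(row) toString(row[row != " "]))
res
```

Because toString always returns a single string, the result stays a plain character vector even when rows contain different numbers of values.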

Or you could melt the dataset and then use aggregate to paste the values:

library(reshape2)
aggregate(value~Var1, subset(melt(as.matrix(df)), value!= ' '), 
                      toString)$value

Data

df <- data.frame(x=c("a", "b", " ", " "), y=c(" ", " ", "q", " "), 
                z=c(" ", " ", " ", "p"))
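As for the numbers returned by the ifelse attempt in the question: they are most likely the factors' internal integer level codes, since ifelse drops the factor attributes. A minimal sketch, assuming the columns are factors (stringsAsFactors = TRUE is set explicitly here because R 4.0 and later no longer convert strings to factors by default; df2, codes and merged are illustrative names):

```r
# Two factor columns, as in the question's printed example.
df2 <- data.frame(x = c("a", "b", " "), y = c(" ", " ", "q"),
                  stringsAsFactors = TRUE)

# ifelse() discards the factor attributes, so this yields level codes,
# not labels.
codes <- ifelse(df2$x == " ", df2$y, df2$x)

# Converting to character first keeps the labels.
merged <- ifelse(df2$x == " ", as.character(df2$y), as.character(df2$x))
merged
```

So the ifelse idea was on the right track; it only needs as.character (or character columns to begin with).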
