Replacing values in a column of a data.frame with values from another data.frame

Problem description

My situation is that I have a data frame with a column filled with the integers 1 to 6. I would like to replace these integers with more descriptive labels, provided in another data frame which acts as a "key":

  V1                 V2
1  1             LABEL1
2  2             LABEL2
3  3             LABEL3
4  4             LABEL4
5  5             LABEL5
6  6             LABEL6

So whenever I find a number 1 in the first data frame column (df$colX), I want to replace it with LABEL1 (i.e., label column 2, where df$colX == label column 1).

I have tried replace(df$colX,labels[,1],labels[,2]) but this just turns the integers into quoted integers for some reason.

I could do this with a for loop, but that seems very slow.

I have also followed some advice on StackOverflow about factors, but none of the columns I'm working with here seem to involve factors (read with stringsAsFactors = FALSE). Any ideas?

Solution

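You could try match(), which returns, for each value of df$colX, the row of labels whose first column holds that value: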

 df$colX <- labels[,2][match(df$colX, labels[,1])]
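To make the difference concrete, here is a minimal sketch on hypothetical toy data (the names `key`, `x`, `LOW`, `MID` and `HIGH` are made up for illustration and are not from the post). It shows why replace() produced quoted integers, since replace(x, list, values) assigns values at the positions given by list rather than matching on content, and what match() returns instead.

```r
# Hypothetical toy data, not taken from the original post
key <- data.frame(V1 = 1:3, V2 = c("LOW", "MID", "HIGH"),
                  stringsAsFactors = FALSE)
x <- c(2L, 3L, 1L, 4L, 2L)

# replace() performs x[key$V1] <- key$V2, i.e. it overwrites positions 1..3
# and coerces the remaining integers to character ("quoted integers")
replaced <- replace(x, key$V1, key$V2)
print(replaced)

# match() looks up each value of x in key$V1 and returns the matching row,
# or NA when the value (here 4) has no entry in the key
idx <- match(x, key$V1)
print(idx)

# Indexing the label column with those rows gives the relabelled vector
relabelled <- key$V2[idx]
print(relabelled)
```

Values that are missing from the key come back as NA, so it may be worth checking anyNA(relabelled) before overwriting the original column.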

Or, because the keys in labels[,1] are simply 1 to 6 in row order, indexing the label column by position should also work:

 labels[,2][df$colX]
 #[1] "LABEL3" "LABEL5" "LABEL1" "LABEL6" "LABEL1" "LABEL6" "LABEL4" "LABEL3"
 #[9] "LABEL1" "LABEL2" "LABEL2" "LABEL3" "LABEL6" "LABEL4" "LABEL5" "LABEL1"
 #[17] "LABEL4" "LABEL5" "LABEL3" "LABEL5" "LABEL1" "LABEL3" "LABEL1" "LABEL1"
 #[25] "LABEL2"
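The positional shortcut depends on the codes being exactly the row numbers of the key. As a hedged illustration with a hypothetical key whose codes are 10, 20 and 30 (again, made-up names and values), positional indexing returns only NA, while a lookup by name or via match() still works:

```r
# Hypothetical key whose codes are NOT 1..n
key <- data.frame(V1 = c(10L, 20L, 30L), V2 = c("A", "B", "C"),
                  stringsAsFactors = FALSE)
x <- c(20L, 10L, 30L)

# Positional indexing asks for elements 20, 10 and 30 of a length-3 vector
by_position <- key$V2[x]
print(by_position)

# Named-vector lookup: labels are named by their code, then indexed by name
lookup <- setNames(key$V2, key$V1)
by_name <- unname(lookup[as.character(x)])
print(by_name)

# match() gives the same result without building the named vector
by_match <- key$V2[match(x, key$V1)]
print(by_match)
```

For the data in the question either form gives the same answer; match() is simply the safer default when the key might be reordered or have gaps.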

data

 labels <- structure(list(V1 = 1:6, V2 = c("LABEL1", "LABEL2", "LABEL3", 
 "LABEL4", "LABEL5", "LABEL6")), .Names = c("V1", "V2"), class = "data.frame", row.names = c("1", 
 "2", "3", "4", "5", "6"))

 set.seed(25)
 df <- data.frame(colX= sample(1:6,25, replace=TRUE), colY=rnorm(25))
