Change values in multiple columns of a dataframe using a lookup table

Problem description

I am trying to change the value of a number of columns at once using a lookup table. They all use the same lookup table. I know how to do this for just one column -- I'd just use a merge, but am having trouble with multiple columns.

Below is an example dataframe and an example lookup table. My actual data is much larger (~10K columns with 8 rows).

example <- data.frame(a = seq(1,5), b = seq(5,1), c=c(1,4,3,2,5))

lookup <- data.frame(number = seq(1,5), letter = LETTERS[seq(1,5)])

Ideally, I would end up with a dataframe which looks like this:

example_of_ideal_output <- data.frame(a = LETTERS[seq(1,5)], b = LETTERS[seq(5,1)], c=LETTERS[c(1,4,3,2,5)])

Of course, in my actual data the dataframe is numbers, but the lookup table is a lot more complicated, so I can't just use a function like LETTERS to solve things.

Thanks in advance!

Recommended answer

Here's a solution that works on each column successively using lapply():

as.data.frame(lapply(example, function(col) lookup$letter[match(col, lookup$number)]))
##   a b c
## 1 A E A
## 2 B D D
## 3 C C C
## 4 D B B
## 5 E A E
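
As a small variation on the lapply() approach, the sketch below assigns the result back into the data.frame with `[] <-`, which keeps the original object and its column names, and shows how the same idea might be limited to a subset of columns. The helper name `recode_col` and the `cols` vector are hypothetical names introduced for illustration, and `stringsAsFactors = FALSE` is an assumption made so the result is character rather than factor on any R version.

```r
example <- data.frame(a = seq(1, 5), b = seq(5, 1), c = c(1, 4, 3, 2, 5))
lookup <- data.frame(number = seq(1, 5), letter = LETTERS[seq(1, 5)],
                     stringsAsFactors = FALSE)  # assumption: keep letters as character

# Hypothetical helper: translate one column through the lookup table.
# Values that do not appear in lookup$number come back as NA.
recode_col <- function(col) lookup$letter[match(col, lookup$number)]

# Assigning into result[] replaces every column but keeps the data.frame
# structure and the column names.
result <- example
result[] <- lapply(result, recode_col)

# To recode only some columns, index them on both sides of the assignment.
cols <- c("a", "c")  # assumed subset, for illustration only
partial <- example
partial[cols] <- lapply(partial[cols], recode_col)
```

Because match() returns NA for values with no entry in the lookup table, unmatched cells end up as NA rather than raising an error, which may or may not be what you want for real data.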

Alternatively, if you don't mind switching over to a matrix, you can achieve a "more vectorized" solution, as a matrix will allow you to call match() and index lookup$letter just once for the entire input:

matrix(lookup$letter[match(as.matrix(example), lookup$number)], nrow(example))
##      [,1] [,2] [,3]
## [1,] "A"  "E"  "A"
## [2,] "B"  "D"  "D"
## [3,] "C"  "C"  "C"
## [4,] "D"  "B"  "B"
## [5,] "E"  "A"  "E"

(And of course you can coerce back to data.frame via as.data.frame() afterward, although you'll have to restore the column names as well if you want them, which can be done with setNames(...,names(example)). But if you really want to stick with a data.frame, my first solution is probably preferable.)
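
For completeness, here is a minimal sketch of the round trip described above: run the matrix lookup, coerce back with as.data.frame(), and restore the column names with setNames(). The variable names `m` and `back` are only illustrative, and `stringsAsFactors = FALSE` is an assumption made so the columns stay character on older R versions.

```r
example <- data.frame(a = seq(1, 5), b = seq(5, 1), c = c(1, 4, 3, 2, 5))
lookup <- data.frame(number = seq(1, 5), letter = LETTERS[seq(1, 5)],
                     stringsAsFactors = FALSE)  # assumption: keep letters as character

# One match() call over the whole input, reshaped to the original row count.
m <- matrix(lookup$letter[match(as.matrix(example), lookup$number)], nrow(example))

# Coerce back to a data.frame and restore the original column names,
# which the matrix() call dropped.
back <- setNames(as.data.frame(m, stringsAsFactors = FALSE), names(example))
```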
