Replace values in a dataframe based on lookup table
Problem description
I am having some trouble replacing values in a dataframe. I would like to replace values based on a separate table. Below is an example of what I am trying to do.
I have a table where every row is a customer and every column is an animal they purchased. Let's call this dataframe table.
> table
#       P1     P2     P3
# 1    cat lizard parrot
# 2 lizard parrot    cat
# 3 parrot    cat lizard
I also have a table that I will reference, called lookUp.
> lookUp
#      pet   class
# 1    cat  mammal
# 2 lizard reptile
# 3 parrot    bird
What I want to do is create a new table called new, using a function that replaces every value in table with the matching entry from the class column of lookUp. I tried this myself using lapply, but I got the following warnings.
new <- as.data.frame(lapply(table, function(x) {
gsub('.*', lookUp[match(x, lookUp$pet) ,2], x)}), stringsAsFactors = FALSE)
Warning messages:
1: In gsub(".*", lookUp[match(x, lookUp$pet), 2], x) :
argument 'replacement' has length > 1 and only the first element will be used
2: In gsub(".*", lookUp[match(x, lookUp$pet), 2], x) :
argument 'replacement' has length > 1 and only the first element will be used
3: In gsub(".*", lookUp[match(x, lookUp$pet), 2], x) :
argument 'replacement' has length > 1 and only the first element will be used
Any ideas on how to make this work?
Recommended answer
You posted an approach in your question which was not bad. Here's a similar approach:
new <- df # create a copy of df
# using lapply, loop over columns and match values to the look up table. store in "new".
new[] <- lapply(df, function(x) look$class[match(x, look$pet)])
An alternative approach which will be faster is:
new <- df
new[] <- look$class[match(unlist(df), look$pet)]
Note that I use empty brackets ([]) in both cases to keep the structure of new as it was (a data.frame).
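For reference, here is a minimal self-contained sketch of both approaches. It rebuilds the example data from the question; the data.frame() calls are an assumption about how the original tables were created, and stringsAsFactors = FALSE is set explicitly so the columns are plain character vectors. The names new1, new2 and flat are only for illustration.

```r
# Rebuild the example data from the question (construction is assumed).
df <- data.frame(
  P1 = c("cat", "lizard", "parrot"),
  P2 = c("lizard", "parrot", "cat"),
  P3 = c("parrot", "cat", "lizard"),
  stringsAsFactors = FALSE
)
look <- data.frame(
  pet   = c("cat", "lizard", "parrot"),
  class = c("mammal", "reptile", "bird"),
  stringsAsFactors = FALSE
)

# Approach 1: look up each column separately.
new1 <- df
new1[] <- lapply(df, function(x) look$class[match(x, look$pet)])

# Approach 2: flatten all columns into one vector and look it up in one call.
new2 <- df
new2[] <- look$class[match(unlist(df), look$pet)]

# Without the empty brackets, the result is just a plain character vector,
# so assigning it to a name directly would not give a data.frame.
flat <- look$class[match(unlist(df), look$pet)]

print(new1)
print(class(new2))  # still a data.frame
print(class(flat))  # a character vector
```

Because unlist(df) walks the columns in order, the flattened result is written back column by column when assigned into new2[].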
(I'm using df instead of table and look instead of lookUp in my answer.)
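One edge case worth keeping in mind: match() returns NA for any value that has no entry in the lookup table, so such cells would become NA in new. The sketch below is one possible way to keep the original value instead. The "dog" entry is a hypothetical value added purely for illustration and is not part of the question's data.

```r
# Hypothetical data: "dog" has no entry in the lookup table.
df <- data.frame(
  P1 = c("cat", "dog"),
  P2 = c("parrot", "cat"),
  stringsAsFactors = FALSE
)
look <- data.frame(
  pet   = c("cat", "parrot"),
  class = c("mammal", "bird"),
  stringsAsFactors = FALSE
)

new <- df
new[] <- lapply(df, function(x) {
  idx <- match(x, look$pet)                # NA where x is not in look$pet
  ifelse(is.na(idx), x, look$class[idx])   # fall back to the original value
})

print(new)
```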