Match and replace many values in data.table

Problem description

I have a dataset with many misnamed entries. I created a two-column .csv file with the old (incorrect) names in one column and the corresponding new (correct) names in the other. Now I need to tell R to replace every old name in the data with the correct name.

library(data.table)

testData = data.table(oldName = c("Nu York", "Was DC", "Buston", "Nu York"))
replacements = data.table(oldName = c("Buston", "Nu York", "Was DC"),
    newName = c("Boston", "New York", "Washington DC"))

# The next line fails: == compares the two columns element by element
# (lengths 4 and 3) instead of looking each name up in the replacement table.
holder = replace(testData, testData[, oldName] == replacements[, oldName],
    replacements[, newName])


Recommended answer

Here is how I would do the replacement:

setkey(testData, oldName)
setkey(replacements, oldName)

testData[replacements, oldName := newName]
testData
#         oldName
#1:        Boston
#2:      New York
#3:      New York
#4: Washington DC
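
The keyed join above sorts testData by oldName as a side effect of setkey. As a possible alternative, and not the answer's own method, the same update can be sketched as an ad hoc join with the `on` argument, which leaves the rows where they are. This assumes a data.table release recent enough to support `on`; the `i.` prefix simply makes explicit that newName is taken from the replacements table.

```r
library(data.table)

testData <- data.table(oldName = c("Nu York", "Was DC", "Buston", "Nu York"))
replacements <- data.table(
  oldName = c("Buston", "Nu York", "Was DC"),
  newName = c("Boston", "New York", "Washington DC")
)

# Update join without keys: for every row of replacements, the matching
# rows of testData get their oldName overwritten by reference.
# Names with no match in replacements are left unchanged.
testData[replacements, on = "oldName", oldName := i.newName]

print(testData)
```

Because no key is set, the first row is still the corrected form of "Nu York" and the third is still the corrected form of "Buston".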

If you prefer the original row order, you can add an index column before setting the key and use it to put the rows back in their original order at the end.
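
A minimal sketch of that index idea, reusing the question's data. The helper column name `rowId` is an assumption made for illustration and is not part of the original answer.

```r
library(data.table)

testData <- data.table(oldName = c("Nu York", "Was DC", "Buston", "Nu York"))
replacements <- data.table(
  oldName = c("Buston", "Nu York", "Was DC"),
  newName = c("Boston", "New York", "Washington DC")
)

# Record each row's original position before setkey() reorders the table.
# `rowId` is a hypothetical helper column used only in this sketch.
testData[, rowId := .I]

# Same keyed join and update-by-reference as in the answer.
setkey(testData, oldName)
setkey(replacements, oldName)
testData[replacements, oldName := newName]

# Restore the original order, then drop the helper column.
setorder(testData, rowId)
testData[, rowId := NULL]

print(testData)
```

After the final step the table has a single column again, with the corrected names in the same order as the input rows.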
