How can I sort the words of a variable in an R data.table?
Question
I have messy data in which each record is a string of 1-3 codes.
library(data.table)
data <- data.table(ID = c(1, 2), text = c("3TC ABC DTG", "3TC DTG ABC"))
Unfortunately, the codes are not written in alphabetical order, and I would like them to be. Both records should become
3TC ABC DTG
I tried experimenting with splitting the string
data[, c("text1", "text2", "text3") := tstrsplit(text, " ", fixed = TRUE)]
but I cannot find a way to sort and recombine these three columns.
I also thought about reshaping, but my dcast seems to have trouble:
data_long <- melt(data,
                  id.vars = c("ID"),
                  measure.vars = c("text1", "text2", "text3"),
                  na.rm = TRUE)
result <- dcast(data,
                ID ~ variable,
                function (x) paste(x, collapse = " "))
Can this be solved?
Recommended answer
You were close. Try
data[, text_new := unlist( lapply( strsplit( text, " " ),
function(x) paste0( sort(x), collapse = " "))) ]
   ID        text    text_new
1:  1 3TC ABC DTG 3TC ABC DTG
2:  2 3TC DTG ABC 3TC ABC DTG
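The same split, sort, and paste steps can also be wrapped in a small helper. The sketch below is only one possible variant, not part of the original answer: the name sort_words is hypothetical, the third row ("DTG 3TC") is made-up example data to show a record with fewer than three codes, and vapply is swapped in for unlist(lapply(...)) so that each input string is guaranteed to yield exactly one output string.

```r
library(data.table)

# Hypothetical helper (not from the original answer): sort the
# separator-delimited words inside each element of a character vector.
sort_words <- function(x, sep = " ") {
  # strsplit returns a list with one character vector of words per element
  parts <- strsplit(x, sep, fixed = TRUE)
  # sort each word vector and glue it back together; vapply enforces
  # exactly one character string per input element
  vapply(parts, function(p) paste(sort(p), collapse = sep), character(1))
}

# Example data; the third row is an invented record with only two codes
data <- data.table(ID = c(1, 2, 3),
                   text = c("3TC ABC DTG", "3TC DTG ABC", "DTG 3TC"))

# Add the sorted version as a new column by reference
data[, text_new := sort_words(text)]
print(data)
```

Because the string is split per row, this works for any number of codes without creating intermediate text1, text2, text3 columns. Two caveats: sort() orders character values according to the session locale, and it drops NA by default, so an NA input would come back as an empty string.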