How can I sort the words of a variable in an R data.table?

Question

I have messy data in which each record is a string containing one to three codes.

library(data.table)
data <- data.table(ID = c(1, 2), text = c("3TC ABC DTG", "3TC DTG ABC"))

Unfortunately, the codes are not written in alphabetical order, and I would like them to be. Both records should become 3TC ABC DTG.

I tried messing around with splitting the string:

data[, c("text1", "text2", "text3") := tstrsplit(text, " ", fixed = TRUE)]

but I cannot find a way to sort and recombine these three columns.

I also thought about reshaping to long format, but then my dcast call runs into trouble:

data_long <- melt(data, 
                  id.vars = c("ID"),
                  measure.vars =  c("text1", "text2", "text3"), 
                  na.rm = TRUE)

result <- dcast(data,
                ID ~ variable,
                function (x) paste(x, collapse = " "))

Can this be solved?

Recommended answer

You were close. Try:

data[, text_new := unlist( lapply( strsplit( text, " " ), 
                                   function(x) paste0( sort(x), collapse = " "))) ]

   ID        text    text_new
1:  1 3TC ABC DTG 3TC ABC DTG
2:  2 3TC DTG ABC 3TC ABC DTG
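
The same idea can be sketched in two other ways, in case one of them fits your workflow better. The first wraps the split/sort/paste step in a small helper (the name sort_words is made up for this example) and uses vapply so the result is guaranteed to be a character vector. The second stays closer to the reshaping attempt in the question: it builds a long table with one word per row, orders it, and pastes the words back together per ID. Both assume the codes are separated by single spaces; the column names word and text_new are also just illustrative.

```r
library(data.table)

data <- data.table(ID = c(1, 2), text = c("3TC ABC DTG", "3TC DTG ABC"))

# Approach 1: hypothetical helper, one sorted string per input string.
# Assumes words are separated by exactly one space.
sort_words <- function(x) {
  vapply(
    strsplit(x, " ", fixed = TRUE),
    function(words) paste(sort(words), collapse = " "),
    character(1)
  )
}

data[, text_new := sort_words(text)]

# Approach 2: long format, one word per row, then order and collapse by ID.
data_long <- data[, .(word = unlist(strsplit(text, " ", fixed = TRUE))), by = ID]
long_result <- data_long[order(word),
                         .(text_new = paste(word, collapse = " ")),
                         keyby = ID]
```

Note that sort() follows the current locale's collation, so strings that mix digits, upper case and lower case may be ordered differently on different machines. If missing values are possible in text, check how you want them handled before relying on either sketch.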
