Get the minimal shared part between the elements of a string vector


Question

I have a character vector:

xx <- c("concord wanderer basic set air snug beige",
        "concord wanderer basic set air snug black noir",
        "concord wanderer basic set air snug blue bleu",
        "concord wanderer basic set air snug brown marron",
        "concord wanderer basic set air snug green vert",
        "concord wanderer basic set air snug grey gris",
        "concord wanderer basic set air snug red rouge",
        "concord wanderer basic set air snug rose")

I am trying to get the minimal shared part between the elements of the vector. For example, here I should get:

"concord wanderer basic set air snug"

xx is the result of a previous process, so I am sure that the elements share a common part. However, the part that differs is not always at the end of the strings.

Using strsplit and table I get this partial solution, but it is a bit tricky and I lose the original word order:

table_x <- table(unlist(strsplit(xx, ' ')))
paste(names(table_x[table_x == max(table_x)]), collapse = ' ')
#[1] "air basic concord set snug wanderer"
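The order is lost because table() returns its names sorted alphabetically. As a rough sketch of how the counting idea could keep the original order, the shared words can be used to filter the words of the first string. The shortened vector and the variable names below are illustrative assumptions, not part of the original post:

```r
# Shortened copy of the vector from the question (illustrative data).
xx <- c("concord wanderer basic set air snug beige",
        "concord wanderer basic set air snug black noir",
        "concord wanderer basic set air snug rose")

words <- strsplit(xx, " ")

# Count each word at most once per string, so a word repeated inside
# one string cannot inflate its count.
counts <- table(unlist(lapply(words, unique)))

# Words that occur in every string (names come back sorted alphabetically).
shared <- names(counts)[counts == length(xx)]

# Restore the original order by filtering the words of the first string.
first <- unique(words[[1]])
ordered_shared <- paste(first[first %in% shared], collapse = " ")
ordered_shared
#[1] "concord wanderer basic set air snug"
```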

I am pretty sure that there is a better solution. I tried agrep and adist, but without much success.

Recommended answer

You could use intersect with Reduce to get the output you want.

paste(Reduce(intersect, strsplit(xx, " ")), collapse=" ")
#[1] "concord wanderer basic set air snug"
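Reduce applies intersect pairwise across the list of word vectors, and the result keeps the word order of the first string. The same one-liner could be wrapped in a small helper, sketched here under the assumption that words are separated by a single space. The name shared_words and the test vector yy are hypothetical, chosen to show a case where the differing word is not at the end:

```r
# shared_words() is a hypothetical helper name, not a base R function.
shared_words <- function(x, sep = " ") {
  # Split each string into words; fixed = TRUE treats sep literally.
  tokens <- strsplit(x, sep, fixed = TRUE)
  # Intersect the word vectors pairwise, left to right.
  paste(Reduce(intersect, tokens), collapse = sep)
}

# Illustrative data: the differing word sits in the middle or at the start.
yy <- c("concord wanderer red basic set",
        "concord wanderer basic blue set",
        "green concord wanderer basic set")

shared_words(yy)
#[1] "concord wanderer basic set"
```

Because intersect works on sets, this returns words common to all strings rather than the longest common contiguous phrase.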
