Extracting unique numbers from string in R

Problem description

I have a list of strings which contain random characters such as:

list=list()
list[1] = "djud7+dg[a]hs667"
list[2] = "7fd*hac11(5)"
list[3] = "2tu,g7gka5"

I'd like to know which numbers are present at least once (unique()) in this list. The solution of my example is:

solution: c(7,667,11,5,2)

If someone has a method that does not consider 11 as "eleven" but as "one and one", it would also be useful. The solution in this condition would be:

solution: c(7,6,1,5,2)

(I found this post on a related subject: Extracting numbers from vectors of strings)

Solution

For the second answer, you can use gsub to remove every character that is not a digit, then split what remains into single characters as follows:

unique(as.numeric(unlist(strsplit(gsub("[^0-9]", "", unlist(ll)), ""))))
# [1] 7 6 1 5 2
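
To make the one-liner above easier to follow, here is a minimal sketch that breaks it into its intermediate steps. It assumes ll holds the three example strings from the question; the names digits_only, single_chars and single_digits are only illustrative and do not appear in the original answer.

```r
# Assumption: ll holds the three example strings from the question.
ll <- list("djud7+dg[a]hs667", "7fd*hac11(5)", "2tu,g7gka5")

# Step 1: flatten the list and delete every character that is not a digit.
digits_only <- gsub("[^0-9]", "", unlist(ll))

# Step 2: split each remaining string into single characters.
single_chars <- unlist(strsplit(digits_only, ""))

# Step 3: convert to numeric and keep the first occurrence of each digit.
single_digits <- unique(as.numeric(single_chars))

print(digits_only)
# [1] "7667" "7115" "275"
print(single_digits)
# [1] 7 6 1 5 2
```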

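For the first answer, strsplit can be used in a similar way, this time splitting on runs of non-digit characters. The empty strings this produces become NA after as.numeric and are dropped by na.omit: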

unique(na.omit(as.numeric(unlist(strsplit(unlist(ll), "[^0-9]+")))))
# [1]   7 667  11   5   2
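
As an alternative that is not part of the original answer, the digit runs can also be extracted directly rather than split around, which avoids creating empty strings and so needs no na.omit. This is a sketch using base R's gregexpr and regmatches, again assuming ll holds the three example strings; the names x, runs and whole_numbers are illustrative.

```r
# Assumption: ll holds the three example strings from the question.
ll <- list("djud7+dg[a]hs667", "7fd*hac11(5)", "2tu,g7gka5")
x <- unlist(ll)

# Find every run of one or more digits in each string and pull the matches out.
runs <- regmatches(x, gregexpr("[0-9]+", x))

# Flatten, convert to numeric and keep the first occurrence of each number.
whole_numbers <- unique(as.numeric(unlist(runs)))

print(whole_numbers)
# [1]   7 667  11   5   2
```

Strings that contain no digits contribute an empty match, so they simply add nothing to the result.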

PS: Don't name your variable list, since there is a built-in function with that name. I've named your data ll.
