Extracting unique numbers from string in R
Problem description
I have a list of strings which contain random characters such as:
list=list()
list[1] = "djud7+dg[a]hs667"
list[2] = "7fd*hac11(5)"
list[3] = "2tu,g7gka5"
I'd like to know which numbers are present at least once (unique()) in this list. The solution of my example is:
solution: c(7,667,11,5,2)
If someone has a method that does not consider 11 as "eleven" but as "one and one", it would also be useful. The solution in this condition would be:
solution: c(7,6,1,5,2)
(I found this post on a related subject: Extracting numbers from vectors of strings)
Solution
For the second variant (each digit treated separately), you can use gsub to remove everything from each string that is not a digit, then split the result into single characters as follows:
unique(as.numeric(unlist(strsplit(gsub("[^0-9]", "", unlist(ll)), ""))))
# [1] 7 6 1 5 2
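The one-liner above can be easier to follow when unpacked into steps. The following is only a sketch of the same approach; the intermediate names (digits_only, single_digits, result_digits) are made up for illustration and are not part of the original answer.

```r
# Sample data from the question, stored as `ll` rather than `list`
ll <- list("djud7+dg[a]hs667", "7fd*hac11(5)", "2tu,g7gka5")

# Step 1: flatten the list to a character vector and drop every non-digit
digits_only <- gsub("[^0-9]", "", unlist(ll))
# digits_only is now c("7667", "7115", "275")

# Step 2: splitting on the empty string yields one element per character
single_digits <- unlist(strsplit(digits_only, ""))

# Step 3: convert to numeric and keep the first occurrence of each digit
result_digits <- unique(as.numeric(single_digits))
print(result_digits)
# [1] 7 6 1 5 2
```

unique() keeps values in order of first appearance, which is why the result is 7 6 1 5 2 rather than sorted.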
For the first variant (runs of digits kept as whole numbers), strsplit can be used similarly, this time splitting on runs of non-digit characters:
unique(na.omit(as.numeric(unlist(strsplit(unlist(ll), "[^0-9]+")))))
# [1] 7 667 11 5 2
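The na.omit() call is needed because strsplit returns a leading empty string whenever an input starts with a non-digit (as "djud7+dg[a]hs667" does), and as.numeric("") is NA. As an alternative sketch, not taken from the original answer, the digit runs can be matched directly with base R's gregexpr and regmatches, which avoids the empty strings altogether. The names x, runs and result_numbers are illustrative.

```r
# Sample data from the question, stored as `ll` rather than `list`
ll <- list("djud7+dg[a]hs667", "7fd*hac11(5)", "2tu,g7gka5")
x <- unlist(ll)

# gregexpr finds every run of one or more digits in each string;
# regmatches pulls out the matched substrings as a list of character vectors
runs <- regmatches(x, gregexpr("[0-9]+", x))

# Flatten, convert to numeric, and keep the first occurrence of each number
result_numbers <- unique(as.numeric(unlist(runs)))
print(result_numbers)
# [1] 7 667 11 5 2
```

Both approaches give the same numbers for this data; matching the digits rather than splitting on the non-digits simply removes the need to clean up NA values afterwards.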
PS: don't name your variable list (as there's an inbuilt function list). I've named your data as ll.