R strsplit with multiple unordered split arguments?


Question

Given a character string

test_1<-"abc def,ghi klm"
test_2<-"abc, def ghi klm"

I would like to obtain

"abc"
"def"
"ghi"

However, using strsplit, one must know the order of the splitting values in the string, as strsplit uses the first value to do the first split, the second to do the second... and then recycles.

But this does not:

strsplit(test_1, c(",", " "))
strsplit(test_2, c(" ", ","))

strsplit(test_2, split=c("[:punct:]","[:space:]"))[[1]]

I am looking to split the string wherever I find any of my splitting values in a single step.

Recommended answer

Actually, strsplit accepts regular expressions (grep-style patterns) as well, so the alternatives can be OR'ed in a single pattern. (A comma is not a regex metacharacter, so escaping it is optional; if it is escaped, the backslash must be doubled inside an R string literal, as in "\\,". Likewise, using "\\s" instead of a literal space is more a matter of readability than of necessity.):

> strsplit(test_1, "\\, |\\,| ")  # three possibilities OR'ed
[[1]]
[1] "abc" "def" "ghi" "klm"

> strsplit(test_2, "\\, |\\,| ")
[[1]]
[1] "abc" "def" "ghi" "klm"
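A single OR'ed pattern is needed because a vector-valued split is not applied one separator after another within the same string. According to the strsplit documentation, a split of length greater than 1 is recycled along x, so element i of x is paired with element i of split. The sketch below illustrates this; the vector x and the result names are made up for the example.

```r
test_1 <- "abc def,ghi klm"

# x has length 1, so only the first element of split (",") is used here.
single <- strsplit(test_1, c(",", " "))[[1]]
print(single)

# With two strings, the first is split on "," and the second on " ".
x <- c("a,b c", "a,b c")
paired <- strsplit(x, c(",", " "))
print(paired)
```

In other words, the order of the separators inside each string does not matter; what matters is which element of x each separator is paired with.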

Without including both "\\, " (a comma followed by a space) and "\\," as alternatives, you would have gotten some empty-string ("") values. It might have been clearer if I had written:

> strsplit(test_2, "\\,\\s|\\,|\\s")
[[1]]
[1] "abc" "def" "ghi" "klm"
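To see why the two-character alternative matters, the following minimal sketch compares a pattern that only knows the single-character separators with one that also includes the comma-plus-space pair. The commas are left unescaped here, and the names naive and combined are illustrative only.

```r
test_2 <- "abc, def ghi klm"

# Comma OR space, one character at a time: the ", " in test_2 counts as
# two separate separators, which leaves an empty field between them.
naive <- strsplit(test_2, ",| ")[[1]]
print(naive)

# Adding ", " as its own alternative lets both characters be consumed
# as one separator, so no empty field is produced.
combined <- strsplit(test_2, ", |,| ")[[1]]
print(combined)
```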

@Fojtasek is quite right: using a character class often simplifies the task because it creates an implicit logical OR:

> strsplit(test_2, "[, ]+")
[[1]]
[1] "abc" "def" "ghi" "klm"

> strsplit(test_1, "[, ]+")
[[1]]
[1] "abc" "def" "ghi" "klm"
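The same idea can be written with the POSIX classes tried in the question. As a hedged sketch: classes such as [:punct:] and [:space:] are meant to appear inside a bracket expression, so they need an outer pair of brackets, and the trailing + treats a run of separators as one. Note that [:punct:] covers all punctuation, not only commas, so this variant would also split on characters such as hyphens or apostrophes. The variable names are illustrative.

```r
test_1 <- "abc def,ghi klm"
test_2 <- "abc, def ghi klm"

# Outer brackets form the bracket expression; the inner [:...:] are the classes.
pattern <- "[[:punct:][:space:]]+"

res_1 <- strsplit(test_1, pattern)[[1]]
res_2 <- strsplit(test_2, pattern)[[1]]
print(res_1)
print(res_2)

# strsplit is vectorised over x, so both strings can be handled in one call.
both <- strsplit(c(test_1, test_2), pattern)
print(both)
```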
