R strsplit before ( and after ) keeping both delimiters

Question

I have a string that looks like the following:

x <- "01(01)121210(01)0001"

I want to split this into a vector so that I get the following:

[1] "0" "1" "(01)" "1" "2" "1" "2" "1" "0" "(01)" "0" "0" "0" "1"

The brackets could also be [ ] or { } instead of ( ), and the number of digits between the brackets can be 2 or more.

I've been trying to do this by separating on the brackets first:

unlist(strsplit(x, "(?<=[\\]\\)\\}])", perl=TRUE))
[1] "01(01)" "121210(01)" "0001"

or

unlist(strsplit(x, "(?<=[\\[\\(\\{])", perl=TRUE))
[1] "01(" "01)121210(" "01)0001"

but I can't find a way to combine the two together. Then, I was hoping to split the elements not containing the brackets.

I'd be really grateful if someone could help me out with this or knows of a more elegant way to do it.

Many thanks!

Recommended answer

Here is another way:

unlist(strsplit(x, '\\([^)]*\\)(*SKIP)(*F)|(?=)', perl=TRUE))
# [1] "0"    "1"    "(01)" "1"    "2"    "1"    "2"    "1"    "0"    "(01)" "0"    "0"    "0"    "1"

\\([^)]*\\) matches a parenthesized group: an opening parenthesis, any run of characters other than ), and the closing parenthesis. (*SKIP)(*F) then tells the regular expression engine to fail on that match and, having found it, not to re-test that part of the string with the alternative pattern on the other side of the |. That alternative is (?=), an empty lookahead, which matches the empty string between characters, so every character outside the parentheses is split off on its own.
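
The question also allows [ ] and { } as brackets, while the pattern above only skips ( ). As a rough sketch, and not part of the original answer, the same (*SKIP)(*F) idea can be extended by swapping the literal parentheses for character classes. The names x2, pat, split_tokens and match_tokens are made up for illustration, x2 is a hypothetical input, and the pattern assumes the brackets are never nested. A gregexpr/regmatches variant is included as a cross-check, since it does not depend on how strsplit handles empty matches.

```r
x  <- "01(01)121210(01)0001"
x2 <- "01[012]3{45}6"   # hypothetical input using the other bracket types

# Same idea as the answer: skip over any bracketed group, split everywhere else.
# [(\[{] is any opening bracket, [^)\]}]* is its contents, [)\]}] is any closing bracket.
pat <- "[(\\[{][^)\\]}]*[)\\]}](*SKIP)(*F)|(?=)"

split_tokens <- function(s) unlist(strsplit(s, pat, perl = TRUE))

# Alternative technique: instead of splitting, extract each token directly,
# where a token is either a whole bracketed group or a single character.
match_tokens <- function(s) {
  regmatches(s, gregexpr("[(\\[{][^)\\]}]*[)\\]}]|.", s, perl = TRUE))[[1]]
}

print(split_tokens(x))
print(split_tokens(x2))
print(match_tokens(x2))
```

Because the opening and closing classes are independent, this sketch would also accept mismatched pairs such as "(01]"; if that matters, one alternative per bracket type (for example \\([^)]*\\)|\\[[^\\]]*\\]|\\{[^}]*\\}) is stricter.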
