Splitting words with brackets
Problem description
I've got some strings to split. They are mainly words, but some words
are inside a pair of brackets and should be treated as one unit. I'd
prefer to use re.split, but haven't written a working pattern after hours
of work.
Example:
"a (b c) d [e f g] h i"
should be split into
["a", "(b c)", "d", "[e f g]", "h", "i"]
As speed is a factor to consider, it would be best if a single-line
regular expression could handle this. I tried this but failed:
re.split(r"(?![\(\[].*?)\s+(?!.*?[\)\]])", s). It works for "(a b) c"
but not for "a (b c)" :(
Any hints?
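One way to see why the attempt behaves as described is to run it on both inputs. The sketch below copies the pattern from the question unchanged; the names ATTEMPT and split_attempt are made up for illustration. The trailing lookahead rejects any whitespace that has a closing bracket anywhere to its right, not only whitespace that sits inside brackets, and the leading lookahead is tested where the next character is whitespace, so it never rejects anything.

```python
import re

# The asker's pattern, copied verbatim from the question.
# ATTEMPT and split_attempt are hypothetical names used only here.
ATTEMPT = re.compile(r"(?![\(\[].*?)\s+(?!.*?[\)\]])")

def split_attempt(s):
    """Split s on whitespace using the asker's original pattern."""
    return ATTEMPT.split(s)

# Reported as working: the only space with no ")" after it is the one before "c".
print(split_attempt("(a b) c"))

# Reported as failing: every space still has a ")" somewhere to its right,
# so no split point is accepted and the string comes back whole.
print(split_attempt("a (b c)"))
```

This suggests that lookaheads alone cannot tell "inside a bracket" from "before a later bracket", which is why the answers in the thread match tokens instead of splitting on separators.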
Recommended answer
re.findall(r'\([^\)]*\)|\[[^\]]*|\S+', s)
er,
....|\[[^\]]*\]|...
^_^
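As a quick check of the correction, the sketch below runs both versions of the pattern on the example string from the question. The variable names first and fixed are illustrative only. Without the closing \] in the second alternative, the square-bracket group stops just before "]" and the bracket is picked up as a separate token by \S+.

```python
import re

s = "a (b c) d [e f g] h i"

# Pattern as first posted: the second alternative has no closing \].
first = re.findall(r'\([^\)]*\)|\[[^\]]*|\S+', s)

# Pattern with the correction applied: ...|\[[^\]]*\]|...
fixed = re.findall(r'\([^\)]*\)|\[[^\]]*\]|\S+', s)

print(first)  # the "[e f g" group and "]" come out as two tokens
print(fixed)  # matches the list requested in the question
```

Using re.findall to match tokens, rather than re.split to match separators, keeps the whole thing to one expression, which fits the asker's request for a single-line pattern.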
faulkner wrote:
re.findall(r'\([^\)]*\)|\[[^\]]*|\S+', s)
Sorry, I forgot to give a limitation: if a letter is next to a bracket,
they should be considered as one word, i.e.:
"a(b c) d" becomes ["a(b c)", "d"]
because there is no blank between "a" and "(".
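The thread as captured here does not include a pattern for this extra rule. The following is only a sketch of one possible approach, not something posted by either participant: treat a word as a run of bracketed groups and single non-space characters, so that a bracket group glued to a letter stays in the same token. The names TOKEN and split_words are hypothetical, and the sketch assumes brackets are not nested.

```python
import re

# Hypothetical pattern, not from the thread. A token is one or more of:
#   a (...) group, a [...] group, or a single non-whitespace character.
# Whitespace inside a bracket group is consumed by that group, so it does
# not end the token; whitespace anywhere else does.
TOKEN = re.compile(r'(?:\([^)]*\)|\[[^\]]*\]|\S)+')

def split_words(s):
    """Return the whitespace-separated words of s, keeping bracket groups whole."""
    return TOKEN.findall(s)

print(split_words("a(b c) d"))
print(split_words("a (b c) d [e f g] h i"))
```

Because the group is non-capturing, re.findall returns the full matches. An unclosed bracket falls through to the \S alternative and is split on whitespace like ordinary text.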