Splitting words with brackets

Problem description

I've got some strings to split. They are mainly words, but some words
are inside a pair of brackets and should be treated as one unit. I would
prefer to use re.split, but I haven't managed to write a working pattern
after hours of work.

Example:

"a (b c) d [e f g] h i"
should be split into
["a", "(b c)", "d", "[e f g]", "h", "i"]

As speed is a factor to consider, it would be best if a single-line
regular expression could handle this. I tried the following, but it failed:
re.split(r"(?![\(\[].*?)\s+(?!.*?[\)\]])", s). It works for "(a b) c"
but not for "a (b c)" :(

Any hints?
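
The behaviour of the attempted pattern can be reproduced with a short script. The first lookahead is tested where `\s+` begins, so the next character is always whitespace and that check never rejects anything. The second lookahead rejects any whitespace that has a closing bracket anywhere later in the string, which is why "a (b c)" is never split. The names `ATTEMPT` and `attempt_split` are illustrative only and do not come from the thread.

```python
import re

# The pattern from the question, unchanged.
ATTEMPT = r"(?![\(\[].*?)\s+(?!.*?[\)\]])"

def attempt_split(s):
    """Split s with the question's pattern (hypothetical helper name)."""
    return re.split(ATTEMPT, s)

# Whitespace after ")" has no closing bracket later on, so it splits there.
print(attempt_split("(a b) c"))   # ['(a b)', 'c']

# Every whitespace run here is followed by ")" somewhere, so nothing splits.
print(attempt_split("a (b c)"))   # ['a (b c)']
```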

Recommended answer

re.findall(r'\([^\)]*\)|\[[^\]]*|\S+', s)


er,
....|\[[^\]]*\]|...
^_^
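
As a quick check, the pattern with the closing `\]` restored can be run on the example string from the question. This is a minimal sketch; the names `TOKEN` and `split_words` are illustrative and not from the thread.

```python
import re

# Pattern from the thread with the missing "\]" restored:
# a (...) group, a [...] group, or a run of non-whitespace characters.
TOKEN = re.compile(r'\([^\)]*\)|\[[^\]]*\]|\S+')

def split_words(s):
    """Return the bracketed groups and plain words found in s, in order."""
    return TOKEN.findall(s)

print(split_words("a (b c) d [e f g] h i"))
# ['a', '(b c)', 'd', '[e f g]', 'h', 'i']
```

Without the closing `\]`, the second alternative stops just before the bracket, so the `]` is picked up by `\S+` as a separate item.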


faulkner wrote:
re.findall(r'\([^\)]*\)|\[[^\]]*|\S+', s)



Sorry, I forgot to mention a constraint: if a letter is next to a bracket,
they should be treated as one word, i.e.:
"a(b c) d" becomes ["a(b c)", "d"]
because there is no blank between "a" and "(".
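
The thread as captured here does not include a reply to this extra constraint. One possible approach, offered only as a sketch, is to treat a word as a run of bracketed groups and non-whitespace characters with no whitespace between them. The pattern and the names `WORD` and `split_words` are assumptions of this example, not something posted in the thread, and nested brackets are not handled.

```python
import re

# One word = one or more of the following, with no whitespace in between:
#   a (...) group, a [...] group, or any single non-whitespace character.
# The bracket alternatives come first so a complete group is consumed
# whole, including the blanks inside it.
WORD = re.compile(r'(?:\([^)]*\)|\[[^\]]*\]|\S)+')

def split_words(s):
    """Split s into words, keeping bracketed groups attached to neighbours."""
    return WORD.findall(s)

print(split_words("a(b c) d"))
# ['a(b c)', 'd']
print(split_words("a (b c) d [e f g] h i"))
# ['a', '(b c)', 'd', '[e f g]', 'h', 'i']
```

Because this is still a single compiled pattern used with `findall`, it stays close to the one-line regular expression the question asked for.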

