Why does this regex result in four items?

Question

I want to split a string on whitespace, ->, =>, or any of those surrounded by several spaces, so that splitting each of the following strings gives exactly two items, she and he:
"she he", "she he", "she he ", "she he ", "she->he", "she ->he", "she=>he", "she=> he", " she-> he ", " she => he \n"

I tried this:

re.compile("(?<!^)((\\s*[-=]>\\s*)|[\\s+\t])(?!$\n)(?=[^\s])").split(' she  -> he \n')

What I get is a list with four items: [' she', '  -> ', '  -> ', 'he \n'].

And for this input:

re.compile("(?<!^)((\\s*[-=]>\\s*)|[\\s+\t])(?!$\n)(?=[^\s])").split('she he')

I get: ['she', ' ', None, 'he'].

Why are there four items? And how can I get only two without the middle two?

Recommended answer

If you can simply strip your input string first, then from your description all you need is to split the words on \s+, \s*->\s*, or \s*=>\s*.
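As for why the original attempt returns four items: when the pattern passed to re.split contains capturing groups, the text matched by every group is included in the result, and a group that did not take part in the match shows up as None. The pattern in the question has two nested groups, so each separator adds two extra entries. The following is a minimal sketch of that behaviour using a simplified version of the question's pattern (the lookarounds are left out, and the variable names are only illustrative):

```python
import re

# Capturing groups: the matched separator is returned along with the pieces.
capturing = re.compile(r"((\s*[-=]>\s*)|\s)")
# Non-capturing groups (?:...): only the pieces between separators are returned.
non_capturing = re.compile(r"(?:(?:\s*[-=]>\s*)|\s)")

with_groups = capturing.split("she->he")
without_groups = non_capturing.split("she->he")
with_groups_space = capturing.split("she he")

print(with_groups)        # ['she', '->', '->', 'he']
print(without_groups)     # ['she', 'he']
print(with_groups_space)  # ['she', ' ', None, 'he']
```

In the last call the inner group never matches because the separator is a plain space, which is where the None comes from.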

So here is my solution:

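import re

# Split on an arrow (-> or =>) with optional surrounding whitespace,
# or on a run of whitespace. No capturing groups, so separators are dropped.
p = re.compile(r'\s*[-=]>\s*|\s+')
input1 = "she he"
input2 = " she  -> he \n".strip()  # strip first so no empty items appear at the ends

print(p.split(input1))
print(p.split(input2))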

Your output would be just 'she' and 'he':

['she', 'he']
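To check the approach against the kinds of inputs listed in the question, the strip-then-split step can be wrapped in a small helper. This is only a sketch: the name split_words is hypothetical, and the exact spacing inside the sample strings is an assumption, since the original examples may have contained different runs of spaces.

```python
import re

# Same separator as in the answer: an arrow with optional surrounding
# whitespace, or a run of whitespace.
SEPARATOR = re.compile(r"\s*[-=]>\s*|\s+")


def split_words(text):
    """Strip surrounding whitespace, then split on the separator."""
    return SEPARATOR.split(text.strip())


# Sample inputs modelled on the question; the spacing is assumed.
samples = [
    "she he", "she  he", "she he ", "she  he ",
    "she->he", "she  ->he", "she=>he", "she=>  he",
    " she->  he ", " she  =>  he \n",
]

results = [split_words(s) for s in samples]
for sample, result in zip(samples, results):
    print(repr(sample), result)
```

Each sample should come back as the two items 'she' and 'he'. Stripping first matters: without it, leading or trailing whitespace would itself match \s+ and produce empty strings at the ends of the list.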
