Python Regex for hyphenated words


Question

I'm looking for a regex to match hyphenated words in Python.

The closest I've managed to get is: '\w+-\w+[-\w+]*'

import re

text = "one-hundered-and-three- some text foo-bar some--text"
hyphenated = re.findall(r'\w+-\w+[-\w+]*', text)

which returns the list ['one-hundered-and-three-', 'foo-bar'].

This is almost perfect, except for the trailing hyphen after 'three'. I only want an additional hyphen if it is followed by a word. That is, instead of '[-\w+]*' I need something like '(-\w+)*', which I thought would work but doesn't: it returns ['-three', '']. In other words, I want something that matches |word, followed by a hyphen, followed by a word, followed by hyphen_word zero or more times|.

Recommended answer

Try this:

re.findall(r'\w+(?:-\w+)+',text)

Here we consider a hyphenated word to be:

  • one or more word characters
  • followed by one or more repetitions of:
    • a single hyphen
    • followed by one or more word characters
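
As a rough illustration of why the '(-\w+)*' attempt in the question returned ['-three', ''], the sketch below compares it with the non-capturing version. When a pattern contains a capturing group, re.findall returns the group's contents rather than the whole match, and a repeated group only keeps its last repetition. The variable names captured, hyphenated, and whole are made up for this example; the finditer variant is an alternative not mentioned in the answer.

```python
import re

text = "one-hundered-and-three- some text foo-bar some--text"

# Capturing group: findall returns what the group captured, not the full match.
# A repeated group keeps only its last repetition ('' if it never matched).
captured = re.findall(r'\w+-\w+(-\w+)*', text)
print(captured)    # ['-three', '']

# Non-capturing group (?:...): findall returns the whole match instead.
hyphenated = re.findall(r'\w+(?:-\w+)+', text)
print(hyphenated)  # ['one-hundered-and-three', 'foo-bar']

# Alternative: keep the capturing group but read the full match via group(0).
whole = [m.group(0) for m in re.finditer(r'\w+-\w+(-\w+)*', text)]
print(whole)       # ['one-hundered-and-three', 'foo-bar']
```

Note that 'some--text' is not matched by either pattern, since each hyphen must be directly followed by at least one word character.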
