Python re.split() to split by spaces, commas, and periods, but not in cases like 1,000 or 1.50
Question
I want to use Python's re.split() to split a string into individual words by spaces, commas, and periods. But I don't want "1,200" to be split into ["1", "200"], or "1.2" to be split into ["1", "2"].
Example
l = "one two 3.4 5,6 seven.eight nine,ten"
The result should be ["one", "two", "3.4", "5,6", "seven", "eight", "nine", "ten"]
Answer
Use a negative lookahead and a negative lookbehind:
>>> import re
>>> s = "one two 3.4 5,6 seven.eight nine,ten"
>>> re.split(r'\s|(?<!\d)[,.](?!\d)', s)
['one', 'two', '3.4', '5,6', 'seven', 'eight', 'nine', 'ten']
In other words, always split on \s (whitespace), but split on a comma or period only when it is neither preceded by a digit, (?<!\d), nor followed by one, (?!\d).
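As a minimal sketch, the pattern above can be wrapped in a small helper. This assumes Python 3; the names WORD_SEP and split_words are illustrative and do not come from the original answer. The pattern is written as a raw string so that \s and \d reach the regex engine unchanged.

```python
import re

# Split on whitespace, or on a comma/period that has no digit on either side.
# WORD_SEP and split_words are hypothetical names used only for this sketch.
WORD_SEP = re.compile(r"\s|(?<!\d)[,.](?!\d)")

def split_words(text):
    """Split text into tokens, keeping numbers such as 3.4 or 1,200 intact."""
    return WORD_SEP.split(text)

print(split_words("one two 3.4 5,6 seven.eight nine,ten"))
# ['one', 'two', '3.4', '5,6', 'seven', 'eight', 'nine', 'ten']
print(split_words("1,200 items"))
# ['1,200', 'items']
```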
EDIT: As per @verdesmarald's comment, you may want to use the following instead:
>>> s = "one two 3.4 5,6 seven.eight nine,ten,1.2,a,5"
>>> print(re.split(r'\s|(?<!\d)[,.]|[,.](?!\d)', s))
['one', 'two', '3.4', '5,6', 'seven', 'eight', 'nine', 'ten', '1.2', 'a', '5']
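This splits "1.2,a,5" into ["1.2", "a", "5"], whereas the first pattern leaves it as a single token because each comma has a digit on one side.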
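The difference between the two patterns can be checked side by side. The sketch below assumes Python 3; the names STRICT and RELAXED are illustrative labels, not terms from the answer. It also shows one possible way to drop the empty strings that either pattern produces when a separator is followed by whitespace, as in "one, two".

```python
import re

# Both patterns come from the answer; the variable names are hypothetical.
# STRICT: a comma/period splits only if there is no digit on either side.
STRICT = re.compile(r"\s|(?<!\d)[,.](?!\d)")
# RELAXED: a comma/period splits if at least one side is not a digit.
RELAXED = re.compile(r"\s|(?<!\d)[,.]|[,.](?!\d)")

sample = "1.2,a,5"
print(STRICT.split(sample))   # ['1.2,a,5']
print(RELAXED.split(sample))  # ['1.2', 'a', '5']

# A comma followed by a space yields an empty string between the two
# separators; filtering falsy items is one simple way to discard it.
tokens = [t for t in RELAXED.split("one, two") if t]
print(tokens)                 # ['one', 'two']
```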