How to split a line at a non-printing ASCII character in Python

Question

How can I split a line in Python at a non-printing ASCII character (such as the long minus sign, hex 0x97, octal 227)? I don't need the character itself. The information after it will be saved as a variable.

Recommended answer

You can use re.split.

>>> import re
>>> re.split(r'\W+', 'Words, words, words.')  # raw string keeps \W as a regex escape
['Words', 'words', 'words', '']

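Adjust the pattern so that it matches the characters to split on. With a negated character class such as [^...], you list the characters you want to keep, and everything else becomes a separator.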
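As a rough sketch of that adjustment in Python 3, the pattern below treats everything outside printable ASCII (space through tilde) as a separator. The names `NON_PRINTABLE` and `split_at_non_printable` are made up for this illustration, and the choice of "printable" range is an assumption you may need to change (for example, to keep tabs).

```python
import re

# Negated class: anything that is NOT in the range space (0x20) to tilde (0x7e).
# The trailing + collapses a run of such characters into one separator.
NON_PRINTABLE = re.compile(r'[^\x20-\x7e]+')

def split_at_non_printable(line):
    """Split line at every run of characters outside printable ASCII."""
    return NON_PRINTABLE.split(line)

# '\x97' is the character from the question; it is dropped from the result.
parts = split_at_non_printable('key\x97value')
print(parts)
```

Note that this class also treats newlines and tabs as separators, so strip the line ending first if that matters.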

See also: Stripping non-printable characters from a string in Python

Example (with the long minus):

>>> # Python 2 byte string: \xe2\x80\x93 is the UTF-8 encoding of the en dash (U+2013)
>>> s = 'hello – world'
>>> s
'hello \xe2\x80\x93 world'
>>> import re
>>> re.split("\xe2\x80\x93", s)
['hello ', ' world']

Or, the same with unicode:

>>> # \u2013 is the en dash, also called a long dash or long minus
>>> s = u'hello – world'
>>> s
u'hello \u2013 world'
>>> import re
>>> re.split(u"\u2013", s)
[u'hello ', u' world']
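The sessions above are Python 2. For the case in the question, where the raw data contains the single byte 0x97 and only the text after it is needed, a Python 3 sketch might look like the following. It assumes the line is read as bytes and, for the decoding step, that the data is Windows-1252 (where 0x97 is the em dash, U+2014); the helper name `text_after_marker` and the sample data are hypothetical.

```python
def text_after_marker(raw, marker=b'\x97'):
    """Return the bytes after the first marker, or None if the marker is absent."""
    before, sep, after = raw.partition(marker)
    return after if sep else None

raw_line = b'title \x97 details'

# Work directly on bytes: no decoding needed, the marker itself is discarded.
info = text_after_marker(raw_line)

# Alternatively, decode first (assuming Windows-1252) and split the text.
decoded = raw_line.decode('cp1252')
head, _, tail = decoded.partition('\u2014')

print(info)
print(tail)
```

`partition` is used here instead of `re.split` because only the first occurrence matters and no pattern matching is required; `re.split` remains the better fit when several different characters should act as separators.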
