Python: split text by quotes and spaces

Problem description

I have the following text:

text = 'This is "a simple" test'

I need to split it by both quotes and spaces, so that each quoted section stays together as one item, resulting in:

res = ['This', 'is', '"a simple"', 'test']

But with str.split() I can use either quotes or spaces as the delimiter, not both. Is there a built-in function that handles multiple delimiters?

Recommended answer

You can use shlex.split, which is handy for parsing quoted strings:

>>> import shlex
>>> text = 'This is "a simple" test'
>>> shlex.split(text, posix=False)
['This', 'is', '"a simple"', 'test']

Running it in non-POSIX mode keeps the quotes in the split result. By default, posix is set to True, which strips them:

>>> shlex.split(text)
['This', 'is', 'a simple', 'test']
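
The two modes can be wrapped in a small helper so the caller chooses whether the quotes are kept. This is only a sketch: the name split_quoted and the keep_quotes flag are made up for illustration and are not part of shlex.

```python
import shlex

def split_quoted(text, keep_quotes=True):
    """Split on whitespace, treating each quoted run as a single token.

    split_quoted and keep_quotes are hypothetical names. keep_quotes=True
    selects non-POSIX mode, which leaves the quote characters in the
    tokens; keep_quotes=False selects the default POSIX mode, which
    strips them.
    """
    return shlex.split(text, posix=not keep_quotes)

text = 'This is "a simple" test'
print(split_quoted(text))                     # quotes kept
print(split_quoted(text, keep_quotes=False))  # quotes stripped
```

Non-POSIX mode matches the result asked for in the question, so it is the default here.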

If you have multiple lines of this kind of text, or you are reading from a stream, you can split it with csv.reader, which also drops the quotes from the output:

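import io
import csv

text = 'This is "a simple" test'
s = io.StringIO(text)  # in-memory text stream
f = csv.reader(s, delimiter=' ', quotechar='"')  # one list of fields per line
print(list(f))
# [['This', 'is', 'a simple', 'test']]

The snippet above targets Python 3, where all strings are already unicode, so nothing needs decoding. On Python 2, the byte string has to be decoded first, as in io.StringIO(text.decode('utf8')).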
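
For several lines at once, the same reader can be driven straight from a text stream. The sketch below assumes Python 3; the helper name split_lines and the second sample line are invented for illustration.

```python
import csv
import io

def split_lines(stream):
    """Yield one list of tokens per line read from a text stream.

    split_lines is a hypothetical helper. Spaces separate the fields,
    and double quotes group words into a single field.
    """
    reader = csv.reader(stream, delimiter=' ', quotechar='"')
    for row in reader:
        yield row

# Two sample lines held in an in-memory stream; a file opened in text
# mode could be passed in the same way.
sample = io.StringIO('This is "a simple" test\nanother "quoted part" here\n')
rows = list(split_lines(sample))
print(rows)
# [['This', 'is', 'a simple', 'test'], ['another', 'quoted part', 'here']]
```

Because the reader pulls lines from the stream one at a time, the whole input does not have to be held in memory before splitting.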
