Splitting delimited strings


Problem description

What is the best way to process a text file of delimited strings?
I've got a file where strings are quoted with at-signs, @like this@.
At-signs in the string are represented as doubled @@.

What's the most efficient way to process this? Failing all
else I will split the string into characters and use an FSM,
but it seems that's not very pythonesque.

@rv@ 2 @db.locks@ @//depot/hello.txt@ @mh@ @mh@ 1 1 44
@pv@ 0 @db.changex@ 44 44 @mh@ @mh@ 1118875308 0 @ :@@: :@@@@: @

(this is from a perforce journal file, btw)

Many TIA!
Mark

--
Mark Harrison
Pixar Animation Studios

Solution

Mark Harrison wrote:

> What is the best way to process a text file of delimited strings?
> I've got a file where strings are quoted with at-signs, @like this@.
> At-signs in the string are represented as doubled @@.

Have you taken a look at the csv module yet? No guarantees, but it may
just work. You'd have to set delimiter to ' ' and quotechar to '@'. You
may need to manually handle the double-@ thing, but why don't you see
how close you can get with csv?

> @rv@ 2 @db.locks@ @//depot/hello.txt@ @mh@ @mh@ 1 1 44
> @pv@ 0 @db.changex@ 44 44 @mh@ @mh@ 1118875308 0 @ :@@: :@@@@: @
>
> (this is from a perforce journal file, btw)


--
Paul McNett
http://paulmcnett.com
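
For reference, the csv suggestion above can be tried in a few lines. This is a minimal sketch, assuming Python 3 and the two sample lines from the question; the function name `parse_journal_lines` is hypothetical and not part of any library. With `doublequote=True` (the default), `csv.reader` reads a doubled quote character inside a quoted field as one literal quote character, so the doubled @@ may not need manual handling after all.

```python
import csv

# The two sample lines from the original question.
SAMPLE_LINES = [
    "@rv@ 2 @db.locks@ @//depot/hello.txt@ @mh@ @mh@ 1 1 44",
    "@pv@ 0 @db.changex@ 44 44 @mh@ @mh@ 1118875308 0 @ :@@: :@@@@: @",
]


def parse_journal_lines(lines):
    """Hypothetical helper: split space-delimited, @-quoted lines into fields."""
    reader = csv.reader(
        lines,
        delimiter=" ",     # fields are separated by single spaces
        quotechar="@",     # strings are quoted with at-signs
        doublequote=True,  # a doubled @@ inside quotes is one literal @
    )
    return [row for row in reader]


rows = parse_journal_lines(SAMPLE_LINES)
for row in rows:
    print(row)
```

Every field comes back as a string, so numeric columns such as `2` or `1118875308` would still need converting by the caller.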


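You could use regular expressions... it's an FSM of some kind but it's
faster *g*
check this snippet out:

    import re

    def mysplit(s):
        # Match either a double-quoted run or a run of non-space characters.
        pattern = r'((?:"[^"]*")|(?:[^ ]+))'
        # The capturing group makes re.split keep the matched tokens;
        # the pieces between them are blanks or empty strings.
        tmp = re.split(pattern, s)
        # Drop the blank pieces and strip the surrounding quotes.
        res = [i[1:-1] if i[0] in ('"', "'") else i
               for i in tmp if i.strip()]
        return res

    >>> mysplit('foo bar "baz foo" bar "baz"')
    ['foo', 'bar', 'baz foo', 'bar', 'baz']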
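
The snippet above handles double-quoted strings; the same idea could be adapted to the @-quoted format from the question. The following is only a sketch under assumptions: the pattern and the names `TOKEN` and `split_at_quoted` are made up for illustration, and it assumes that fields are separated by spaces and that a quoted field never spans lines.

```python
import re

# Either an @-quoted field (where @@ stands for a literal @)
# or a bare run of characters that are neither spaces nor at-signs.
TOKEN = re.compile(r"@((?:[^@]|@@)*)@|([^ @]+)")


def split_at_quoted(line):
    """Hypothetical helper: split one line into unquoted field values."""
    fields = []
    for match in TOKEN.finditer(line):
        quoted, bare = match.group(1), match.group(2)
        if quoted is not None:
            # Collapse each doubled @@ back to a single @.
            fields.append(quoted.replace("@@", "@"))
        else:
            fields.append(bare)
    return fields


print(split_at_quoted("@rv@ 2 @db.locks@ @//depot/hello.txt@ @mh@ @mh@ 1 1 44"))
# ['rv', '2', 'db.locks', '//depot/hello.txt', 'mh', 'mh', '1', '1', '44']
```

Using `finditer` rather than `re.split` avoids having to filter out the blank pieces between tokens.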


On Wed, 15 Jun 2005 23:03:55 +0000, Mark Harrison wrote:

> What's the most efficient way to process this? Failing all
> else I will split the string into characters and use an FSM,
> but it seems that's not very pythonesque.

Like this?

>>> s = "@hello@world@@foo@bar"
>>> s.split("@")
['', 'hello', 'world', '', 'foo', 'bar']
>>> s2 = "hello@world@@foo@bar"
>>> s2
'hello@world@@foo@bar'
>>> s2.split("@")
['hello', 'world', '', 'foo', 'bar']



bye
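
A plain `split("@")` turns every doubled @@ into an empty string, so it cannot tell an escaped at-sign from an empty field. The character-by-character FSM that the question mentions as a fallback stays fairly short in Python. Below is a minimal sketch, assuming space-separated fields and @@ as the escape inside quotes; the name `parse_fields` and its parameters are hypothetical.

```python
def parse_fields(line, quote="@", sep=" "):
    """Hypothetical two-state scanner: inside quotes or outside quotes."""
    fields = []
    buf = []           # characters of the field being built
    in_quotes = False
    started = False    # True once the current field has begun
    i, n = 0, len(line)
    while i < n:
        ch = line[i]
        if in_quotes:
            if ch == quote:
                if i + 1 < n and line[i + 1] == quote:
                    buf.append(quote)  # doubled quote: one literal quote
                    i += 1             # skip the second quote character
                else:
                    in_quotes = False  # closing quote
            else:
                buf.append(ch)
        elif ch == quote:
            in_quotes = True           # opening quote starts a field
            started = True
        elif ch == sep:
            if started:                # separator ends the current field
                fields.append("".join(buf))
                buf = []
                started = False
        else:
            buf.append(ch)             # bare (unquoted) character
            started = True
        i += 1
    if started:
        fields.append("".join(buf))    # flush the last field
    return fields


print(parse_fields("@hello@ world @@ @a@@b@"))
# ['hello', 'world', '', 'a@b']
```

Because quoted fields set `started` even when they are empty, an empty string written as @@ is kept as a field instead of being dropped.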

