Parsing a search string


Problem description



Happy new year! Since I have run out of alcohol, I'll ask a question that I
haven't really worked out an answer for yet. Is there an elegant way to turn
something like:

    moo cow "farmer john" -zug

into:

    ['moo', 'cow', 'farmer john'], ['zug']

I'm trying to parse a search string so I can use it for SQL WHERE constraints,
preferably without horrifying regular expressions. Uhh yeah.

From 2005,
Freddie

Solution

How ,
I just posted on something similar earlier ;)
OK, first of all you might want to try shlex; it is in the standard
library.
If you don't know what cStringIO is, don't worry about it; it is just there
to give a file-like object to pass to shlex.
If you have a file, just pass it in opened.
example: a = shlex.shlex(open('mytxt.txt', 'r'))

py> import shlex
py> import cStringIO
py> d = cStringIO.StringIO()
py> d.write('moo cow "farmer john" -zug')
py> d.seek(0)
py> a = shlex.shlex(d)
py> a.get_token()
'moo'
py> a.get_token()
'cow'
py> a.get_token()
'"farmer john"'
py> a.get_token()
'-'
py> a.get_token()
'zug'
py> a.get_token()
''
py> # OK, we try again; this time we add - to the valid word chars
py> # so that -zug comes back grouped as a single token.
py> d.seek(0)
py> a = shlex.shlex(d)
py> a.wordchars += '-'  # add the hyphen
py> a.get_token()
'moo'
py> a.get_token()
'cow'
py> a.get_token()
'"farmer john"'
py> a.get_token()
'-zug'
py> a.get_token()
''

Hth,
M.E.Farmer
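
The session above uses cStringIO, which only exists in Python 2. As a rough sketch of the same idea on Python 3, assuming io.StringIO as the file-like object (the helper name tokenize and its extra_wordchars parameter are made up for this illustration), something like this should give the same tokens:

```python
import io
import shlex


def tokenize(text, extra_wordchars=""):
    """Return the tokens shlex yields for text.

    Non-POSIX mode (the default for shlex.shlex) keeps the quotes
    around quoted tokens, as in the session above.
    """
    lexer = shlex.shlex(io.StringIO(text))
    # Extra characters that should count as part of a word, e.g. "-".
    lexer.wordchars += extra_wordchars
    # Iterating a shlex object stops at EOF, so no '' sentinel is returned.
    return list(lexer)


default_tokens = tokenize('moo cow "farmer john" -zug')
hyphen_tokens = tokenize('moo cow "farmer john" -zug', extra_wordchars="-")
```

On Python 3 the string can also be passed to shlex.shlex directly, so the StringIO wrapper is only needed when you want to mirror the file-like usage.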


That's not bad going considering you've only run out of alcohol at 6 in
the morning and *then* ask python questions.

Anyway - you could write a character-by-character parser function that
would do that in a few minutes...
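
A minimal sketch of what such a character-by-character parser might look like, written for Python 3. The function name parse_search and its rules are assumptions made for illustration (only double quotes group words, and a leading - marks a term to exclude); this is not the listquote code:

```python
def parse_search(text):
    """Split text into (include, exclude) lists of search terms."""
    include, exclude = [], []
    current = []        # characters of the term being built
    in_quotes = False   # True while inside "..."
    negated = False     # True if the current term began with '-'
    started = False     # True once the current term has begun

    def flush():
        """Store the finished term (if any) and reset the state."""
        nonlocal current, negated, started
        term = "".join(current)
        if term:
            (exclude if negated else include).append(term)
        current, negated, started = [], False, False

    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes
            started = True
        elif ch.isspace() and not in_quotes:
            flush()
        elif ch == "-" and not started and not in_quotes:
            # A hyphen only negates at the very start of a term,
            # so words like well-known stay in one piece.
            negated = True
            started = True
        else:
            current.append(ch)
            started = True
    flush()  # an unterminated quote simply runs to the end of the text
    return include, exclude


result = parse_search('moo cow "farmer john" -zug')
```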

My 'listquote' module has one - but it splits on commas not whitespace.
Sounds like you're looking for a one-liner though.... regular
expressions *could* do it...............
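
For comparison, here is one way a regular expression could handle it. This is only a sketch: the pattern and the name parse_search_re are assumptions for illustration, and it treats only double quotes as grouping characters:

```python
import re

# An optional leading '-', then either a "double quoted phrase"
# or a run of non-whitespace characters.
TERM = re.compile(r'(-?)(?:"([^"]*)"|(\S+))')


def parse_search_re(text):
    """Split text into (include, exclude) lists of search terms."""
    include, exclude = [], []
    for minus, quoted, bare in TERM.findall(text):
        term = quoted or bare
        if term:  # skip empty phrases such as ""
            (exclude if minus else include).append(term)
    return include, exclude


result_re = parse_search_re('moo cow "farmer john" -zug')
```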

Regards,

Fuzzy
http://www.voidspace.org.uk/atlantib...tml#llistquote


Freddie wrote:

> Happy new year! Since I have run out of alcohol, I'll ask a question that I
> haven't really worked out an answer for yet. Is there an elegant way to turn
> something like:
>
>     moo cow "farmer john" -zug
>
> into:
>
>     ['moo', 'cow', 'farmer john'], ['zug']
>
> I'm trying to parse a search string so I can use it for SQL WHERE constraints,
> preferably without horrifying regular expressions. Uhh yeah.



The shlex approach, finished:

    import shlex

    searchstring = 'moo cow "farmer john" -zug'
    lexer = shlex.shlex(searchstring)
    lexer.wordchars += '-'
    poslist, neglist = [], []
    while True:
        token = lexer.get_token()
        # token is '' on eof
        if not token: break
        # remove quotes
        if token[0] in '"\'':
            token = token[1:-1]
        # skip tokens that are empty once the quotes are gone, e.g. ""
        if not token: continue
        # select in which list to put it
        if token[0] == '-':
            neglist.append(token[1:])
        else:
            poslist.append(token)

regards,
Reinhold
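
Since the original goal was an SQL WHERE constraint, here is a hedged sketch of how the two lists might be turned into one. Everything schema-related is an assumption for illustration: the table notes, the column body, the choice of LIKE matching, and the helper names parse_search and build_where. It uses shlex.split (POSIX mode, which strips the quotes itself) and sqlite3 placeholders so the search terms never get pasted into the SQL text:

```python
import shlex
import sqlite3


def parse_search(text):
    """Split text into (include, exclude) lists of search terms."""
    include, exclude = [], []
    # shlex.split raises ValueError on an unbalanced quote,
    # which includes a stray apostrophe.
    for token in shlex.split(text):
        if token.startswith("-") and len(token) > 1:
            exclude.append(token[1:])
        else:
            include.append(token)
    return include, exclude


def build_where(include, exclude, column="body"):
    """Return (where_sql, params) for a parameterized query.

    column must be a trusted identifier, never user input.
    The % and _ wildcards inside terms are not escaped here.
    """
    clauses = [f"{column} LIKE ?" for _ in include]
    clauses += [f"{column} NOT LIKE ?" for _ in exclude]
    params = [f"%{term}%" for term in include + exclude]
    # With no terms at all, fall back to a condition that is always true.
    return " AND ".join(clauses) or "1", params


# Hypothetical table and rows, only to show the query running.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (body TEXT)")
conn.executemany("INSERT INTO notes VALUES (?)", [
    ("moo cow says farmer john",),
    ("moo cow farmer john and zug",),
    ("cow only",),
])

include, exclude = parse_search('moo cow "farmer john" -zug')
where, params = build_where(include, exclude)
rows = conn.execute(f"SELECT body FROM notes WHERE {where}", params).fetchall()
```

Only the first row matches here: the second contains zug and the third lacks moo.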

