Searching through a string and pulling characters


Problem description





This is similar to my last post, but a little different. Here is what I would
like to do.

Let's say I have a text file. The contents look like this, only there is A
LOT of the same thing.

() A registry mark given by underwriters (as at Lloyd's) to ships in
first-class condition. Inferior grades are indicated by A 2 and A 3.
() The first three letters of the alphabet, used for the whole alphabet.
() In church or chapel style; -- said of compositions sung in the old church
style, without instrumental accompaniment; as, a mass a capella, i. e., a
mass purely vocal.
() Astride; with a part on each side; -- used specif. in designating the
position of an army with the wings separated by some line of demarcation, as
a river or road.

Now, I am talking 1000's of these. I need to do something like this. I will
have a number, and what I want to do is go through this text file, just like
the example. The trick is this: those "()"s are what I need to match, so if
the number is 245 I need to find the 245th () and then get all the text
from after it until the next (). If you have an idea about the best way to
do this I would love your help. If you made it all the way through, thanks!
;)
--
View this message in context: http://www.nabble.com/searching-thro...p19039594.html
Sent from the Python - python-list mailing list archive at Nabble.com.

Solution

On Mon, 18 Aug 2008 13:40:13 -0700 (PDT), Alexnb wrote:

> Now, I am talking 1000's of these. I need to do something like this. I will
> have a number, and what I want to do is go through this text file, just like
> the example. The trick is this: those "()"s are what I need to match, so if
> the number is 245 I need to find the 245th () and then get all the text
> from after it until the next (). If you have an idea about the best way to
> do this I would love your help. If you made it all the way through, thanks!
> ;)

findall comes to mind:

>>> a = """(string1)
... (string2)
... (string3)
... (string4)
... (string5)
... (string6)"""
>>> import re
>>> # Raw string so the backslashes reach the regex engine unchanged.
>>> pat = re.compile(r"(\(.*?\))")

and now let's say you want to get the fourth element:

>>> pat.findall(a)[3]
'(string4)'

To save some memory use finditer (as long as you don't have to search
for too many of these):

>>> for idx, m in enumerate(pat.finditer(a)):
...     if idx == 2:
...         print(m.group())
...
(string3)
>>>


--
Regards,
Wojtek Walczak,
http://www.stud.umk.pl/~wojtekwa/
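
The finditer loop above keeps scanning after it has printed its match. A minimal sketch of stopping at the n-th match, assuming a 0-based index as in the session above; the helper name `nth_match` is made up for illustration and is not part of the original reply:

```python
import re
from itertools import islice

# Same idea as the pattern in the reply: a parenthesised string, non-greedy.
PAT = re.compile(r"\(.*?\)")

def nth_match(text, n):
    """Return the n-th match (0-based) of PAT in text, or None if there are fewer."""
    matches = PAT.finditer(text)
    # islice skips the first n matches lazily; next() takes the one after them.
    m = next(islice(matches, n, None), None)
    return m.group() if m else None

sample = "(string1)\n(string2)\n(string3)\n(string4)\n(string5)\n(string6)"
print(nth_match(sample, 2))   # (string3)
print(nth_match(sample, 10))  # None
```

Because `finditer` is lazy, nothing beyond the requested match is scanned and no list of all matches is built.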


On Mon, 18 Aug 2008 21:43:43 +0000 (UTC), Wojtek Walczak wrote:

> On Mon, 18 Aug 2008 13:40:13 -0700 (PDT), Alexnb wrote:
>
>> Now, I am talking 1000's of these. I need to do something like this. I will
>> have a number, and what I want to do is go through this text file, just like
>> the example. The trick is this: those "()"s are what I need to match, so if
>> the number is 245 I need to find the 245th () and then get all the text
>> from after it until the next (). If you have an idea about the best way to
>> do this I would love your help. If you made it all the way through, thanks!
>> ;)
>
> findall comes to mind:

...forget it, I misread your post :)

--
Regards,
Wojtek Walczak,
http://www.stud.umk.pl/~wojtekwa/
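
The pattern in the first reply matches parenthesised strings, whereas the original post asks for the text between the n-th empty "()" marker and the next one. A minimal sketch of that, assuming the whole file fits in memory and that "()" never occurs inside an entry's own text (neither is confirmed in the thread); `nth_entry` is a hypothetical helper name:

```python
def nth_entry(text, n, marker="()"):
    """Return the text after the n-th marker (1-based), up to the next marker."""
    if n < 1:
        raise ValueError("n is 1-based")
    parts = text.split(marker)
    # parts[0] is whatever precedes the first marker, so entry n is parts[n].
    if n >= len(parts):
        raise IndexError("fewer than %d markers in text" % n)
    return parts[n].strip()

sample = """() A registry mark given by underwriters (as at Lloyd's) to ships in
first-class condition. Inferior grades are indicated by A 2 and A 3.
() The first three letters of the alphabet, used for the whole alphabet.
() Astride; with a part on each side.
"""
print(nth_entry(sample, 2))
# The first three letters of the alphabet, used for the whole alphabet.
```

Splitting on the literal two-character marker leaves "(as at Lloyd's)" alone, since that is not an empty pair of parentheses.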


On Aug 19, 6:40 am, Alexnb <alexnbr...@gmail.com> wrote:

> This is similar to my last post,

Oh, goodie goodie goodie, I love guessing games!

> but a little different. Here is what I would
> like to do.
>
> Let's say I have a text file. The contents look like this, only there is A
> LOT of the same thing.
>
> () A registry mark given by underwriters (as at Lloyd's) to ships in
> first-class condition. Inferior grades are indicated by A 2 and A 3.
> () The first three letters of the alphabet, used for the whole alphabet.
> () In church or chapel style; -- said of compositions sung in the old church
> style, without instrumental accompaniment; as, a mass a capella, i. e., a
> mass purely vocal.
> () Astride; with a part on each side; -- used specif. in designating the
> position of an army with the wings separated by some line of demarcation, as
> a river or road.

This looks like the "values" part of an abbreviation/acronym
dictionary ... what has happened to the "keys" part (A1, ABC, AC, ?
astride?, ...)

Does "()" appear always at the start of a line (perhaps preceded by
some whitespace), or can it appear in the middle of a line?

Are you sure about "A 2" and "A 3"? I would have expected "A2" and
"A3". In other words, is the above an exact copy of some input or have
you re-typed it?

"()" is a strange way of delimiting things ...

OK, here's my guess: You have acquired a database with two tables.
Table K maps e.g. "ABC" to 2. Table V maps 2 to "The first three
letters of the alphabet, used for the whole alphabet." You have used
some utility or done "select '() ' + column2 from V".

> Now, I am talking 1000's of these. I need to do something like this. I will
> have a number, and what I want to do is go through this text file, just like
> the example. The trick is this: those "()"s are what I need to match, so if
> the number is 245 I need to find the 245th () and then get all the text
> from after it until the next (). If you have an idea about the best way to
> do this I would love your help.

The best way to do this is to write a small simple Python script. I
suggest that you try this, and if you have difficulties, post your
attempt here together with a lucid description of the perceived
problem.
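
One possible shape for such a script, reading the file line by line and stopping once the wanted entry is complete. It assumes every entry starts with "()" at the beginning of a line, possibly after whitespace; that is exactly the open question raised above and is not confirmed by the poster. `nth_entry_from_lines` is a hypothetical name:

```python
import io

def nth_entry_from_lines(lines, n, marker="()"):
    """Scan lines once; return the n-th entry (1-based), or None if not found.

    Assumes each entry begins with `marker` at the start of a line.
    """
    if n < 1:
        raise ValueError("n is 1-based")
    count = 0
    collected = []
    for line in lines:
        stripped = line.lstrip()
        if stripped.startswith(marker):
            if count == n:
                break  # reached the marker that follows the wanted entry
            count += 1
            stripped = stripped[len(marker):]
        if count == n:
            collected.append(stripped.strip())
    if count < n:
        return None
    # Join continuation lines into a single string.
    return " ".join(part for part in collected if part)

# io.StringIO stands in for an open file object here.
sample = io.StringIO(
    "() A registry mark given by underwriters (as at Lloyd's) to ships in\n"
    "first-class condition. Inferior grades are indicated by A 2 and A 3.\n"
    "() The first three letters of the alphabet, used for the whole alphabet.\n"
    "() In church or chapel style.\n"
)
print(nth_entry_from_lines(sample, 2))
# The first three letters of the alphabet, used for the whole alphabet.
```

With a real file the same function can be passed the object returned by `open(...)`, since file objects iterate over lines.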

However, searching through a large file (how many MB?) looking for the
nth occurrence of "()" doesn't sound like a good idea after about the
10th time you do it. Perhaps it might be worth the extra effort to
process the text file once and insert the results in a (say) SQLite
database so that later you can do "select column2 from V where
column1 = 245".
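
A rough sketch of that process-once idea using Python's standard `sqlite3` module. The table and column names (V, column1, column2) follow the query quoted above. The in-memory database, the helper names `build_index` and `lookup`, and splitting on the literal "()" are assumptions made for illustration:

```python
import sqlite3

def build_index(text, conn, marker="()"):
    """Split text on marker and store each entry under its ordinal in table V."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS V (column1 INTEGER PRIMARY KEY, column2 TEXT)"
    )
    # Drop whatever precedes the first marker; entries are numbered from 1.
    entries = [part.strip() for part in text.split(marker)[1:]]
    conn.executemany(
        "INSERT INTO V (column1, column2) VALUES (?, ?)",
        enumerate(entries, start=1),
    )
    conn.commit()

def lookup(conn, n):
    """Return the n-th entry, or None if there is no such row."""
    row = conn.execute(
        "SELECT column2 FROM V WHERE column1 = ?", (n,)
    ).fetchone()
    return row[0] if row else None

conn = sqlite3.connect(":memory:")  # a file path would make the index persistent
build_index("() first entry\n() second entry\n() third entry\n", conn)
print(lookup(conn, 2))  # second entry
```

After the one-time load, each lookup is a primary-key query rather than a rescan of the text.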

A really silly question: You say "I will have a number" (e.g. 245);
what is the source or provenance of this ordinal? A random number
generator? Inscription on a ticket passed through a wicket? "select
column2 from K where column1 = 'A1'"? IOW, perhaps you may need to
consider the larger problem.

Cheers,
John

