Multiline regex help

Views: 58 Posted: 2019/6/5 13:41:23 python


Problem description


Hey Folks,

I've got some info in a bunch of files that kind of looks like so:

Gibberish
53
MoreGarbage
12
RelevantInfo1
10/10/04
NothingImportant
ThisDoesNotMatter
44
RelevantInfo2
22
BlahBlah
343
RelevantInfo3
23
Hubris
Crap
34

and so on...

Anyhow, these "fields" repeat several times in a given file (number of
repetitions varies from file to file). The number on the line following the
"RelevantInfo" lines is really what I'm after. Ideally, I would like to have
something like so:

RelevantInfo1 = 10/10/04 # The variable name isn't actually important
RelevantInfo3 = 23 # it's just there to illustrate what info I'm
# trying to snag.

Score[RelevantInfo1][RelevantInfo3] = 22 # The value from RelevantInfo2

Collected from all of the files.

So, there would be several of these "scores" per file and there are a bunch
of files. Ultimately, I am interested in printing them out as a csv file but
that should be relatively easy once they are trapped in my array of doom
<cue evil laughter>.

I've got a fairly ugly "solution" (I am using this term *very* loosely)
using awk and his faithful companion sed, but I would prefer something in
python.

Thanks for your time.

--
McGowan's Madison Avenue Axiom:
If an item is advertised as "under $50", you can bet it's not $19.95.

Recommended answer


Yatima wrote:

Hey Folks,

I've got some info in a bunch of files that kind of looks like so:

Gibberish
53
MoreGarbage
12
RelevantInfo1
10/10/04
NothingImportant
ThisDoesNotMatter
44
RelevantInfo2
22
BlahBlah
343
RelevantInfo3
23
Hubris
Crap
34

and so on...

Anyhow, these "fields" repeat several times in a given file (number of
repetitions varies from file to file). The number on the line following the
"RelevantInfo" lines is really what I'm after. Ideally, I would like to have
something like so:

RelevantInfo1 = 10/10/04 # The variable name isn't actually important
RelevantInfo3 = 23 # it's just there to illustrate what info I'm
# trying to snag.
Here is a way to create a list of [RelevantInfo, value] pairs:

import io

raw_data = '''Gibberish
53
MoreGarbage
12
RelevantInfo1
10/10/04
NothingImportant
ThisDoesNotMatter
44
RelevantInfo2
22
BlahBlah
343
RelevantInfo3
23
Hubris
Crap
34'''
# Wrap the sample text so it can be read line by line, like an open file
raw_data = io.StringIO(raw_data)

data = []
for line in raw_data:
    if line.startswith('RelevantInfo'):
        key = line.strip()
        # The value sits on the line after the label; the '' default
        # guards against a label that happens to be the last line
        value = next(raw_data, '').strip()
        data.append([key, value])

print(data)
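Since the thread title asks about a multiline regex, the same pairs can also be pulled out with the `re` module instead of the line-by-line loop above. This is only a sketch under assumptions: it supposes every label has the form `RelevantInfo` followed by digits, that the value is the entire next line, and that lines end with a plain `\n`. The names `pattern` and `pairs` are illustrative.

```python
import re

# Same sample layout as in the post: a label line followed by its value line.
raw_data = """Gibberish
53
MoreGarbage
12
RelevantInfo1
10/10/04
NothingImportant
ThisDoesNotMatter
44
RelevantInfo2
22
BlahBlah
343
RelevantInfo3
23
Hubris
Crap
34"""

# With re.MULTILINE, ^ and $ match at the start and end of every line,
# so the pattern captures a "RelevantInfo<digits>" line and the line after it.
# Assumption: labels always end in digits and values never span two lines.
pattern = re.compile(r"^(RelevantInfo\d+)[ \t]*\n(.*)$", re.MULTILINE)

# findall returns one (label, value) tuple per match.
pairs = pattern.findall(raw_data)
print(pairs)
```

For files small enough to read in one go, `pattern.findall(f.read())` on an open file `f` would give the same kind of list; for very large files the line-by-line loop avoids holding the whole text in memory.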

Score[RelevantInfo1][RelevantInfo3] = 22 # The value from RelevantInfo2
I'm not sure what you mean by this. Do you want to build a Score dictionary as well?

Kent
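If a Score dictionary is indeed what is wanted, one possible reading of `Score[RelevantInfo1][RelevantInfo3] = 22` is a nested dictionary keyed by the RelevantInfo1 value and then the RelevantInfo3 value, holding the RelevantInfo2 value. The sketch below rests on assumptions the post does not confirm: each record contains all three labels, RelevantInfo3 is the last label of a record, and the second record in the sample is made up for illustration. The function name `collect_scores` and the CSV column order are hypothetical.

```python
import csv
import io
from collections import defaultdict


def collect_scores(lines, scores=None):
    """Add scores[info1][info3] = info2 entries from an iterable of lines."""
    if scores is None:
        scores = defaultdict(dict)
    lines = iter(lines)  # lets next() below consume the value line
    record = {}
    for line in lines:
        key = line.strip()
        if key.startswith("RelevantInfo"):
            record[key] = next(lines, "").strip()
            # Assumption: RelevantInfo3 closes one record.
            if key == "RelevantInfo3":
                if len(record) == 3:  # skip incomplete records
                    info1 = record["RelevantInfo1"]
                    info3 = record["RelevantInfo3"]
                    scores[info1][info3] = record["RelevantInfo2"]
                record = {}
    return scores


# Illustrative input: the first record uses the values from the post,
# the second record is invented to show repetition within a file.
sample = """Gibberish
53
RelevantInfo1
10/10/04
ThisDoesNotMatter
44
RelevantInfo2
22
RelevantInfo3
23
Hubris
RelevantInfo1
10/11/04
RelevantInfo2
17
BlahBlah
343
RelevantInfo3
23
"""

scores = collect_scores(io.StringIO(sample))

# Write one CSV row per score; an in-memory buffer stands in for a real file.
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["RelevantInfo1", "RelevantInfo3", "RelevantInfo2"])
for info1, by_info3 in scores.items():
    for info3, info2 in by_info3.items():
        writer.writerow([info1, info3, info2])
csv_text = out.getvalue()
print(csv_text)
```

To gather scores from a bunch of files, the same `scores` dictionary can be passed back in for each one, for example `with open(path) as f: collect_scores(f, scores)`, where `path` is whatever file name applies. Note that two records with the same RelevantInfo1 and RelevantInfo3 values would overwrite each other in this layout.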

