Mail extraction problem (something's wrong with split methods)

Views: 104 Published: 2019/6/5 10:59:25 python


Problem description

Hello,

I have a little problem and although it's little it's extremely difficult
for me to describe it, but I'll try.
I have written a program which extracts certain portions of my received
e-mail. The content of the e-mail is actually predictable, it has one very
long list of numbers, something looking like this:

[34234,35435,657789,6756735,12312378,09678567,23424]

Of course I cannot manipulate my mail while connected to the POP3 server,
so I decided to transfer the mail locally, write it to a file and then
manipulate it. Another problem is that in e-mails there is a lot of output,
garbage characters and all sorts of nasty things, but somehow I managed
to solve it (to download the e-mail and extract the interesting parts), and here
is how (I'll only show the "interesting parts" part):

temp = [mail.read()]
enc_txt = "\n".join(temp)
begin = enc_txt.find(", '[")+len(", '[")
ending = enc_txt.find("]', ")

enc_txt2 = (enc_txt[begin:ending])
mail.close()
lines = enc_txt2.splitlines()
enc_txt3 = ' '.join([line.strip() for line in lines])
split = re.split(",", enc_txt3)
enc = [int(elem) for elem in split]
enc = map(int, split)

And this code works! But there is a problem! When the list of numbers is
longer than 350 bytes, at the 350th position I don't get a number, but I get
some quotes and commas and strange things. When the list is longer than
700 bytes, this problem occurs twice (actually it does not occur because
the interpreter complains, but there are two mistakes of this type). Is there
a thing I'm missing? Can split methods handle more than 350 bytes of
text to split? What's actually happening?
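
On the narrow question of size: Python's split methods have no 350-byte limit. A minimal sketch (written for Python 3, with a made-up list of numbers rather than data from the e-mail) that splits a string much longer than 350 bytes with both `str.split` and `re.split`:

```python
import re

# Hypothetical data: 500 numbers joined by ", ", far longer than 350 bytes.
numbers = list(range(1000, 1500))
text = ", ".join(str(n) for n in numbers)

parts_str = text.split(",")     # plain str.split
parts_re = re.split(",", text)  # re.split, as used in the post

# int() tolerates surrounding whitespace, so " 1001" converts fine.
decoded = [int(p) for p in parts_str]
```

If a round trip like this comes back clean, the stray quotes are probably already in the text before it is split, rather than being produced by the split itself.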

To make it more clear (because I think you will not understand it
completely) I could upload the errors, but the log is large, so I'll shorten
it.

[6964, 7086, 3211, 7522, 9472, 3265, 3610, 104, 9729, 6706, 8035, 5439,
7142, 360, 677, 1667, 1382, 9417, 4493, 8289, 9613, 3470, 889, 1021, 3381,
3480, 2483, 6579, 8928, 3240, 4437, 5908, 2290, 9587, 866, 202, 859, 2184,
8328, ..........] - the list of numbers 705 bytes long.

When I run the program (with command print split inside my code, to see
what's going on):

['6964', ' 7086', ' 3211', ' 7522', ' 9472', ' 3265', ' 3610', ' 104', '
9729', ' 6706', ' 8035', ' 5439', ' 7142', ' 360', ' 677', ' 1667', '
1382', ' 9417', ' 4493', ' 8289', ' 9613', ' 3470', ' 889', ' 1021', '
3381', ' 3480', ' 2483', ' 6579', ' 8928', ' 3240', ' 4437', ' 5908', '
2290', ' 9587', ' 866', ' 202', ' 859', ' 2184', ' 8328', ..... " 6730'",
" '", ' 6793'...... , " '", " '6573", ' 869'...]

File "OTPAenc_dec.py", line 258, in decr
enc = [int(elem) for elem in split]
ValueError: invalid literal for int(): 6730'
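
The printed list suggests the quote characters are already in the text being split. One way to sidestep them is to pull the numbers out with a regular expression instead of splitting on commas. This is only a sketch, and it assumes the wanted data is nothing but runs of decimal digits. The helper name `extract_numbers` and the sample string are made up for illustration.

```python
import re

def extract_numbers(text):
    """Return every run of decimal digits in text as an int (hypothetical helper)."""
    return [int(match) for match in re.findall(r"\d+", text)]

# Made-up sample imitating a list interrupted by stray quotes and commas.
sample = "6964, 7086, 6730', ' 6793, 6573', '869"
result = extract_numbers(sample)
```

This ignores quotes, commas and whitespace. It would still misread a number that the stray characters cut in two, so it hides the symptom without explaining where the quotes come from.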

Please help me, any help will be appreciated.

Thanks in advance.

Sorry for my bad English and my bad expression style, I really don't know
how to explain it more thoroughly.
