Python IMAP download all attachments

Problem description

I need to iterate over all the mails in a Gmail inbox. I also need to download all the attachments for each mail (some mails have 4-5 attachments). I found some help here: https://stackoverflow.com/a/27556667/8996442

def save_attachments(self, msg, download_folder="/tmp"):
    for part in msg.walk():
        if part.get_content_maintype() == 'multipart':
            continue
        if part.get('Content-Disposition') is None:
            continue

        filename = part.get_filename()
        print(filename)
        att_path = os.path.join(download_folder, filename)
        if not os.path.isfile(att_path):
            fp = open(att_path, 'wb')
            fp.write(part.get_payload(decode=True))
            fp.close()
        return att_path

But it downloads only one attachment per e-mail (though the author of the post mentions that it normally downloads all of them, no?). print(filename) shows me only one attachment. Any idea why?

Recommended answer

As already pointed out in comments, the immediate problem is that return exits the for loop and leaves the function, and you do this immediately when you have saved the first attachment.
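The effect can be seen in isolation with a minimal sketch. The function names first_only and collect_all are hypothetical and exist only for this illustration; they are not part of the original code.

```python
def first_only(items):
    """Mimics the bug: return sits inside the loop body."""
    for item in items:
        return item  # leaves the function during the first iteration


def collect_all(items):
    """Mimics the fix: collect inside the loop, return after it."""
    results = []
    for item in items:
        results.append(item)
    return results  # reached only once the loop has finished
```

Called with two names, first_only hands back only the first one, while collect_all hands back both.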

Depending on what exactly you want to accomplish, change your code so you only return when you have finished all iterations of msg.walk(). Here is one attempt which returns a list of attachment filenames:

import os

def save_attachments(self, msg, download_folder="/tmp"):
    att_paths = []

    for part in msg.walk():
        if part.get_content_maintype() == 'multipart':
            continue
        if part.get('Content-Disposition') is None:
            continue

        filename = part.get_filename()
        # A part can have a Content-Disposition but no filename; skip it,
        # because os.path.join() would fail on None
        if filename is None:
            continue
        # Don't print
        # print(filename)
        att_path = os.path.join(download_folder, filename)
        if not os.path.isfile(att_path):
            # Use a context manager for robustness
            with open(att_path, 'wb') as fp:
                fp.write(part.get_payload(decode=True))
            # Then you don't need to explicitly close
            # fp.close()
        # Append this one to the list we are collecting
        att_paths.append(att_path)

    # We are done looping and have processed all attachments now
    # Return the list of file names
    return att_paths

See the inline comments for explanations of what I changed and why.
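The question also asks how to iterate over every mail in the inbox, which the function above does not cover. The following is a rough sketch of that part using the standard imaplib and email modules. The helper names connect and iter_messages are hypothetical, and the host name imap.gmail.com and the use of a plain login are assumptions about the setup (Gmail may require an app password or OAuth instead).

```python
import email
import imaplib


def connect(host, user, password):
    """Open an IMAP-over-SSL connection and log in (hypothetical helper)."""
    conn = imaplib.IMAP4_SSL(host)  # e.g. "imap.gmail.com" (assumed host)
    conn.login(user, password)
    return conn


def iter_messages(conn, mailbox="INBOX"):
    """Yield every message in the mailbox as a parsed email message."""
    # readonly=True avoids marking the messages as seen
    conn.select(mailbox, readonly=True)
    typ, data = conn.search(None, "ALL")
    if typ != "OK":
        return
    # data[0] is a space-separated byte string of message numbers
    for num in data[0].split():
        typ, msg_data = conn.fetch(num, "(RFC822)")
        if typ != "OK":
            continue
        # For an RFC822 fetch, msg_data[0] is a (header, raw_bytes) tuple
        yield email.message_from_bytes(msg_data[0][1])
```

Each message yielded by iter_messages can then be passed to save_attachments, and the returned lists can be combined by the caller.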

In general, avoid print()ing stuff from inside a worker function; either use logging to print diagnostics in a way that the caller can control, or just return the information and let the caller decide whether or not to present it to the user.
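As an illustration of the logging suggestion, here is a sketch of the same function written as a plain function (no self) that reports each saved file through a module-level logger. The use of os.path.basename and the choice to skip parts without a filename are additions for this sketch, not part of the original answer.

```python
import logging
import os

logger = logging.getLogger(__name__)


def save_attachments(msg, download_folder="/tmp"):
    """Save every part that carries a filename; return the list of paths."""
    att_paths = []
    for part in msg.walk():
        if part.get_content_maintype() == "multipart":
            continue
        filename = part.get_filename()
        if filename is None:
            continue
        # basename() drops any directory components from the header value
        att_path = os.path.join(download_folder, os.path.basename(filename))
        if not os.path.isfile(att_path):
            with open(att_path, "wb") as fp:
                fp.write(part.get_payload(decode=True))
        # Diagnostic only; the caller's logging setup decides if it is shown
        logger.debug("saved attachment %s", att_path)
        att_paths.append(att_path)
    return att_paths
```

A caller that wants to see the diagnostics can enable them with logging.basicConfig(level=logging.DEBUG); otherwise the function stays silent and only returns the paths.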

Not all MIME parts have a Content-Disposition:; in fact, I would expect this to miss the majority of attachments, and possibly extract some inline parts. A better approach is probably to check whether the part has Content-Disposition: attachment, and otherwise proceed to extract if either there is no Content-Disposition: or the Content-Type: is neither text/plain nor text/html. Perhaps see also the question What are the "parts" in a multipart email?
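One possible reading of that heuristic is sketched below. The function name looks_like_attachment and the BODY_TYPES set are hypothetical. The sketch interprets the rule conservatively: text/plain and text/html parts that are not explicitly marked as attachments are treated as message body, and everything else that is not a multipart container is treated as an attachment. Real mail may need a different rule.

```python
# Content types that normally hold the message body (assumption of this sketch)
BODY_TYPES = {"text/plain", "text/html"}


def looks_like_attachment(part):
    """Guess whether a MIME part should be saved as an attachment."""
    # Multipart containers only hold other parts; they have no content to save
    if part.is_multipart():
        return False
    # Explicitly marked attachments are always extracted
    if part.get_content_disposition() == "attachment":
        return True
    # Otherwise extract anything that is not a plain or HTML body part;
    # this covers parts with no Content-Disposition header at all
    return part.get_content_type() not in BODY_TYPES
```

In the loop over msg.walk(), this check would replace the two continue tests on the multipart main type and the Content-Disposition header.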
