How to handle multipart/alternative mail with JavaMail?

Question

I wrote an application which gets all emails from an inbox, filters the emails which contain a specific string and then puts those emails in an ArrayList.

After the emails are put in the list, I do some processing on their subject and content. This all works fine for e-mails without an attachment, but once I started handling e-mails with attachments it no longer worked as expected.

This is my code:

public void getInhoud(Message msg) throws IOException {
    try {
        cont = msg.getContent();
    } catch (MessagingException ex) {
        Logger.getLogger(ReadMailNew.class.getName()).log(Level.SEVERE, null, ex);
    }
    if (cont instanceof String) {
        String body = (String) cont;


    } else if (cont instanceof Multipart) {
        try {
            Multipart mp = (Multipart) msg.getContent();
            int mp_count = mp.getCount();
            for (int b = 0; b < 1; b++) {
                    dumpPart(mp.getBodyPart(b));
            }
        } catch (Exception ex) {
            System.out.println("Exception arise at get Content");
            ex.printStackTrace();
        }
    }
}

public void dumpPart(Part p) throws Exception {
    email = null;
    String contentType = p.getContentType();
    System.out.println("dumpPart" + contentType);
    InputStream is = p.getInputStream();
    if (!(is instanceof BufferedInputStream)) {
        is = new BufferedInputStream(is);
    }
    int c;
    final StringWriter sw = new StringWriter();
    while ((c = is.read()) != -1) {
        sw.write(c);
    }

    if (!sw.toString().contains("<div>")) {
        mpMessage = sw.toString();
        getReferentie(mpMessage);
    }
}

The content from the e-mail is stored in a String.

This code works fine when I read mails without an attachment. But for an e-mail with an attachment, the String also contains HTML code and even the encoded attachment. Eventually I want to store both the attachment and the content of an e-mail, but my first priority is to get just the text without any HTML or attachment encoding.

Now I tried a different approach to handle the different parts:

public void getInhoud(Message msg) throws IOException {
    try {
        Object contt = msg.getContent();

        if (contt instanceof Multipart) {
            System.out.println("Met attachment");
            handleMultipart((Multipart) contt);
        } else {
            handlePart(msg);
            System.out.println("Zonder attachment");

        }
    } catch (MessagingException ex) {
        ex.printStackTrace();
    }
}

public static void handleMultipart(Multipart multipart)
        throws MessagingException, IOException {
    for (int i = 0, n = multipart.getCount(); i < n; i++) {
        handlePart(multipart.getBodyPart(i));
        System.out.println("Count "+n);
    }
}

 public static void handlePart(Part part)
        throws MessagingException, IOException {

    String disposition = part.getDisposition();
    String contentType = part.getContentType();
    if (disposition == null) { // When just body
        System.out.println("Null: " + contentType);
        // Check if plain
        if ((contentType.length() >= 10)
                && (contentType.toLowerCase().substring(
                0, 10).equals("text/plain"))) {
            part.writeTo(System.out);
        } else if ((contentType.length() >= 9)
                && (contentType.toLowerCase().substring(
                0, 9).equals("text/html"))) {
            part.writeTo(System.out);
        } else if ((contentType.length() >= 9)
                && (contentType.toLowerCase().substring(
                0, 9).equals("text/html"))) {
            System.out.println("Ook html gevonden");
            part.writeTo(System.out);
        }else{
            System.out.println("Other body: " + contentType);
            part.writeTo(System.out);
        }
    } else if (disposition.equalsIgnoreCase(Part.ATTACHMENT)) {
        System.out.println("Attachment: " + part.getFileName()
                + " : " + contentType);
    } else if (disposition.equalsIgnoreCase(Part.INLINE)) {
        System.out.println("Inline: "
                + part.getFileName()
                + " : " + contentType);
    } else {
        System.out.println("Other: " + disposition);
    }
}

This is the output of the System.out.println calls:

Null: multipart/alternative; boundary=047d7b6220720b499504ce3786d7
Other body: multipart/alternative; boundary=047d7b6220720b499504ce3786d7
Content-Type: multipart/alternative; boundary="047d7b6220720b499504ce3786d7"

--047d7b6220720b499504ce3786d7
Content-Type: text/plain; charset="ISO-8859-1"

'Text of the message here in normal text'

--047d7b6220720b499504ce3786d7
Content-Type: text/html; charset="ISO-8859-1"
Content-Transfer-Encoding: quoted-printable

'HTML code of the message'

This approach returns the normal text of the e-mail, but also the HTML code of the mail. I really don't understand why this happens. I've googled it, but it seems no one else has this problem.

Any help is appreciated. Thanks!

Recommended answer

I found reading e-mail with the JavaMail library much more difficult than expected. I don't blame the JavaMail API, rather I blame my poor understanding of RFC-822 -- the official definition of Internet e-mail.

As a thought experiment: Consider how complicated an e-mail message can become in the real world. It is possible to "infinitely" embed messages within messages. Each message itself may have multiple attachments (binary or human-readable text). Now imagine how complicated this structure becomes in the JavaMail API after parsing.

A few tips that may help when traversing e-mail with JavaMail:

Message and BodyPart both implement Part. (Multipart itself does not; it is the container object you get back from getContent() on a multipart Part.) Where possible, treat everything as a Part. This will allow generic traversal methods to be built more easily.

These Part methods will help to traverse:

  • String getContentType(): Starts with the MIME type. You may be tempted to treat this as a MIME type (with some hacking/cutting/matching), but don't. Better to only use this method inside the debugger for inspection.
    • Oddly, the MIME type cannot be extracted directly. Instead use boolean isMimeType(String) to match. Read the docs carefully to learn about powerful wildcards, such as "multipart/*".
  • Object getContent(): The result may be a Multipart, which is a container for more Parts.
    • Cast to Multipart, then iterate with a zero-based index using int getCount() and BodyPart getBodyPart(int).
      • Note: BodyPart implements Part
    • To match plain text, try: Part.isMimeType("text/plain")
    • To match HTML, try: Part.isMimeType("text/html")
    • See note above about Microsoft Exchange servers.
  • If Part.ATTACHMENT.equalsIgnoreCase(getDisposition()), then call getInputStream() to get the raw bytes of the attachment.
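
The tips above can be sketched roughly as follows. To keep the example runnable without the JavaMail jar, it uses a small stand-in class, `Node`, which is hypothetical and not part of any library; its `isMimeType` only imitates the behaviour described for `Part.isMimeType` (parameters and case are ignored, `"type/*"` is a wildcard). The class name `MimeWalkSketch`, the helper names, and the sample message are likewise assumptions made for illustration. The point is the shape of the traversal: for `multipart/alternative`, pick one representation (plain text preferred) instead of printing every sub-part.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Sketch only: Node is a hypothetical stand-in for javax.mail.Part. */
final class MimeWalkSketch {

    /** Hypothetical stand-in for a JavaMail Part; not a real library type. */
    static final class Node {
        final String contentType;   // full header value, e.g. "text/plain; charset=UTF-8"
        final String disposition;   // null, "attachment" or "inline"
        final String text;          // body of a text/* leaf, otherwise null
        final List<Node> children;  // sub-parts of a multipart/* node, otherwise empty

        Node(String contentType, String disposition, String text, List<Node> children) {
            this.contentType = contentType;
            this.disposition = disposition;
            this.text = text;
            this.children = children;
        }

        /** Imitates Part.isMimeType: ignores parameters and case, accepts "type/*". */
        boolean isMimeType(String pattern) {
            String base = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
            String p = pattern.trim().toLowerCase(Locale.ROOT);
            if (p.endsWith("/*")) {
                // Keep the trailing slash so "text/*" does not match "textual/x".
                return base.startsWith(p.substring(0, p.length() - 1));
            }
            return base.equals(p);
        }
    }

    static Node leaf(String contentType, String disposition, String text) {
        return new Node(contentType, disposition, text, List.of());
    }

    static Node multipart(String subtype, Node... parts) {
        return new Node("multipart/" + subtype + "; boundary=example", null, null, List.of(parts));
    }

    /** Returns the preferred body text of a part, or null if it has none. */
    static String bodyText(Node part) {
        if ("attachment".equalsIgnoreCase(part.disposition)) {
            return null; // attachments are collected separately, never used as body text
        }
        if (part.isMimeType("text/plain") || part.isMimeType("text/html")) {
            return part.text;
        }
        if (part.isMimeType("multipart/alternative")) {
            // The children carry the same content in different formats:
            // return the plain-text one if present, otherwise the first usable one.
            String fallback = null;
            for (Node child : part.children) {
                String found = bodyText(child);
                if (found == null) continue;
                if (child.isMimeType("text/plain")) return found;
                if (fallback == null) fallback = found;
            }
            return fallback;
        }
        if (part.isMimeType("multipart/*")) {
            // Any other multipart (e.g. mixed): the first child that yields text wins.
            for (Node child : part.children) {
                String found = bodyText(child);
                if (found != null) return found;
            }
        }
        return null;
    }

    /** Walks the whole tree and gathers every part marked as an attachment. */
    static void collectAttachments(Node part, List<Node> out) {
        if ("attachment".equalsIgnoreCase(part.disposition)) {
            out.add(part);
            return;
        }
        for (Node child : part.children) {
            collectAttachments(child, out);
        }
    }

    /** Made-up message: multipart/mixed holding multipart/alternative plus one attachment. */
    static Node sample() {
        Node alternative = multipart("alternative",
                leaf("text/plain; charset=\"ISO-8859-1\"", null, "Text of the message"),
                leaf("TEXT/HTML; charset=\"ISO-8859-1\"", null, "<div>Text of the message</div>"));
        Node pdf = leaf("application/pdf; name=\"report.pdf\"", "attachment", null);
        return multipart("mixed", alternative, pdf);
    }

    public static void main(String[] args) {
        Node mail = sample();
        System.out.println(bodyText(mail));
        List<Node> attachments = new ArrayList<>();
        collectAttachments(mail, attachments);
        System.out.println(attachments.size() + " attachment(s)");
    }
}
```

With the real API the same structure should carry over: call `isMimeType(...)` and `getDisposition()` on the `Part`, and for multipart content cast `getContent()` to `Multipart` and iterate with `getCount()` and `getBodyPart(int)`.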

Finally, I found the official Javadocs exclude everything in the com.sun.mail package (and possibly more). If you need these, read the code directly, or generate the unfiltered Javadocs by downloading the source and running mvn javadoc:javadoc in the mail module of the project.
