JavaMail IMAP over SSL quite slow - Bulk fetching multiple messages

Question

I am currently trying to use JavaMail to fetch emails from IMAP servers (Gmail and others). Basically, my code works: I can get the headers, body contents and so on. My problem is the following: against a plain IMAP server (no SSL), it takes about 1-2 ms to process a message. Against an IMAPS server (hence with SSL, such as Gmail), it takes around 250 ms per message. I ONLY measure the time spent processing the messages (the connection, handshake and such are NOT taken into account).

I know that since this is SSL, the data is encrypted. However, decryption alone should not take that much time, should it?

I have tried setting a higher ServerCacheSize value and a higher connectionpoolsize, but I am seriously running out of ideas. Has anyone faced this problem, and hopefully solved it?

My fear is that the JavaMail API uses a different connection each time it fetches a mail from the IMAPS server (which would involve the handshake overhead every time). If so, is there a way to override this behavior?

Here is my code (quite standard), called from my Main class:

 public static int connectTest(String SSL, String user, String pwd, String host) throws IOException,
                                                                               ProtocolException,
                                                                               GeneralSecurityException {

    Properties props = System.getProperties();
    props.setProperty("mail.store.protocol", SSL);
    props.setProperty("mail.imaps.ssl.trust", host);
    props.setProperty("mail.imaps.connectionpoolsize", "10");

    try {


        Session session = Session.getDefaultInstance(props, null);

        // session.setDebug(true);

        Store store = session.getStore(SSL);
        store.connect(host, user, pwd);      
        Folder inbox = store.getFolder("INBOX");

        inbox.open(Folder.READ_ONLY);                
        int numMess = inbox.getMessageCount();
        Message[] messages = inbox.getMessages();

        for (Message m : messages) {

            m.getAllHeaders();
            m.getContent();
        }

        inbox.close(false);
        store.close();
        return numMess;
    } catch (MessagingException e) {
        e.printStackTrace();
        System.exit(2);
    }
    return 0;
}

Thanks in advance.

Recommended answer

After a lot of work, and with assistance from the people at JavaMail, the source of this "slowness" turned out to be the FETCH behavior in the API. Indeed, as pjaol said, we go back to the server each time we need information (a header or the message content) for a message.

While FetchProfile allows us to bulk fetch header information or flags for many messages, getting the contents of multiple messages is NOT directly possible.

Luckily, we can write our own IMAP command to work around this "limitation" (the API was designed this way to avoid out-of-memory errors: fetching every mail into memory in one command can be quite heavy).
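To see why the number of server round trips matters more than decryption, here is a back-of-the-envelope sketch. The class `RoundTripEstimate`, its method names and every number used with it are illustrative assumptions, not measurements from this thread; it only models time spent waiting on the network and ignores transfer time.

```java
/** Hypothetical helper: rough model of latency-bound fetching. */
final class RoundTripEstimate {

    /** Milliseconds spent waiting on the network for the given number of round trips. */
    static long waitMillis(long roundTrips, long rttMillis) {
        return roundTrips * rttMillis;
    }

    /** Round trips when every message triggers its own requests (e.g. headers, then content). */
    static long perMessageRoundTrips(int messages, int requestsPerMessage) {
        return (long) messages * requestsPerMessage;
    }

    /** Round trips when messages are fetched in batches of batchSize (ceiling division). */
    static long batchedRoundTrips(int messages, int batchSize) {
        return ((long) messages + batchSize - 1) / batchSize;
    }
}
```

With an assumed 100 ms round trip and two requests per message, 1000 messages cost `waitMillis(perMessageRoundTrips(1000, 2), 100)` = 200000 ms of waiting, while batches of 500 cost `waitMillis(batchedRoundTrips(1000, 500), 100)` = 200 ms.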

Here is my code:

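import java.io.ByteArrayInputStream;
import java.util.Properties;

import javax.mail.MessagingException;
import javax.mail.Session;
import javax.mail.internet.MimeMessage;

import com.sun.mail.iap.Argument;
import com.sun.mail.iap.ProtocolException;
import com.sun.mail.iap.Response;
import com.sun.mail.imap.IMAPFolder;
import com.sun.mail.imap.protocol.BODY;
import com.sun.mail.imap.protocol.FetchResponse;
import com.sun.mail.imap.protocol.IMAPProtocol;
import com.sun.mail.imap.protocol.IMAPResponse;
import com.sun.mail.imap.protocol.UID;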

public class CustomProtocolCommand implements IMAPFolder.ProtocolCommand {
    /** Index on server of first mail to fetch **/
    int start;

    /** Index on server of last mail to fetch **/
    int end;

    public CustomProtocolCommand(int start, int end) {
        this.start = start;
        this.end = end;
    }

    @Override
    public Object doCommand(IMAPProtocol protocol) throws ProtocolException {
        Argument args = new Argument();
        args.writeString(Integer.toString(start) + ":" + Integer.toString(end));
        args.writeString("BODY[]");
        Response[] r = protocol.command("FETCH", args);
        Response response = r[r.length - 1];
        if (response.isOK()) {
            Properties props = new Properties();
            props.setProperty("mail.store.protocol", "imap");
            props.setProperty("mail.mime.base64.ignoreerrors", "true");
            props.setProperty("mail.imap.partialfetch", "false");
            props.setProperty("mail.imaps.partialfetch", "false");
            Session session = Session.getInstance(props, null);

            FetchResponse fetch;
            BODY body;
            MimeMessage mm;
            ByteArrayInputStream is = null;

            // last response is only result summary: not contents
            for (int i = 0; i < r.length - 1; i++) {
                // only FETCH responses carry a message body; skip other untagged responses
                if (r[i] instanceof FetchResponse) {
                    fetch = (FetchResponse) r[i];
                    body = (BODY) fetch.getItem(0);
                    is = body.getByteArrayInputStream();
                    try {
                        mm = new MimeMessage(session, is);
                        Contents.getContents(mm, i);
                    } catch (MessagingException e) {
                        e.printStackTrace();
                    }
                }
            }
        }
        // dispatch remaining untagged responses
        protocol.notifyResponseHandlers(r);
        protocol.handleResult(response);

        return "" + (r.length - 1);
    }
}

The getContents(MimeMessage mm, int i) function is a classic function that recursively writes the contents of the message to a file (many examples are available on the net).

To avoid out-of-memory errors, I simply set a maxDoc and a maxSize limit (chosen arbitrarily, and they can probably be improved!), used as follows:

public int efficientGetContents(IMAPFolder inbox, Message[] messages)
        throws MessagingException {
    FetchProfile fp = new FetchProfile();
    fp.add(FetchProfile.Item.FLAGS);
    fp.add(FetchProfile.Item.ENVELOPE);
    inbox.fetch(messages, fp);
    int index = 0;
    int nbMessages = messages.length;
    final int maxDoc = 5000;
    final long maxSize = 100000000; // 100 MB

    // Message numbers limit to fetch
    int start;
    int end;

    while (index < nbMessages) {
        start = messages[index].getMessageNumber();
        int docs = 0;
        int totalSize = 0;
        boolean noskip = true; // There are no jumps in the message numbers
                                           // list
        boolean notend = true;
        // Until we reach one of the limits
        while (docs < maxDoc && totalSize < maxSize && noskip && notend) {
            docs++;
            totalSize += messages[index].getSize();
            index++;
            if (notend = (index < nbMessages)) {
                noskip = (messages[index - 1].getMessageNumber() + 1 == messages[index]
                        .getMessageNumber());
            }
        }

        end = messages[index - 1].getMessageNumber();
        inbox.doCommand(new CustomProtocolCommand(start, end));

        System.out.println("Fetching contents for " + start + ":" + end);
        System.out.println("Size fetched = " + (totalSize / 1000000)
                + " Mo");

    }

    return nbMessages;
}
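The range-building loop above can be isolated from JavaMail so it is easier to test. The following is a minimal sketch under that assumption: `FetchRangePlanner`, `Range` and `plan` are hypothetical names, and the message numbers and sizes are passed in as plain arrays instead of being read from `Message` objects. It follows the same rule as the loop above: a range is closed when the document limit or size limit is reached, when the message numbers stop being consecutive, or when the input ends.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: groups message numbers into contiguous FETCH ranges. */
final class FetchRangePlanner {

    /** An inclusive range of message numbers, printed in IMAP form, e.g. 1:5000. */
    static final class Range {
        final int start;
        final int end;

        Range(int start, int end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return start + ":" + end;
        }
    }

    /**
     * numbers must be ascending message numbers; sizes[i] is the size in bytes
     * of the message numbers[i]. Each range always holds at least one message,
     * so the message that crosses maxBytes is still included in its range.
     */
    static List<Range> plan(int[] numbers, long[] sizes, int maxDocs, long maxBytes) {
        if (numbers.length != sizes.length) {
            throw new IllegalArgumentException("numbers and sizes differ in length");
        }
        List<Range> ranges = new ArrayList<>();
        int index = 0;
        while (index < numbers.length) {
            int start = numbers[index];
            int docs = 0;
            long total = 0; // long, so large mailboxes cannot overflow the counter
            do {
                docs++;
                total += sizes[index];
                index++;
            } while (index < numbers.length
                    && docs < maxDocs
                    && total < maxBytes
                    && numbers[index] == numbers[index - 1] + 1);
            ranges.add(new Range(start, numbers[index - 1]));
        }
        return ranges;
    }
}
```

Each returned `Range` corresponds to one `start`/`end` pair that would be handed to the custom command.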

Note that here I am using message numbers, which are unstable (they change if messages are deleted from the server). A better method would be to use UIDs! You would then change the command from FETCH to UID FETCH.
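When working with UIDs, the messages of interest are often not contiguous. IMAP sequence-set syntax allows ranges and single values separated by commas (for example `1:3,5,7:9`), so one command can cover them all. Below is a minimal sketch of building such a string; `SequenceSets` and `compact` are hypothetical names, not part of JavaMail.

```java
import java.util.Arrays;

/** Hypothetical helper: renders UIDs as a compact IMAP sequence-set string. */
final class SequenceSets {

    /** Sorts the UIDs, drops duplicates and collapses consecutive runs into start:end. */
    static String compact(long[] uids) {
        if (uids.length == 0) {
            throw new IllegalArgumentException("at least one UID is required");
        }
        long[] sorted = uids.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);

        StringBuilder sb = new StringBuilder();
        int i = 0;
        while (i < sorted.length) {
            long start = sorted[i];
            long end = start;
            // extend the run while the next UID is a duplicate or the direct successor
            while (i + 1 < sorted.length && sorted[i + 1] <= end + 1) {
                end = sorted[i + 1];
                i++;
            }
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(start);
            if (end != start) {
                sb.append(':').append(end);
            }
            i++;
        }
        return sb.toString();
    }
}
```

The resulting string would take the place of the `start:end` argument when the command is changed to UID FETCH.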

Hope this helps!
