How do I convert a formatted email into plain text in Java?

Question

I have a program that forwards an email as a text message to a customer.

Now a simple reply to an email, with only the text "420" in its message body, gets converted to:

    <div dir="ltr">420</div><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Aug 8, 2013 at 4:14 PM, <span dir="ltr">&lt; 3:50 AM+11111111111: (2/6)<a href="mailto:xxxxxx@gmail.com" target="_blank">xxxxxx@gmail.com</a>&gt;</span> wrote:<br> <blockquote class="gmail_quot 3:50 AM +14411111111: (3/6)e" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">414<div class="HOEnZb"><div class="h5"><br>DO_NOT_REPLY:This i 3:50 AM
 : (4/6)s an email notification that you have received a text message from a customer in . If you reply to this email, a text message or 3:50 AM
 (5/6)email message will NOT go to the customer. Access the customer text message to send a reply. </div></div></blockquote></div> 3:50 AM
    (6/6)<br></div>

How do I remove all formatting from the text and forward only the message body?

Recommended answer

import java.io.IOException;
import java.io.StringReader;

import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML.Attribute;
import javax.swing.text.html.HTML.Tag;
import javax.swing.text.html.HTMLEditorKit.Parser;
import javax.swing.text.html.HTMLEditorKit.ParserCallback;
import javax.swing.text.html.parser.ParserDelegator;

public class ExtractEmailBody
{
    public static void main(String[] args) throws IOException
    {
        String email = "<div dir=\"ltr\">420</div><div class=\"gmail_extra\"><br><br><div class=\"gmail_quote\">On Thu, Aug 8, 2013 at 4:14 PM, <span dir=\"ltr\">&lt; 3:50 AM+11111111111: (2/6)<a href=\"mailto:xxxxxx@gmail.com\" target=\"_blank\">xxxxxx@gmail.com</a>&gt;</span> wrote:<br> <blockquote class=\"gmail_quot 3:50 AM +14411111111: (3/6)e\" style=\"margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex\">414<div class=\"HOEnZb\"><div class=\"h5\"><br>DO_NOT_REPLY:This i 3:50 AM" +
                ": (4/6)s an email notification that you have received a text message from a customer in Kaarma. If you reply to this email, a text message or 3:50 AM" +
                "(5/6)email message will NOT go to the customer. Access the customer text message to send a reply. </div></div></blockquote></div> 3:50 AM" +
                   "(6/6)<br></div>";

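        // Collects the text found inside a <div dir="ltr"> element,
        // which is where the reply text sits in the sample message.
        class EmailCallback extends ParserCallback
        {
            private final StringBuilder body_ = new StringBuilder();
            private boolean divStarted_;

            public String getBody()
            {
                return body_.toString();
            }

            @Override
            public void handleStartTag(Tag t, MutableAttributeSet a, int pos)
            {
                // Start capturing only at a <div> whose dir attribute is "ltr".
                if (t.equals(Tag.DIV) && "ltr".equals(a.getAttribute(Attribute.DIR)))
                {
                    divStarted_ = true;
                }
            }

            @Override
            public void handleEndTag(Tag t, int pos)
            {
                // Any closing </div> stops capturing; nested divs are not tracked.
                if (t.equals(Tag.DIV))
                {
                    divStarted_ = false;
                }
            }

            @Override
            public void handleText(char[] data, int pos)
            {
                if (divStarted_)
                {
                    // Append rather than overwrite, so text split across
                    // several chunks inside the div is not lost.
                    body_.append(data);
                }
            }
        }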
        EmailCallback callback = new EmailCallback();
        Parser parser = new ParserDelegator();
        StringReader reader = new StringReader(email);
        parser.parse(reader, callback, true);
        reader.close();
        System.out.println(callback.getBody());
    }
}
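
The answer above keys on the `dir="ltr"` attribute of the first div, which is specific to this sample. As a hedged variation, the sketch below uses the same `ParserDelegator` approach but keeps all text until the first quoted section begins. It rests on assumptions that are not in the original answer: that the reply is top-posted, and that the quoted part starts at a `<blockquote>` or at an element whose class begins with `gmail_` (as in the sample message). The class name `ReplyTextExtractor` and the method `extractReply` are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

public class ReplyTextExtractor
{
    /**
     * Returns the text that appears before the first quoted section.
     * Assumption: the quote starts at a blockquote or at an element
     * whose class attribute begins with "gmail_".
     */
    public static String extractReply(String html)
    {
        final StringBuilder text = new StringBuilder();

        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback()
        {
            private boolean inQuote_;

            @Override
            public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos)
            {
                Object cssClass = a.getAttribute(HTML.Attribute.CLASS);
                if (t == HTML.Tag.BLOCKQUOTE
                        || (cssClass != null && cssClass.toString().startsWith("gmail_")))
                {
                    // Everything from here on is treated as quoted history.
                    inQuote_ = true;
                }
            }

            @Override
            public void handleText(char[] data, int pos)
            {
                if (inQuote_)
                {
                    return;
                }
                // Separate text chunks from different elements with a space.
                if (text.length() > 0)
                {
                    text.append(' ');
                }
                text.append(data);
            }
        };

        try (StringReader reader = new StringReader(html))
        {
            new ParserDelegator().parse(reader, callback, true);
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
        return text.toString().trim();
    }

    public static void main(String[] args)
    {
        // Shortened version of the sample message from the question.
        String email = "<div dir=\"ltr\">420</div><div class=\"gmail_extra\"><br><br>"
                + "<div class=\"gmail_quote\">On Thu, Aug 8, 2013 at 4:14 PM, someone wrote:<br>"
                + "<blockquote>414</blockquote></div></div>";
        System.out.println(extractReply(email));
    }
}
```

Mail clients other than Gmail mark quoted text differently, so the `gmail_` check would need adjusting for them. If the message is available as a MIME multipart, reading its `text/plain` part directly may avoid HTML parsing altogether.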
