如何获取.MSG文件的MIME类型? [英] How to get the MIME Type of a .MSG file?

查看:150
本文介绍了如何获取.MSG文件的MIME类型?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我尝试了这些方法来查找文件的MIME类型...

Path source = Paths
                .get("C://Users/akash/Desktop/FW Internal release of MSTClient-Server5.02.04_24.msg");
        System.out.println(Files.probeContentType(source));

上面的代码返回null ...
而且,如果我使用Apache的TIKA API来获取MIME类型,那么它将以文本/纯文本形式给出...

但是我希望结果为application/vnd.ms-outlook

更新

我还使用MIME-Util.jar如下代码...

MimeUtil2 mimeUtil = new MimeUtil2();
        mimeUtil.registerMimeDetector("eu.medsea.mimeutil.detector.MagicMimeMimeDetector");
        RandomAccessFile file1 = new RandomAccessFile(
                "C://Users/akash/Desktop/FW Internal release of MSTClient-Server5.02.04_24.msg",
                "r");
        System.out.println(file1.length());
        byte[] file = new byte[624128];
        file1.read(file, 0, 624128);
        String mimeType = MimeUtil2.getMostSpecificMimeType(mimeUtil.getMimeTypes(file)).toString();

这给我输出为application/msword

更新:

Tika API超出范围,因为它太大而无法包含在项目中...

那我怎么找到MIME类型呢?

解决方案

我尝试了一些可能的方法,使用tika可以得到您期望的结果,但我看不到您使用的代码,因此无法再次检查./p>

我尝试了不同的方式,而不是代码片段中的全部方式:

  1. Java 7 Files.probeContentType(path)
  2. 从文件名和内容类型猜测中
  3. URLConnection哑剧检测
  4. JDK 6 JAF API javax.activation.MimetypesFileTypeMap
  5. MimeUtil具有我发现的所有MimeDetector子类
  6. Apache Tika
  7. Apache POI暂存器

这里是测试班:

import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.net.URLConnection;
import java.util.Collection;

import javax.activation.MimetypesFileTypeMap;

import org.apache.tika.detect.Detector;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.mime.MediaType;
import org.apache.tika.parser.AutoDetectParser;

import eu.medsea.mimeutil.MimeUtil;

public class FindMime {

    public static void main(String[] args) {
        File file = new File("C:\\Users\\qwerty\\Desktop\\test.msg");

        System.out.println("urlConnectionGuess " + urlConnectionGuess(file));

        System.out.println("fileContentGuess " + fileContentGuess(file));

        MimetypesFileTypeMap mimeTypesMap = new MimetypesFileTypeMap();

        System.out.println("mimeTypesMap.getContentType " + mimeTypesMap.getContentType(file));

        System.out.println("mimeutils " + mimeutils(file));

        System.out.println("tika " + tika(file));

    }

    private static String mimeutils(File file) {
        try {
            MimeUtil.registerMimeDetector("eu.medsea.mimeutil.detector.MagicMimeMimeDetector");
            MimeUtil.registerMimeDetector("eu.medsea.mimeutil.detector.ExtensionMimeDetector");
//          MimeUtil.registerMimeDetector("eu.medsea.mimeutil.detector.OpendesktopMimeDetector");
            MimeUtil.registerMimeDetector("eu.medsea.mimeutil.detector.WindowsRegistryMimeDetector");
//          MimeUtil.registerMimeDetector("eu.medsea.mimeutil.detector.TextMimeDetector");
            InputStream is = new BufferedInputStream(new FileInputStream(file));
            Collection<?> mimeTypes = MimeUtil.getMimeTypes(is);
            return mimeTypes.toString();
        } catch (Exception e) {
            // TODO: handle exception
        }
        return null;
    }

    private static String tika(File file) {
        try {
            InputStream is = new BufferedInputStream(new FileInputStream(file));
            AutoDetectParser parser = new AutoDetectParser();
            Detector detector = parser.getDetector();
            Metadata md = new Metadata();
            md.add(Metadata.RESOURCE_NAME_KEY, "test.msg");
            MediaType mediaType = detector.detect(is, md);
            return mediaType.toString();
        } catch (Exception e) {
            // TODO: handle exception
        }
        return null;
    }

    private static String urlConnectionGuess(File file) {
        String mimeType = URLConnection.guessContentTypeFromName(file.getName());
        return mimeType;
    }

    private static String fileContentGuess(File file) {
        try {
            InputStream is = new BufferedInputStream(new FileInputStream(file));
            return URLConnection.guessContentTypeFromStream(is);
        } catch (Exception e) {
            e.printStackTrace();
            return null;
        }
    }

}

这是输出:

urlConnectionGuess null
fileContentGuess null
mimeTypesMap.getContentType application/octet-stream
mimeutils application/msword,application/x-hwp
tika application/vnd.ms-outlook

已更新,我添加了此方法以通过Tika测试其他方式:

private static void tikaMore(File file) {
    Tika defaultTika = new Tika();
    Tika mimeTika = new Tika(new MimeTypes());
    Tika typeTika = new Tika(new TypeDetector());
    try {
        System.out.println(defaultTika.detect(file));
        System.out.println(mimeTika.detect(file));
        System.out.println(typeTika.detect(file));
    } catch (Exception e) {
        // TODO: handle exception
    }
}

使用不含扩展名的msg文件进行了测试:

application/vnd.ms-outlook
application/octet-stream
application/octet-stream

使用重命名为msg的txt文件进行了测试:

text/plain
text/plain
application/octet-stream

在这种情况下,使用空构造函数的最简单方法似乎是最可靠的.

更新,您可以使用Apache POI暂存器进行自己的检查,例如,这是一种获取消息的哑剧的简单实现,如果文件格式不正确(通常为org.apache.poi.poifs.filesystem.NotOLE2FileException: Invalid header signature):

import org.apache.poi.hsmf.MAPIMessage;

public class PoiMsgMime {

    public String getMessageMime(String fileName) {
        try {
            new MAPIMessage(fileName);
            return "application/vnd.ms-outlook";
        } catch (Exception e) {
            return null;
        }
    }
}

I have tried these ways of finding the MIME type of a file...

Path source = Paths
                .get("C://Users/akash/Desktop/FW Internal release of MSTClient-Server5.02.04_24.msg");
        System.out.println(Files.probeContentType(source));

The above code returns null...
And if I use the TIKA API from Apache to get the MIME type then it gives it as text/plain...

But I want the result as application/vnd.ms-outlook

UPDATE

I also used MIME-Util.jar as follows with code...

MimeUtil2 mimeUtil = new MimeUtil2();
        mimeUtil.registerMimeDetector("eu.medsea.mimeutil.detector.MagicMimeMimeDetector");
        RandomAccessFile file1 = new RandomAccessFile(
                "C://Users/akash/Desktop/FW Internal release of MSTClient-Server5.02.04_24.msg",
                "r");
        System.out.println(file1.length());
        byte[] file = new byte[624128];
        file1.read(file, 0, 624128);
        String mimeType = MimeUtil2.getMostSpecificMimeType(mimeUtil.getMimeTypes(file)).toString();

This gives me output as application/msword

UPDATE:

Tika API is out of scope as it is too large to include in the project...

So how can I find the MIME type?

解决方案

I tried some of the possible ways and using tika gives the result you expected, I don't see the code you used so i cannot double check it.

I tried different ways, not all in the code snippet:

  1. Java 7 Files.probeContentType(path)
  2. URLConnection mime detection from file name and content type guessing
  3. JDK 6 JAF API javax.activation.MimetypesFileTypeMap
  4. MimeUtil with all available subclass of MimeDetector I found
  5. Apache Tika
  6. Apache POI scratchpad

Here the test class:

import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.net.URLConnection;
import java.util.Collection;

import javax.activation.MimetypesFileTypeMap;

import org.apache.tika.detect.Detector;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.mime.MediaType;
import org.apache.tika.parser.AutoDetectParser;

import eu.medsea.mimeutil.MimeUtil;

public class FindMime {

    public static void main(String[] args) {
        File file = new File("C:\\Users\\qwerty\\Desktop\\test.msg");

        System.out.println("urlConnectionGuess " + urlConnectionGuess(file));

        System.out.println("fileContentGuess " + fileContentGuess(file));

        MimetypesFileTypeMap mimeTypesMap = new MimetypesFileTypeMap();

        System.out.println("mimeTypesMap.getContentType " + mimeTypesMap.getContentType(file));

        System.out.println("mimeutils " + mimeutils(file));

        System.out.println("tika " + tika(file));

    }

    private static String mimeutils(File file) {
        try {
            MimeUtil.registerMimeDetector("eu.medsea.mimeutil.detector.MagicMimeMimeDetector");
            MimeUtil.registerMimeDetector("eu.medsea.mimeutil.detector.ExtensionMimeDetector");
//          MimeUtil.registerMimeDetector("eu.medsea.mimeutil.detector.OpendesktopMimeDetector");
            MimeUtil.registerMimeDetector("eu.medsea.mimeutil.detector.WindowsRegistryMimeDetector");
//          MimeUtil.registerMimeDetector("eu.medsea.mimeutil.detector.TextMimeDetector");
            InputStream is = new BufferedInputStream(new FileInputStream(file));
            Collection<?> mimeTypes = MimeUtil.getMimeTypes(is);
            return mimeTypes.toString();
        } catch (Exception e) {
            // TODO: handle exception
        }
        return null;
    }

    private static String tika(File file) {
        try {
            InputStream is = new BufferedInputStream(new FileInputStream(file));
            AutoDetectParser parser = new AutoDetectParser();
            Detector detector = parser.getDetector();
            Metadata md = new Metadata();
            md.add(Metadata.RESOURCE_NAME_KEY, "test.msg");
            MediaType mediaType = detector.detect(is, md);
            return mediaType.toString();
        } catch (Exception e) {
            // TODO: handle exception
        }
        return null;
    }

    private static String urlConnectionGuess(File file) {
        String mimeType = URLConnection.guessContentTypeFromName(file.getName());
        return mimeType;
    }

    private static String fileContentGuess(File file) {
        try {
            InputStream is = new BufferedInputStream(new FileInputStream(file));
            return URLConnection.guessContentTypeFromStream(is);
        } catch (Exception e) {
            e.printStackTrace();
            return null;
        }
    }

}

and this is the output:

urlConnectionGuess null
fileContentGuess null
mimeTypesMap.getContentType application/octet-stream
mimeutils application/msword,application/x-hwp
tika application/vnd.ms-outlook

Updated I added this method to test other ways with Tika:

private static void tikaMore(File file) {
    Tika defaultTika = new Tika();
    Tika mimeTika = new Tika(new MimeTypes());
    Tika typeTika = new Tika(new TypeDetector());
    try {
        System.out.println(defaultTika.detect(file));
        System.out.println(mimeTika.detect(file));
        System.out.println(typeTika.detect(file));
    } catch (Exception e) {
        // TODO: handle exception
    }
}

tested with a msg file without extension:

application/vnd.ms-outlook
application/octet-stream
application/octet-stream

tested with a txt file renamed to msg:

text/plain
text/plain
application/octet-stream

seems that the most simple way by using the empty constructor is the most reliable in this case.

Update you can make your own checker using Apache POI scratchpad, for example this is a simple implementation to get the mime of the message or null if the file is not in the proper format (usually org.apache.poi.poifs.filesystem.NotOLE2FileException: Invalid header signature):

import org.apache.poi.hsmf.MAPIMessage;

public class PoiMsgMime {

    public String getMessageMime(String fileName) {
        try {
            new MAPIMessage(fileName);
            return "application/vnd.ms-outlook";
        } catch (Exception e) {
            return null;
        }
    }
}

这篇关于如何获取.MSG文件的MIME类型?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆