Tika 1.13 RuntimeException

Question

I recently updated my existing Tika project to use Tika 1.13 instead of 1.10. The only change I made was bumping the dependency version from 1.10 to 1.13. The project builds successfully, yet whenever I try to run the application I get this exception:

java.lang.RuntimeException: Unable to parse the default media type registry
    at org.apache.tika.mime.MimeTypes.getDefaultMimeTypes(MimeTypes.java:580)
    at org.apache.tika.config.TikaConfig.getDefaultMimeTypes(TikaConfig.java:69)
    at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:218)
    at org.apache.tika.config.TikaConfig.getDefaultConfig(TikaConfig.java:341)
    at org.apache.tika.parser.AutoDetectParser.<init>(AutoDetectParser.java:51)
    at com.app.tikamanager.MetaParser.<init>(MetaParser.java:54)
    at com.app.services.MyService.HandleItemInThread(IntelligentDocumentsService.java:260)
    at com.app.intelligentservicebase.ItemHandlerThread.run(ItemHandlerThread.java:41)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.tika.mime.MimeTypeException: Invalid type configuration
    at org.apache.tika.mime.MimeTypesReader.read(MimeTypesReader.java:126)
    at org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:64)
    at org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:93)
    at org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:170)
    at org.apache.tika.mime.MimeTypes.getDefaultMimeTypes(MimeTypes.java:577)
    ... 10 more
Caused by: org.xml.sax.SAXNotRecognizedException: http://javax.xml.XMLConstants/feature/secure-processing
    at org.apache.xerces.parsers.AbstractSAXParser.setFeature(Unknown Source)
    at org.apache.xerces.jaxp.SAXParserImpl.setFeatures(Unknown Source)
    at org.apache.xerces.jaxp.SAXParserImpl.<init>(Unknown Source)
    at org.apache.xerces.jaxp.SAXParserFactoryImpl.newSAXParserImpl(Unknown Source)
    at org.apache.xerces.jaxp.SAXParserFactoryImpl.setFeature(Unknown Source)
    at org.apache.tika.mime.MimeTypesReader.read(MimeTypesReader.java:119)
    ... 14 more

The exception is thrown from the constructor of my MetaParser class. The only thing there is the initialization of the AutoDetectParser:

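import org.apache.tika.parser.AutoDetectParser;

public class MetaParser {
    private final AutoDetectParser _tikaExtractor;

    public MetaParser() {
        // Building the default TikaConfig inside this constructor is where the exception surfaces.
        _tikaExtractor = new AutoDetectParser();
    }
}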

I am running the application on Ubuntu 14.04 with Oracle JDK 1.8.0_91-b14.

I looked online and found this exception mentioned a couple of times. One suggested fix was to install OpenJDK, but that was for an old version of Tika, and since the old version worked fine with the same JDK, I don't think that is the problem.

Is there something I need to do or initialize before calling the AutoDetectParser constructor?

Recommended answer

Promoting comments to an answer: you have a very old version of Xerces on your classpath. Your JVM is picking it as the default XML parser, so when Tika says "Hi JVM, can I have a safe XML parser?", the request fails.

(Between 1.10 and 1.13, Tika improved how it does XML parsing, including setting safer defaults, which is why this has started happening.)
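To confirm this diagnosis outside of Tika, a small standalone check along the following lines may help. It is only a sketch: the class name SecureProcessingCheck and the method supportsSecureProcessing are made up for illustration, and it approximates the request seen in the stack trace (a secure-processing feature set on the default SAXParserFactory) rather than reproducing Tika's own code.

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of Tika: asks the default SAX parser factory
// for the same secure-processing feature that appears in the stack trace.
public class SecureProcessingCheck {

    /** Returns true if the default SAXParserFactory accepts the secure-processing feature. */
    static boolean supportsSecureProcessing() {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        try {
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            return true;
        } catch (ParserConfigurationException | SAXException e) {
            // SAXNotRecognizedException (a SAXException) lands here when the
            // selected parser does not know the feature, as in the question.
            return false;
        }
    }

    public static void main(String[] args) {
        // The implementation class name shows which parser the JVM picked.
        System.out.println("SAXParserFactory implementation: "
                + SAXParserFactory.newInstance().getClass().getName());
        System.out.println("secure-processing supported: " + supportsSecureProcessing());
    }
}
```

Run it with the same classpath as the failing application. An implementation name starting with org.apache.xerces (as in the stack trace) together with a false result would point at the old Xerces jar.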

You need to either remove your old Xerces jars, so that the JVM-supplied XML parser is used, or replace them with a more recent Xerces version.

You may also find some of the advice in the question Error unmarshalling XML in Java 8 "secure-processing org.xml.sax.SAXNotRecognizedException" helpful, especially if you're struggling to locate the pesky old Xerces jar in your build!
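One way to locate that jar from inside the running application is to ask the JVM where the parser factory class was loaded from. The following is a minimal sketch, assuming no security manager blocks access to the protection domain; ParserLocator and locationOf are hypothetical names introduced for this example.

```java
import java.security.CodeSource;
import javax.xml.parsers.SAXParserFactory;

// Hypothetical helper: reports which jar (if any) supplied the SAX parser factory.
public class ParserLocator {

    /**
     * Describes where a class was loaded from. Classes loaded by the JVM's
     * bootstrap loader report no code source, so they are labelled as JDK runtime.
     */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "JDK runtime (no separate jar)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        Class<?> impl = SAXParserFactory.newInstance().getClass();
        System.out.println(impl.getName() + " loaded from " + locationOf(impl));
    }
}
```

If the printed location is a jar path, that jar is the one to remove or upgrade. If the build uses Maven (an assumption, the question does not say), running mvn dependency:tree can then show which dependency pulls it in.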
