Apache Tika ArchiveStreamFactory.detect error

Problem description

I'm using Java with Apache Tika 1.18 to convert some files to TXT. When I try to use AutoDetectParser(), I get this error:

[ERROR ] Error occurred during error handling, give up! org.apache.commons.compress.archivers.ArchiveStreamFactory.detect(Ljava/io/InputStream;)Ljava/lang/String;
[ERROR ] SRVE0777E: Exception thrown by application class 'org.apache.cxf.service.invoker.AbstractInvoker.createFault:162'
org.apache.cxf.interceptor.Fault: org.apache.commons.compress.archivers.ArchiveStreamFactory.detect(Ljava/io/InputStream;)Ljava/lang/String;
    at org.apache.cxf.service.invoker.AbstractInvoker.createFault(AbstractInvoker.java:162)
    at [internal classes]
Caused by: java.lang.NoSuchMethodError: org.apache.commons.compress.archivers.ArchiveStreamFactory.detect(Ljava/io/InputStream;)Ljava/lang/String;

I searched the internet and found that this error is related to a wrong version of commons-compress: the method apparently doesn't exist in commons-compress versions before 1.14. In my case the version is 1.16.1.

After building the project, I checked the libraries packaged inside it, and only the correct version is there.

I'm using IBM Liberty 18.0, and I'm really lost about how to solve this problem.

When I use a specific parser, like PDFParser(), everything works fine!

Any ideas?

Thanks

Recommended answer

Source of the issue:

Spark 2.x distributions include old versions of commons-compress, while the Tika library depends on version 1.18 of the commons-compress library.
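
One way to confirm this kind of conflict is to ask the running JVM where it actually loaded ArchiveStreamFactory from, and whether that copy has the detect(InputStream) method named in the NoSuchMethodError. The following is only a minimal sketch: the class name ClasspathProbe and its helper methods are made up for illustration, and it uses only standard Java reflection. It would need to run inside the same application (same class loader) that throws the error, otherwise it may report a different jar.

```java
import java.io.InputStream;
import java.security.CodeSource;

// Hypothetical helper, not part of Tika or commons-compress.
class ClasspathProbe {

    /** Returns the jar or directory a class was loaded from, or a marker string if unknown. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            // JDK bootstrap classes typically have no code source.
            return "unknown (no code source)";
        }
        return source.getLocation().toString();
    }

    /** Returns true if the class exposes a public method with this name and parameter types. */
    static boolean hasPublicMethod(Class<?> type, String name, Class<?>... parameterTypes) {
        try {
            type.getMethod(name, parameterTypes);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String className = args.length > 0
                ? args[0]
                : "org.apache.commons.compress.archivers.ArchiveStreamFactory";
        try {
            Class<?> type = Class.forName(className);
            System.out.println("Loaded from: " + locationOf(type));
            System.out.println("Has detect(InputStream): "
                    + hasPublicMethod(type, "detect", InputStream.class));
        } catch (ClassNotFoundException e) {
            System.out.println("Class not found on this classpath: " + className);
        }
    }
}
```

If the reported location is a jar shipped with the runtime rather than the one packaged with the application, or if the detect check prints false, an older commons-compress is shadowing the expected one.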

Use the --driver-class-path argument in your spark-shell or spark-submit command to point to the right version of the commons-compress library.

spark-submit \
     --driver-class-path ~/.m2/repository/org/apache/commons/commons-compress/1.18/commons-compress-1.18.jar \
     --class {you.main.class} \
....

See my detailed answer here.
