Apache Tika ArchiveStreamFactory.detect error
Problem description
I'm using Java with Apache Tika 1.18 to convert some files to TXT. When I try to use AutoDetectParser(), I get the following error:
[ERROR ] Error occurred during error handling, give up! org.apache.commons.compress.archivers.ArchiveStreamFactory.detect(Ljava/io/InputStream;)Ljava/lang/String;
[ERROR ] SRVE0777E: Exception thrown by application class 'org.apache.cxf.service.invoker.AbstractInvoker.createFault:162'
org.apache.cxf.interceptor.Fault: org.apache.commons.compress.archivers.ArchiveStreamFactory.detect(Ljava/io/InputStream;)Ljava/lang/String;
    at org.apache.cxf.service.invoker.AbstractInvoker.createFault(AbstractInvoker.java:162)
    at [internal classes]
Caused by: java.lang.NoSuchMethodError: org.apache.commons.compress.archivers.ArchiveStreamFactory.detect(Ljava/io/InputStream;)Ljava/lang/String;
I dug around on the internet and found that this error is related to a wrong version of commons-compress: the static method ArchiveStreamFactory.detect(InputStream) does not appear to exist in commons-compress versions before 1.14. In my case the version is 1.16.1.
After building the project, I checked the libraries packaged inside it, and only the correct version is there.
I'm using IBM Liberty 18.0 ... and now I'm really lost about how to solve this problem.
When I use a specific parser, such as PDFParser(), everything works fine!
Any ideas?
Thanks.
Recommended answer
Source of the issue: Spark 2.x distributions include old versions of commons-compress, while the Tika library depends on version 1.18 of the commons-compress library.
Use the --driver-class-path argument in your spark-shell or spark-submit command to point to the right version of the commons-compress library.
spark-submit \
  --driver-class-path ~/.m2/repository/org/apache/commons/commons-compress/1.18/commons-compress-1.18.jar \
  --class {you.main.class} \
  ....
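To confirm which commons-compress jar actually wins at runtime, it can help to ask the JVM where it loaded ArchiveStreamFactory from and whether that copy has the missing method. The following is only a minimal sketch using standard reflection; the class name ClasspathCheck and its helper methods are hypothetical and are not part of Tika, commons-compress, or Spark.

```java
import java.io.InputStream;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of any library mentioned above.
class ClasspathCheck {

    /** Returns the jar or directory the class was loaded from, or a note if none is exposed. */
    static String locationOf(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        // Classes that ship with the JDK usually have no code source.
        return source == null
                ? "(no code source, e.g. a JDK class)"
                : String.valueOf(source.getLocation());
    }

    /** Returns true if the class exposes a public static method with this name and parameter types. */
    static boolean hasStaticMethod(Class<?> cls, String name, Class<?>... paramTypes) {
        try {
            Method m = cls.getMethod(name, paramTypes);
            return Modifier.isStatic(m.getModifiers());
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the NoSuchMethodError.
        String className = args.length > 0
                ? args[0]
                : "org.apache.commons.compress.archivers.ArchiveStreamFactory";
        try {
            Class<?> cls = Class.forName(className);
            System.out.println("Loaded from: " + locationOf(cls));
            System.out.println("Has static detect(InputStream): "
                    + hasStaticMethod(cls, "detect", InputStream.class));
        } catch (ClassNotFoundException e) {
            System.out.println(className + " is not on the classpath");
        }
    }
}
```

Class.forName resolves against the class loader of the calling code, so the check is only meaningful when it runs inside the same application (for example, called from the Spark driver or from the deployed web application) rather than as a separate standalone program. If the printed location is not the jar you packaged, an older copy provided by the runtime is shadowing it.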