Stanford NLP - OpenIE out of memory when processing list of files


Problem description

I'm trying to extract information from several files using the OpenIE tool from Stanford CoreNLP. It gives an out-of-memory error when several files are passed as input, instead of just one.

All files have been queued; awaiting termination...
java.lang.OutOfMemoryError: GC overhead limit exceeded
at edu.stanford.nlp.graph.DirectedMultiGraph.outgoingEdgeIterator(DirectedMultiGraph.java:508)
at edu.stanford.nlp.semgraph.SemanticGraph.outgoingEdgeIterator(SemanticGraph.java:165)
at edu.stanford.nlp.semgraph.semgrex.GraphRelation$GOVERNER$1.advance(GraphRelation.java:267)
at edu.stanford.nlp.semgraph.semgrex.GraphRelation$SearchNodeIterator.initialize(GraphRelation.java:1102)
at edu.stanford.nlp.semgraph.semgrex.GraphRelation$SearchNodeIterator.<init>(GraphRelation.java:1083)
at edu.stanford.nlp.semgraph.semgrex.GraphRelation$GOVERNER$1.<init>(GraphRelation.java:257)
at edu.stanford.nlp.semgraph.semgrex.GraphRelation$GOVERNER.searchNodeIterator(GraphRelation.java:257)
at edu.stanford.nlp.semgraph.semgrex.NodePattern$NodeMatcher.resetChildIter(NodePattern.java:320)
at edu.stanford.nlp.semgraph.semgrex.CoordinationPattern$CoordinationMatcher.matches(CoordinationPattern.java:211)
at edu.stanford.nlp.semgraph.semgrex.NodePattern$NodeMatcher.matchChild(NodePattern.java:514)
at edu.stanford.nlp.semgraph.semgrex.NodePattern$NodeMatcher.matches(NodePattern.java:542)
at edu.stanford.nlp.naturalli.RelationTripleSegmenter.segmentVerb(RelationTripleSegmenter.java:541)
at edu.stanford.nlp.naturalli.RelationTripleSegmenter.segment(RelationTripleSegmenter.java:850)
at edu.stanford.nlp.naturalli.OpenIE.relationInFragment(OpenIE.java:354)
at edu.stanford.nlp.naturalli.OpenIE.lambda$relationsInFragments$2(OpenIE.java:366)
at edu.stanford.nlp.naturalli.OpenIE$$Lambda$76/1438896944.apply(Unknown Source)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1540)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
at edu.stanford.nlp.naturalli.OpenIE.relationsInFragments(OpenIE.java:366)
at edu.stanford.nlp.naturalli.OpenIE.annotateSentence(OpenIE.java:486)
at edu.stanford.nlp.naturalli.OpenIE.lambda$annotate$3(OpenIE.java:554)
at edu.stanford.nlp.naturalli.OpenIE$$Lambda$25/606198361.accept(Unknown Source)
at java.util.ArrayList.forEach(ArrayList.java:1249)
at edu.stanford.nlp.naturalli.OpenIE.annotate(OpenIE.java:554)
at edu.stanford.nlp.pipeline.AnnotationPipeline.annotate(AnnotationPipeline.java:71)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.annotate(StanfordCoreNLP.java:499)
at edu.stanford.nlp.naturalli.OpenIE.processDocument(OpenIE.java:630)
DONE processing files. 1 exceptions encountered.

I pass the files as input using this call:

java -mx3g -cp stanford-corenlp-3.6.0.jar:stanford-corenlp-3.6.0-models.jar:CoreNLP-to-HTML.xsl:slf4j-api.jar:slf4j-simple.jar edu.stanford.nlp.naturalli.OpenIE file1 file2 file3 etc.

I tried increasing the memory with -mx3g and other variants, and although the number of files processed increases, it's not by much (from 5 to 7, for example). Each file is processed correctly on its own, so I'm ruling out a single file with very long sentences or too many lines as the cause.

Is there an option I'm not considering, some OpenIE or Java flag, that I can use to force a dump to an output, a cleanup, or garbage collection between each file that is processed?

Thanks in advance.

Recommended answer

From the comments above: I suspect this is an issue with too much parallelism and too little memory. OpenIE is a bit memory-hungry, especially with long sentences, so running many files in parallel can take up a fair amount of memory.

An easy fix is to force the program to run single-threaded by setting the -threads 1 flag. If possible, increasing memory should help as well.
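For example, the invocation from the question could then look something like this (the 6g heap below is only an illustration; use whatever your machine allows):

java -mx6g -cp stanford-corenlp-3.6.0.jar:stanford-corenlp-3.6.0-models.jar:CoreNLP-to-HTML.xsl:slf4j-api.jar:slf4j-simple.jar edu.stanford.nlp.naturalli.OpenIE -threads 1 file1 file2 file3 etc.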

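If you drive the pipeline from Java instead of the command line, the same idea can be applied by building the pipeline once and annotating the files one at a time, so each document's annotations can be garbage-collected before the next file is read. A minimal sketch, assuming the standard openie annotator chain and the CoreNLP 3.6 API (the class name and output format are just placeholders):

import edu.stanford.nlp.ie.util.RelationTriple;
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.naturalli.NaturalLogicAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.util.CoreMap;

import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Collection;
import java.util.Properties;

public class OpenIEPerFile {
  public static void main(String[] args) throws Exception {
    // Build the pipeline once; openie needs pos, lemma, depparse and natlog.
    Properties props = new Properties();
    props.setProperty("annotators", "tokenize,ssplit,pos,lemma,depparse,natlog,openie");
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

    // Annotate one file at a time so each document can be reclaimed by the GC
    // before the next one is read.
    for (String path : args) {
      String text = new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8);
      Annotation doc = new Annotation(text);
      pipeline.annotate(doc);
      for (CoreMap sentence : doc.get(CoreAnnotations.SentencesAnnotation.class)) {
        Collection<RelationTriple> triples =
            sentence.get(NaturalLogicAnnotations.RelationTriplesAnnotation.class);
        for (RelationTriple triple : triples) {
          System.out.println(path + "\t" + triple.subjectGloss() + "\t"
              + triple.relationGloss() + "\t" + triple.objectGloss());
        }
      }
      // doc goes out of scope here, so its memory is eligible for collection.
    }
  }
}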
