What is the default number of threads in stanford-corenlp?
Question
What is the default number of threads in stanford-corenlp, specifically for the named entity extractor and the information extractor? Also, I would like both to use a single thread for debugging purposes. How do I set this?
Thanks!
Recommended answer
The default is 1 thread.
There are two ways to run Stanford CoreNLP in multi-threaded mode:
1.) each thread handles a separate document
2.) each thread handles a separate sentence
Suppose you have 4 cores.
If you want each thread to handle a separate document, use the -threads 4 option (assuming you want to use 4 threads).
So you might run this command:
java -Xmx14g edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,ner,depparse,coref,kbp -threads 4 -fileList sample-files.txt -outputFormat text
Several annotators can process sentences in parallel. Here is an example of setting the named entity annotator to use multiple threads:
java -Xmx14g edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,ner,depparse,coref,kbp -ner.nthreads 4 -fileList sample-filelist-16.txt -outputFormat text
The following annotators can work on multiple sentences at the same time:
name        example configuration
depparse    -depparse.nthreads 4
ner         -ner.nthreads 4
parse       -parse.nthreads 4
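When the pipeline is configured from Java code rather than from the command line, the same settings are usually supplied as a java.util.Properties object. The following is a minimal sketch, assuming that each property key is the command-line flag without its leading dash (for example "ner.nthreads"). The class name SentenceThreadsConfig and the method build are hypothetical, and the call that would construct the pipeline is left out so the sketch compiles without CoreNLP on the classpath.

```java
import java.util.Properties;

public class SentenceThreadsConfig {
    // Annotators that the answer lists as able to process sentences in parallel.
    static final String[] PARALLEL_ANNOTATORS = {"depparse", "ner", "parse"};

    // Builds properties that set "<annotator>.nthreads" for each listed annotator.
    // Assumption: property keys mirror the command-line flags minus the leading dash.
    static Properties build(String annotators, int nthreads) {
        Properties props = new Properties();
        props.setProperty("annotators", annotators);
        for (String name : PARALLEL_ANNOTATORS) {
            props.setProperty(name + ".nthreads", Integer.toString(nthreads));
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = build("tokenize,ssplit,pos,lemma,ner,depparse", 4);
        // With CoreNLP available, props would be handed to the pipeline constructor.
        System.out.println("ner.nthreads = " + props.getProperty("ner.nthreads"));
    }
}
```

Setting a key for an annotator that is not in the annotators list (here parse) is assumed to be harmless, since only configured annotators should read their own settings.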
Note that while the ner annotator can run in multi-threaded mode, it uses several sub-annotators that cannot. So you are really only getting the statistical model run in parallel; the pattern-matching rule modules do not operate in multi-threaded mode.
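For the single-threaded debugging run asked about in the question, the default of 1 thread should already be enough, so explicit flags are redundant. Spelling them out can still make the run self-documenting. Below is a small sketch that assembles such an argument list from the flags shown above; the class name DebugArgs and the method build are hypothetical helpers, not part of CoreNLP.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class DebugArgs {
    // Annotators that accept a per-sentence "-<name>.nthreads" flag, per the answer.
    static final Set<String> SENTENCE_PARALLEL = Set.of("depparse", "ner", "parse");

    // Returns CoreNLP command-line arguments that pin both the document-level
    // thread count and every applicable per-annotator thread count to `threads`.
    static List<String> build(String annotators, int threads) {
        List<String> args = new ArrayList<>();
        args.add("-annotators");
        args.add(annotators);
        args.add("-threads");
        args.add(Integer.toString(threads));
        for (String raw : annotators.split(",")) {
            String name = raw.trim();
            if (SENTENCE_PARALLEL.contains(name)) {
                args.add("-" + name + ".nthreads");
                args.add(Integer.toString(threads));
            }
        }
        return args;
    }

    public static void main(String[] args) {
        // Prints the flags to append after
        // "java -Xmx14g edu.stanford.nlp.pipeline.StanfordCoreNLP".
        System.out.println(String.join(" ",
                build("tokenize,ssplit,pos,lemma,ner,depparse,coref,kbp", 1)));
    }
}
```

The printed flags can be appended to the java command shown earlier, together with the -fileList and -outputFormat options.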