gensim.LDAMulticore throwing exception
Question
I am running LDAMulticore from the Python gensim library, and the script cannot seem to create more than one thread. Here is the error:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 97, in worker
    initializer(*initargs)
  File "/usr/lib64/python2.7/site-packages/gensim/models/ldamulticore.py", line 333, in worker_e_step
    worker_lda.do_estep(chunk) # TODO: auto-tune alpha?
  File "/usr/lib64/python2.7/site-packages/gensim/models/ldamodel.py", line 725, in do_estep
    gamma, sstats = self.inference(chunk, collect_sstats=True)
  File "/usr/lib64/python2.7/site-packages/gensim/models/ldamodel.py", line 655, in inference
    ids = [int(idx) for idx, _ in doc]
TypeError: 'int' object is not iterable
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
    self.run()
  File "/usr/lib64/python2.7/threading.py", line 765, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 325, in _handle_workers
    pool._maintain_pool()
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 229, in _maintain_pool
    self._repopulate_pool()
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 222, in _repopulate_pool
    w.start()
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 130, in start
    self._popen = Popen(self)
  File "/usr/lib64/python2.7/multiprocessing/forking.py", line 121, in __init__
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
I'm creating my LDA model like this:
from gensim.models import LdaMulticore
ldamodel = LdaMulticore(corpus, num_topics=50, id2word=dictionary, workers=3)
I have actually asked another question about this script, so the full script can be found here:
If it's relevant, I'm running this on a CentOS server. Let me know if I should include any other information.
Thanks for your help!
Recommended answer
OSError: [Errno 12] Cannot allocate memory
sounds like you are running out of RAM.
Check your available free memory and swap.
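On a Linux machine such as the CentOS server mentioned in the question, running free -m in a shell is the quickest check. If you would rather check from inside the script, a minimal sketch is below. It assumes /proc/meminfo is readable, and parse_meminfo and read_meminfo are hypothetical helper names, not part of gensim or the standard library.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field_name: integer_value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are
    plain counts with no unit.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the kernel's memory summary (Linux only)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    mem = read_meminfo()
    # MemAvailable is missing on older kernels, so print only what exists.
    for field in ("MemTotal", "MemFree", "MemAvailable", "SwapTotal", "SwapFree"):
        if field in mem:
            print("%-13s %8.1f MiB" % (field, mem[field] / 1024.0))
```

If MemFree and SwapFree are both small shortly before the crash, that would be consistent with os.fork() failing with Errno 12.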
You can try reducing the number of worker processes with the workers parameter, or the number of documents used in each training chunk with the chunksize parameter.
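As a rough sketch of that suggestion, the call from the question could be adjusted as follows. The values workers=1 and chunksize=500 are illustrative assumptions, not tuned recommendations (gensim's default chunksize is 2000), and capped_workers and train_lda are hypothetical helper names introduced only for this example.

```python
import multiprocessing


def capped_workers(requested, cpu_count=None):
    """Hypothetical helper: never ask for more than (cores - 1) workers,
    and always keep at least one."""
    if cpu_count is None:
        cpu_count = multiprocessing.cpu_count()
    return max(1, min(requested, cpu_count - 1))


def train_lda(corpus, dictionary, workers=1, chunksize=500):
    """Train the same 50-topic model as in the question, but with fewer
    worker processes and smaller training chunks to lower peak memory use."""
    # Imported here so this module still loads where gensim is not installed.
    from gensim.models import LdaMulticore

    return LdaMulticore(
        corpus,
        num_topics=50,
        id2word=dictionary,
        workers=capped_workers(workers),
        chunksize=chunksize,  # documents handed to a worker at a time
    )
```

Calling train_lda(corpus, dictionary) with the corpus and dictionary from the original script would start with a single worker; if that runs without the memory error, the values can be raised step by step while watching free memory.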