gensim.LDAMulticore throwing exception:


Question

I am running LDAMulticore from the Python gensim library, and the script cannot seem to create more than one thread. Here is the error:

Traceback (most recent call last):
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 97, in worker
    initializer(*initargs)
  File "/usr/lib64/python2.7/site-packages/gensim/models/ldamulticore.py", line 333, in worker_e_step
    worker_lda.do_estep(chunk)  # TODO: auto-tune alpha?
  File "/usr/lib64/python2.7/site-packages/gensim/models/ldamodel.py", line 725, in do_estep
    gamma, sstats = self.inference(chunk, collect_sstats=True)
  File "/usr/lib64/python2.7/site-packages/gensim/models/ldamodel.py", line 655, in inference
    ids = [int(idx) for idx, _ in doc]
TypeError: 'int' object is not iterable
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
    self.run()
  File "/usr/lib64/python2.7/threading.py", line 765, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 325, in _handle_workers
    pool._maintain_pool()
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 229, in _maintain_pool
    self._repopulate_pool()
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 222, in _repopulate_pool
    w.start()
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 130, in start
    self._popen = Popen(self)
  File "/usr/lib64/python2.7/multiprocessing/forking.py", line 121, in __init__
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory

I'm creating my LDA model like this:

from gensim.models import LdaMulticore
ldamodel = LdaMulticore(corpus, num_topics=50, id2word=dictionary, workers=3)

I have actually asked another question about this script, so the full script can be found here:

Gensim LDA Multicore Python script runs too slowly

If it's relevant, I'm running this on a CentOS server. Let me know if I should include any other information.

Thanks for your help!

Recommended answer

OSError: [Errno 12] Cannot allocate memory sounds like you are running out of RAM. In the traceback it is raised by os.fork(), at the point where the multiprocessing pool tries to start another worker process.

Check your available free memory and swap (for example, with free -m).
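
If you would rather check this from Python, the following is a minimal sketch that reads /proc/meminfo, which should exist on a Linux server such as CentOS. The helper names parse_meminfo and memory_summary are made up for this illustration and are not part of gensim or the standard library.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: number}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def memory_summary(path="/proc/meminfo"):
    """Return available RAM and free swap in MiB (Linux only)."""
    with open(path) as handle:
        info = parse_meminfo(handle.read())
    # MemAvailable is missing on older kernels; fall back to MemFree.
    available_kb = info.get("MemAvailable", info.get("MemFree", 0))
    return {
        "available_mib": available_kb // 1024,
        "swap_free_mib": info.get("SwapFree", 0) // 1024,
    }
```

Calling memory_summary() just before building the model gives a rough idea of how much headroom is left for the worker processes.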

You can try to reduce the number of worker processes with the workers parameter, or the number of documents used in each training chunk with the chunksize parameter.
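
As a rough sketch of that suggestion, the code below builds the model with fewer workers and a smaller chunksize. The helper conservative_lda_kwargs and the specific values (1 worker, 500 documents per chunk) are illustrative assumptions, not recommendations from gensim; workers and chunksize themselves are real LdaMulticore parameters, and corpus and dictionary are the objects from the question.

```python
import multiprocessing


def conservative_lda_kwargs(requested_workers, chunksize=2000, cpu_count=None):
    """Return workers/chunksize keyword arguments for LdaMulticore.

    Hypothetical helper: caps the worker count at cpu_count - 1
    (gensim's own default when workers is None) and never goes below 1.
    """
    if cpu_count is None:
        cpu_count = multiprocessing.cpu_count()
    workers = max(1, min(requested_workers, cpu_count - 1))
    return {"workers": workers, "chunksize": chunksize}


def train_lda(corpus, dictionary, num_topics=50):
    """Train with fewer workers and smaller chunks than in the question."""
    # Imported here so the helper above can be used without gensim installed.
    from gensim.models import LdaMulticore

    # Assumed starting point: 1 worker, 500 documents per chunk
    # (gensim's default chunksize is 2000).
    kwargs = conservative_lda_kwargs(requested_workers=1, chunksize=500)
    return LdaMulticore(corpus, num_topics=num_topics, id2word=dictionary, **kwargs)
```

If this runs without the fork error, you can raise workers or chunksize step by step until memory becomes the limit again.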

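Separately, the first traceback ends in TypeError: 'int' object is not iterable at the line ids = [int(idx) for idx, _ in doc]. That usually means a document is not a list of (token_id, count) pairs, for instance when a single bag-of-words document is passed where a list of documents is expected. Whether that is what happened here is an assumption, since the full script is not shown. A quick sanity check, using the made-up helper names is_bow_document and first_bad_document, might look like this:

```python
def is_bow_document(doc):
    """Return True if doc looks like [(token_id, count), ...]."""
    if not isinstance(doc, (list, tuple)):
        return False
    for item in doc:
        if not isinstance(item, (list, tuple)) or len(item) != 2:
            return False
        token_id, count = item
        # Assumes plain Python numbers, as produced by Dictionary.doc2bow.
        if not isinstance(token_id, int) or not isinstance(count, (int, float)):
            return False
    return True


def first_bad_document(corpus, limit=100):
    """Return the index of the first malformed document among the
    first `limit` documents, or None if they all look fine."""
    for index, doc in enumerate(corpus):
        if index >= limit:
            break
        if not is_bow_document(doc):
            return index
    return None
```

If first_bad_document(corpus) returns 0 for a corpus that is itself a flat list of (token_id, count) pairs, wrapping it in a list (or building it as one doc2bow result per text) would be the first thing to try.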
