Python node2vec (Gensim Word2Vec) "Process finished with exit code 134 (interrupted by signal 6: SIGABRT)"

Problem description

I am working on node2vec in Python, which uses Gensim's Word2Vec internally.

When I use a small dataset, the code works well. But as soon as I run the same code on a large dataset, it crashes.

Error: Process finished with exit code 134 (interrupted by signal 6: SIGABRT).

The line that triggers the error is:

model = Word2Vec(walks, size=args.dimensions,
                 window=args.window_size, min_count=0, sg=1,
                 workers=args.workers, iter=args.iter)

I am using PyCharm and Python 3.5.

Any idea what is happening? I could not find any post that solves my problem.

Solution

You are almost certainly running out of memory, which causes your memory-hungry process to be aborted with SIGABRT.

In general, solving this means looking at how your code uses memory, both leading up to and at the moment of failure. (The bulk of the excess memory use may, however, occur arbitrarily earlier, with only the last small and otherwise reasonable allocation triggering the error.)

Specifically for Python and the node2vec tool, which uses the Gensim Word2Vec class, some things to try include:

Watch a readout of the Python process size during your attempts.

Enable Python logging to at least the INFO level to see more about what's happening leading up to the crash.
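The two suggestions above can be combined in a few lines at the top of the script. This is a minimal sketch, not part of node2vec itself: the logger name `memory_readout` and the helpers `peak_rss_mb` and `log_memory` are hypothetical, and the `resource` module is only available on Unix-like systems. An external monitor such as `top` works just as well for watching process size.

```python
import logging
import resource
import sys

# Gensim reports progress through the standard logging module, so INFO-level
# output shows vocabulary scanning and model allocation before any crash.
logging.basicConfig(
    format="%(asctime)s : %(levelname)s : %(message)s",
    level=logging.INFO,
)
# basicConfig does nothing if handlers already exist, so set the level too.
logging.getLogger().setLevel(logging.INFO)

log = logging.getLogger("memory_readout")  # hypothetical logger name


def peak_rss_mb():
    """Return this process's peak resident set size in MiB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def log_memory(stage):
    """Log peak memory so far, tagged with a caller-supplied stage label."""
    log.info("peak RSS after %s: %.1f MiB", stage, peak_rss_mb())


log_memory("startup")
```

Calling `log_memory("building walks")` just before the `Word2Vec(...)` line shows how much memory is already in use before the model is allocated.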

Further, be sure to:

  1. Optimize your walks iterable so that it does not build a large in-memory list. (Gensim's Word2Vec can work on a corpus of any length, including those far larger than RAM, as long as (a) the corpus is streamed from disk via a re-iterable Python sequence; and (b) the model's number of unique word/node tokens can be modeled within RAM.)
  2. Ensure the number of unique words (tokens/nodes) in your model doesn't require a model larger than RAM allows. Logging output, once enabled, will show the raw sizes involved just before the main model allocation (which is likely what fails) happens. (If it fails, either: (a) use a system with more RAM to accommodate your full set of nodes; or (b) use a higher min_count value to discard more of the less-important nodes.)
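Both points can be sketched in plain Python. The names `WalkCorpus`, `write_walks`, and `estimate_model_bytes` below are hypothetical helpers, not part of node2vec or Gensim. The size estimate assumes two float32 arrays of shape (nodes x dimensions) plus about 500 bytes of bookkeeping per token; treat both figures as rough assumptions, and trust the sizes that Gensim logs over this arithmetic.

```python
import os
import tempfile


class WalkCorpus:
    """Re-iterable corpus: each line of the file is one space-separated walk."""

    def __init__(self, path):
        self.path = path

    def __iter__(self):
        # Reopening the file on every pass lets the consumer iterate once to
        # build the vocabulary and again for each training epoch.
        with open(self.path, encoding="utf-8") as handle:
            for line in handle:
                tokens = line.split()
                if tokens:
                    yield tokens


def write_walks(walks, path):
    """Write walks to disk one per line instead of keeping them in a list."""
    with open(path, "w", encoding="utf-8") as handle:
        for walk in walks:
            handle.write(" ".join(str(node) for node in walk) + "\n")


def estimate_model_bytes(num_nodes, dimensions, bytes_per_float=4,
                         vocab_overhead=500):
    """Rough size: two float arrays plus assumed per-token bookkeeping."""
    arrays = 2 * num_nodes * dimensions * bytes_per_float
    return arrays + num_nodes * vocab_overhead


# Demo with a generator standing in for node2vec's random walks.
demo_walks = ([i, i + 1, i + 2] for i in range(3))
walks_path = os.path.join(tempfile.mkdtemp(), "walks.txt")
write_walks(demo_walks, walks_path)

corpus = WalkCorpus(walks_path)
first_pass = list(corpus)
second_pass = list(corpus)  # same content: the corpus can be re-iterated

# Under these assumptions, 10 million nodes at 128 dimensions needs ~15 GB.
print(estimate_model_bytes(10 ** 7, 128) / 1e9)
```

An instance such as `corpus` would then be passed to `Word2Vec` in place of the in-memory `walks` list, with the other arguments unchanged. Note that the node tokens read back from disk are strings, and that a plain generator is not enough here because it can only be consumed once.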

If your "Process finished with exit code 134 (interrupted by signal 6: SIGABRT)" error does not involve Python, Gensim, and Word2Vec, you should instead:

  1. Search for occurrences of that error combined with more specific details of your triggering situations - the tools/libraries and lines-of-code that create your error.
  2. Look into general memory-profiling tools for your situation, to identify where (even long before the final error) your code might be consuming almost all of the available RAM.
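For Python code, one such general tool is the standard library's `tracemalloc` module. The sketch below is only illustrative: the `walks` list is a stand-in workload and `top_allocations` is a hypothetical helper. `tracemalloc` only sees allocations made through Python's allocator, so memory allocated directly by native extensions may not appear in its report.

```python
import tracemalloc


def top_allocations(snapshot, limit=5):
    """Return the source lines holding the most traced memory."""
    return snapshot.statistics("lineno")[:limit]


tracemalloc.start()

# Stand-in for a memory-hungry step, e.g. materializing all walks as a list.
walks = [[str(node) for node in range(100)] for _ in range(10000)]

snapshot = tracemalloc.take_snapshot()
current_bytes, peak_bytes = tracemalloc.get_traced_memory()
tracemalloc.stop()

# Each entry names a file and line number together with the bytes it holds.
for stat in top_allocations(snapshot):
    print(stat)
print("peak traced memory: %.1f MiB" % (peak_bytes / 2 ** 20))
```

Taking snapshots at several points in a long-running script and comparing them can show which step accounts for most of the growth before the final failure.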
