How to get past the 1GB memory limit of 64-bit LuaJIT on Linux?


Problem description


The overview is I am prototyping code to understand my problem space, and I am running into 'PANIC: unprotected error in call to Lua API (not enough memory)' errors. I am looking for ways to get around this limit.


The environment bottom line is Torch, a scientific computing framework that runs on LuaJIT, a just-in-time implementation of Lua. I need Torch because I eventually want to hammer on my problem with neural nets on a GPU, but to get there I need a good representation of the problem to feed to the nets. I am (stuck) on CentOS Linux, and I suspect that trying to rebuild all the pieces from source in 32-bit mode (this is reported to extend the LuaJIT memory limit to 4GB) will be a nightmare, if it works at all for all of the libraries.


The problem space itself is probably not particularly relevant, but in overview I have datafiles of points that I calculate distances between and then bin (i.e. make histograms of) these distances to try and work out the most useful ranges. Conveniently I can create complicated Lua tables with various sets of bins and torch.save() the mess of counts out, then pick it up later and inspect with different normalisations etc. -- so after one month of playing I am finding this to be really easy and powerful.


I can make it work looking at up to 3 distances with 15 bins each (15x15x15 plus overhead), but only by adding explicit collectgarbage() calls and using fork()/wait() for each datafile, so that the outer loop keeps running if one datafile (of several thousand) still blows the memory limit and crashes the child. This gets extra painful because each successful child process now has to read, modify, and write the current set of bin counts -- and my largest files for this are currently 36MB. I would like to go larger (more bins), and would really prefer to just hold the counts in the 15 gigs of RAM I can't seem to access.
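For concreteness, the binning step looks roughly like this (a minimal sketch; binsize, the distance triples d1/d2/d3, and the output filename are placeholders standing in for the real code):

```lua
local NBINS = 15

-- 15x15x15 nested table of counts, initialised to zero
local counts = {}
for i = 1, NBINS do
    counts[i] = {}
    for j = 1, NBINS do
        counts[i][j] = {}
        for k = 1, NBINS do counts[i][j][k] = 0 end
    end
end

-- clamp a distance into one of NBINS buckets (binsize is a placeholder)
local function bin(d)
    return math.min(math.floor(d / binsize) + 1, NBINS)
end

-- for each triple of distances (d1, d2, d3) between points:
local i, j, k = bin(d1), bin(d2), bin(d3)
counts[i][j][k] = counts[i][j][k] + 1

-- save the mess of counts to pick up and inspect later
torch.save("counts.t7", counts)
```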


So, here are some paths I have thought of; please do comment if you can confirm/deny that any of them will/won't get me outside of the 1gb boundary, or will just improve my efficiency within it. Please do comment if you can suggest another approach that I have not thought of.


  • Am I missing a way to fire off a Lua process that I can read an arbitrary table back in from? No doubt I can break my problem into smaller pieces, but parsing a return table from stdio (as from a system call to another Lua script) seems error-prone, and writing/reading small intermediate files would mean a lot of disk I/O.


  • Am I missing a stash-and-access-table-in-high-memory module? This seems like what I really want, but I have not found one yet.


  • Can FFI C data structures be put outside the 1GB? It doesn't seem like that would be the case, but then I certainly lack a full understanding of what causes the limit in the first place. I suspect this would just give me an efficiency improvement over generic Lua tables for the few pieces that have moved beyond prototyping (unless I do a bunch of coding for each change).


  • Surely I can get out by writing an extension in C (Torch appears to support nets that should go outside the limit), but my brief investigation there turns up references to 'lightuserdata' pointers -- does this mean that a more normal extension won't get outside 1GB either? This also seems to carry a heavy development cost for what should be a prototyping exercise.


I know C well so going the FFI or extension route doesn't bother me - but I know from experience that encapsulating algorithms in this way can be both really elegant and really painful with two places to hide bugs. Working through data structures containing tables within tables on the stack doesn't seem great either. Before I make this effort I would like to be certain that the end result really will solve my problem.

Thanks for reading this long question.

Recommended answer


Only objects allocated by LuaJIT itself are limited to the first 2GB of memory. This means that tables, strings, full userdata (i.e. not lightuserdata), and FFI objects allocated with ffi.new count towards the limit, but memory allocated with malloc, mmap, etc. is not subject to it (regardless of whether the allocation is made from a C module or through the FFI).


An example of allocating a structure with malloc:

local ffi = require("ffi")

ffi.cdef[[
    typedef struct { int bar; } foo;
    void* malloc(size_t);
    void free(void*);
]]

local foo_t = ffi.typeof("foo")
local foo_p = ffi.typeof("foo*")

function alloc_foo()
    -- malloc'd memory lives outside the LuaJIT GC heap,
    -- so it does not count against the limit
    local obj = ffi.C.malloc(ffi.sizeof(foo_t))
    return ffi.cast(foo_p, obj)
end

function free_foo(obj)
    ffi.C.free(obj)
end
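A usage sketch (not part of the original answer): ffi.gc can tie the malloc'd object to a finalizer, so the C memory is released when the last Lua reference to the pointer is collected:

```lua
local p = ffi.gc(alloc_foo(), free_foo)  -- free_foo runs when p is collected
p.bar = 42
print(p.bar)  --> 42

-- to free early instead, cancel the finalizer first so it doesn't run twice:
-- free_foo(ffi.gc(p, nil))
```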


The new GC planned for LuaJIT 3.0 (IIRC) will not have this limit, but I haven't heard any news on its development recently.

Source: http://lua-users.org/lists/lua-l/2012-04/msg00729.html

