Memory allocations in Julia

Question

I'm extremely dissatisfied after translating a program from Python to Julia:

  • for small/very small inputs, Python is faster
  • for medium inputs, Julia is faster (but not by that much)
  • for large inputs, Python is faster

I think the reason is that I don't understand how memory allocation works (autodidact here, no CS background). I would post my code here, but it is too long and too specific, and it would not benefit anybody but me. So I ran some experiments, and now I have some questions.

Consider this simple script.jl:

function main()
    @time begin
        a = [1,2,3]
    end
end
main()

When I run it I get:

$ julia script.jl
  0.000004 seconds (1 allocation: 96 bytes)

1. Why 96 bytes? When I set a = [] I get 64 bytes (why does an empty array weigh so much?). 96 bytes - 64 bytes = 32 bytes. But a is an Array{Int64,1}. 3 * 64 bits = 3 * 8 bytes = 24 bytes != 32 bytes.

2. Why do I still get 96 bytes when I set a = [1,2,3,4]?

3. Why do I get 937.500 KB when I run this:

function main()
    @time begin
        for _ in 1:10000
            a = [1,2,3]
        end
    end
end
main()

and not 960.000 KB?

4. Why is, for instance, filter() so inefficient? Take a look at this:

check(n::Int64) = n % 2 == 0

function main()
    @time begin
        for _ in 1:1000
            a = [1,2,3]
            b = []
            for x in a
                check(x) && push!(b,x)
            end
            a = b
        end
    end
end
main()
$ julia script.jl
  0.000177 seconds (3.00 k allocations: 203.125 KB)

Versus:

check(n::Int64) = n % 2 == 0

function main()
    @time begin
        for _ in 1:1000
            a = [1,2,3]
            a = filter(check,a)
        end
    end
end
main()

$ julia script.jl
  0.002029 seconds (3.43 k allocations: 225.339 KB)

And if I use an anonymous function (x -> x % 2 == 0) instead of check inside filter, I get:

$ julia script.jl
  0.004057 seconds (3.05 k allocations: 206.555 KB)

Why should I use a built-in function if it is slower and needs more memory?

Recommended answer

Quick answers:

1. Arrays keep track of their dimensionality and size, among other things, in a header. That header is why even an empty array takes up space.

2. Julia ensures its arrays are 16-byte aligned. The pattern becomes obvious if you look at the allocations for a few more examples:

julia> [@allocated(Array{Int64}(i)) for i=0:8]'
1x9 Array{Any,2}:
 64  80  80  96  96  112  112  128  128
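
The pattern in that output can be reproduced with a small back-of-the-envelope model. This is only a sketch that fits the Julia 0.4 numbers shown here, not a description of how the runtime actually lays out arrays: it assumes a 64-byte header (the size reported for an empty array in the question) and rounds the element data up to a multiple of 16 bytes. The names HEADER_BYTES, ALIGNMENT and predicted_bytes are made up for illustration, and other Julia versions may report different sizes.

```julia
# Hypothetical model of the Julia 0.4 figures above; not the runtime's real logic.
const HEADER_BYTES = 64   # assumption: size reported for an empty array in the question
const ALIGNMENT = 16      # alignment stated in the answer

# Bytes for an array of n Int64 values: header plus data rounded up to 16 bytes.
predicted_bytes(n) = HEADER_BYTES + cld(8n, ALIGNMENT) * ALIGNMENT

# Sizes for arrays of length 0 through 8, to compare with the output above.
predicted = [predicted_bytes(n) for n in 0:8]
println(predicted)
```

Under this model, three Int64 values need 24 bytes of data, which rounds up to 32, and four values need exactly 32, so both cases come out at 96 bytes.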

3. It's reporting in kilobytes. There are 1024 bytes in a kilobyte:

julia> 937.500 * 1024
960000.0
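
The same conversion can be worked forward from the loop in the question, assuming each of the 10000 iterations allocates the 96 bytes reported for a single [1,2,3] (the variable names below are illustrative only):

```julia
# 10000 iterations, each allocating one 96-byte array (figures from the question).
iterations = 10_000
bytes_per_array = 96

total_bytes = iterations * bytes_per_array   # total in bytes
total_kb = total_bytes / 1024                # @time divides by 1024, not 1000

println(total_bytes)
println(total_kb)
```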

4. Anonymous functions and passing functions to higher-order functions like filter are known performance gotchas in Julia 0.4, and have been fixed in the latest development version.

In general, getting more allocations than you expect is often a sign of type-instability. I highly recommend reading through the manual's Performance Tips page for more information about this.
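
One place where element types matter in the question's hand-written loop is b = [], which creates an array with element type Any rather than Int64. The sketch below is not a benchmark and is not part of the original answer. It only shows the difference in element types, reusing check from the question, with untyped, typed and filtered as illustrative names.

```julia
check(n::Int64) = n % 2 == 0

a = Int64[1, 2, 3]

untyped = []        # element type Any, as in the question's loop
typed = Int64[]     # element type Int64

for x in a
    if check(x)
        push!(untyped, x)
        push!(typed, x)
    end
end

# filter keeps the element type of its input array.
filtered = filter(check, a)

println(eltype(untyped), " ", eltype(typed), " ", eltype(filtered))
```

Both loops collect the same values, but only typed and filtered carry a concrete element type that later code can rely on.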
