Memory allocations in Julia

Problem description

I'm extremely dissatisfied after translating a program from Python to Julia:

  • for small/very small inputs, Python is faster
  • for medium inputs, Julia is faster (but not that much)
  • for big inputs, Python is faster

I think the reason is that I don't understand how memory allocation works (autodidact here, no CS background). I would post my code here but it is too long and too specific and it would not be beneficial for anybody but me. Therefore I made some experiments and now I have some questions.

Consider this simple script.jl:

function main()
    @time begin
        a = [1,2,3]
    end
end
main()

When I run it I get:

$ julia script.jl
  0.000004 seconds (1 allocation: 96 bytes)

1. Why 96 bytes? When I set a = [] I get 64 bytes (why does an empty array weigh so much?). 96 bytes - 64 bytes = 32 bytes. But a is an Array{Int64,1}. 3 * 64 bits = 3 * 8 bytes = 24 bytes != 32 bytes.

2. Why do I get 96 bytes even if I set a = [1,2,3,4]?

3. Why do I get 937.500 KB when I run this:

function main()
    @time begin
        for _ in 1:10000
            a = [1,2,3]
        end
    end
end
main()

and not 960.000 KB?

4. Why is, for instance, filter() so inefficient? Take a look at this:

check(n::Int64) = n % 2 == 0

function main()
    @time begin
        for _ in 1:1000
            a = [1,2,3]
            b = []
            for x in a
                check(x) && push!(b,x)
            end
            a = b
        end
    end
end
main()

$ julia script.jl
  0.000177 seconds (3.00 k allocations: 203.125 KB)

With filter instead:

check(n::Int64) = n % 2 == 0

function main()
    @time begin
        for _ in 1:1000
            a = [1,2,3]
            a = filter(check,a)
        end
    end
end
main()

$ julia script.jl
  0.002029 seconds (3.43 k allocations: 225.339 KB)

And if I use an anonymous function (x -> x % 2 == 0) instead of check inside filter, I get:

$ julia script.jl
  0.004057 seconds (3.05 k allocations: 206.555 KB)

Why should I use a built-in function if it is slower and needs more memory?

Solution

Quick answers:

1. Arrays keep track of their dimensionality and size, among other things, in a header.

2. Julia ensures its arrays are 16-byte aligned. The pattern becomes obvious if you look at the allocations for a few more examples:

julia> [@allocated(Array{Int64}(i)) for i=0:8]'
1x9 Array{Any,2}:
 64  80  80  96  96  112  112  128  128

3. It's reporting in kilobytes. There are 1024 bytes in a kilobyte:

julia> 937.500 * 1024
960000.0

4. Anonymous functions and passing functions to higher-order functions like filter are known performance gotchas in Julia 0.4, and have been fixed in the latest development version.
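
The numbers in answers 1 to 3 can be reproduced with plain arithmetic. The sketch below is only a model fitted to the sizes reported above, not Julia's actual allocator: it assumes a 64-byte header (the size reported for an empty array) and rounds the element data up to a multiple of 16 bytes. The names HEADER_BYTES, ALIGNMENT and modeled_bytes are made up for illustration, and other Julia versions may report different sizes.

```julia
# Hypothetical model of the allocation sizes reported above.
# Assumption: 64-byte header, element data padded to a multiple of 16 bytes.
const HEADER_BYTES = 64   # size reported for an empty array in the question
const ALIGNMENT = 16      # alignment mentioned in answer 2

function modeled_bytes(n; elsize = sizeof(Int64))
    # cld is ceiling division, so this rounds n * elsize up to the alignment.
    padded = ALIGNMENT * cld(n * elsize, ALIGNMENT)
    return HEADER_BYTES + padded
end

# Sizes for arrays of 0 to 8 Int64 elements.
sizes = [modeled_bytes(n) for n in 0:8]
println(sizes)

# 10,000 arrays of three Int64 values, expressed in units of 1024 bytes.
total_bytes = 10_000 * modeled_bytes(3)
println(total_bytes / 1024)
```

Under this model, three Int64 values need 24 bytes of data, which is padded to 32, so arrays of three and four elements both come out at 96 bytes. Ten thousand such arrays total 960000 bytes, which is 937.5 when divided by 1024.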

In general, getting more allocations than you expect is often a sign of type-instability. I highly recommend reading through the manual's Performance Tips page for more information about this.
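
One related point from the question's hand-written loop: `b = []` creates an array whose element type is Any, whereas `Int[]` creates one that holds only integers. The Performance Tips page also advises against containers with abstract element types. The following is a minimal sketch of that difference, not a full treatment of type instability; the variable names are illustrative, and `check` is loosened from the question's `Int64` signature to accept any integer.

```julia
# Predicate from the question, accepting any integer type.
check(n::Integer) = n % 2 == 0

untyped = []      # Vector{Any}: elements can be of any type
typed = Int[]     # Vector{Int}: elements are plain machine integers

for x in [1, 2, 3]
    if check(x)
        push!(untyped, x)
        push!(typed, x)
    end
end

# filter keeps the element type of the array it is given.
filtered = filter(check, [1, 2, 3])

println(eltype(untyped))    # Any
println(eltype(typed))      # Int64 on a 64-bit machine
println(eltype(filtered))   # same element type as the input array
```

Both loops collect the same values, but only the typed array and the result of filter have a concrete element type that the compiler can rely on.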
