Loop with array access is slow in Julia

Question

I compared a loop that accesses an array with one that does not, as shown below, and found a huge performance difference between the two: 1.463677 s vs 0.086808 s.

Could you explain why this happens and how to improve the version that accesses the array?

@inline dist2(p, q) = sqrt((p[1]-q[1])^2+(p[2]-q[2])^2)
function rand_gen()
    r2set = Array[]
    for i=1:10000
        r2_add = rand(2, 1)
        push!(r2set, r2_add)
    end
    return r2set
end

function test()
    N = 10000
    r2set = rand_gen()
    a = [1 1]
    b = [2 2]

    @time for i=1:N, j=1:N
        dist2(r2set[i], r2set[j])
    end

    @time for i=1:N, j=1:N
        dist2(a, b)
    end
end
test()

Recommended answer

Give r2set a concrete element type, like this (see also https://docs.julialang.org/en/latest/manual/performance-tips/#Avoid-containers-with-abstract-type-parameters-1):

@inline dist2(p, q) = sqrt((p[1]-q[1])^2+(p[2]-q[2])^2)
function rand_gen()
    r2set = Matrix{Float64}[]  # concrete element type instead of the abstract Array
    for i=1:10000
        r2_add = rand(2, 1)
        push!(r2set, r2_add)
    end
    return r2set
end

function test()
    N = 10000
    r2set = rand_gen()
    a = [1 1]
    b = [2 2]

    @time for i=1:N, j=1:N
        dist2(r2set[i], r2set[j])
    end

    @time for i=1:N, j=1:N
        dist2(a, b)
    end
end
test()

Now the timings are:

julia> test()
  0.347000 seconds
  0.147696 seconds

This is already better.
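
To see why the element type matters, it can help to ask Julia whether a container's element type is concrete. The following is a minimal sketch, assuming Julia 1.x; the variable names `slow` and `fast` are illustrative and not part of the original answer.

```julia
# `Array` without type parameters is abstract, so a Vector{Array} can hold
# arrays of any element type and dimension; the compiler cannot specialize on it.
slow = Array[]

# `Matrix{Float64}` is fully specified, so the element type is concrete.
fast = Matrix{Float64}[]

println(eltype(slow), " concrete? ", isconcretetype(eltype(slow)))
println(eltype(fast), " concrete? ", isconcretetype(eltype(fast)))
```

With an abstract element type, each `r2set[i]` has a type known only at run time, so every call to `dist2` goes through dynamic dispatch.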

Now, if you really want speed, use an immutable type, e.g. a Tuple instead of an array, like this:

@inline dist2(p, q) = sqrt((p[1]-q[1])^2+(p[2]-q[2])^2)
function rand_gen()
    r2set = Tuple{Float64,Float64}[]  # immutable elements instead of small arrays
    for i=1:10000
        r2_add = (rand(), rand())
        push!(r2set, r2_add)
    end
    return r2set
end

function test()
    N = 10000
    r2set = rand_gen()
    a = (1,1)
    b = (2,2)

    s = 0.0
    @time for i=1:N, j=1:N
        @inbounds s += dist2(r2set[i], r2set[j])
    end

    @time for i=1:N, j=1:N
        s += dist2(a, b)
    end
end
test()

And you will get comparable speed for both loops:

julia> test()
  0.038901 seconds
  0.039666 seconds

julia> test()
  0.041379 seconds
  0.039910 seconds

Note that I added the accumulation into s because without it Julia optimized the loop away, having noticed that it did no work.
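
Another way to keep the compiler from discarding the loop is to put it in a function that returns the accumulated value. This is only a sketch, assuming Julia 1.x; `total_dist` is a hypothetical helper name that does not appear in the original answer.

```julia
dist2(p, q) = sqrt((p[1] - q[1])^2 + (p[2] - q[2])^2)

# Sum of all pairwise distances; returning `s` keeps the work observable,
# so the loop cannot be eliminated as dead code.
function total_dist(pts)
    s = 0.0
    @inbounds for i in eachindex(pts), j in eachindex(pts)
        s += dist2(pts[i], pts[j])
    end
    return s
end

pts = [(rand(), rand()) for _ in 1:1000]
total_dist(pts)         # warm-up call so that compilation is not timed
@time total_dist(pts)
```

Timing a second call, as here, separates run time from the one-off compilation cost of the first call.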

The key is that if you store arrays in an array, the outer array holds pointers to the inner arrays, whereas with immutable types the data is stored directly in the outer array.
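
The difference in storage can be inspected with `isbitstype` and `sizeof`. This is a small sketch, assuming Julia 1.x; `pts` and `mats` are illustrative names, not taken from the original answer.

```julia
pts  = [(rand(), rand()) for _ in 1:4]   # Vector{Tuple{Float64,Float64}}
mats = [rand(2, 1) for _ in 1:4]         # Vector{Matrix{Float64}}

# A tuple of two Float64 values is plain data and is stored inline.
println(isbitstype(eltype(pts)))

# A Matrix is a heap-allocated object, so the outer vector stores references to it.
println(isbitstype(eltype(mats)))

# Inline storage: 4 tuples * 2 Float64 * 8 bytes each.
println(sizeof(pts))
```

Because the tuples sit contiguously in memory, reading `pts[i]` needs no pointer chase, which is where the remaining speedup comes from.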
