Julia much slower than Java
Question
I'm new to Julia and I've written a simple function that calculates RMSE (root mean square error). `ratings` is a matrix of ratings; each row is `[user, film, rating]`. There are 15 million ratings. The `rmse()` method takes 12.0 s, but the Java implementation is about 188x faster: 0.064 s. Why is the Julia implementation that slow? In Java, I'm working with an array of `Rating` objects; if it were a multidimensional `int` array, it would be even faster.
    ratings = readdlm("ratings.dat", Int32)

    function predict(user, film)
        return 3.462
    end

    function rmse()
        total = 0.0
        for i in 1:size(ratings, 1)
            r = ratings[i,:]
            diff = predict(r[1], r[2]) - r[3]
            total += diff * diff
        end
        return sqrt(total / size(ratings)[1])
    end
After avoiding the global variable, it finishes in 1.99 s (31x slower than Java). After removing the `r = ratings[i,:]` line, it's 0.856 s (13x slower).
Recommended answer
A few suggestions:
- Don't use globals. For annoying technical reasons, they're slow. Instead, pass `ratings` in as an argument.
- The `r = ratings[i,:]` line makes a copy, which is slow. Instead, use `predict(ratings[i,1], ratings[i,2]) - ratings[i,3]`.
- `square()` may be faster than `x*x`; try it.
- If you're using the bleeding-edge Julia from source, check out the brand new `NumericExtensions.jl` package, which has insanely optimized functions for many common numerical operations (see the julia-dev list).
- Julia has to compile the code the first time it executes it. The right way to benchmark in Julia is to do the timing several times and ignore the first time through.
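
The first, second, and last suggestions can be combined into one short sketch. It is only an illustration under stated assumptions: the three-row `ratings` matrix is made-up sample data standing in for `ratings.dat`, and `predict` is the constant placeholder from the question.

```julia
# Placeholder predictor from the question: always returns a constant.
predict(user, film) = 3.462

# RMSE over a ratings matrix whose rows are [user, film, rating].
# `ratings` is passed as an argument instead of being read from a global,
# and elements are indexed directly, so no row copy is made per iteration.
function rmse(ratings::AbstractMatrix)
    n = size(ratings, 1)
    total = 0.0
    for i in 1:n
        diff = predict(ratings[i, 1], ratings[i, 2]) - ratings[i, 3]
        total += diff * diff
    end
    return sqrt(total / n)
end

# Made-up sample data (an assumption; the question loads 15 million rows).
ratings = Int32[1 10 3; 2 20 4; 3 30 5]

rmse(ratings)                      # first call includes compilation time
elapsed = @elapsed rmse(ratings)   # time a later call instead
println("rmse = ", rmse(ratings), ", elapsed = ", elapsed, " s")
```

Timing the second call rather than the first keeps the compilation cost out of the measurement. On Julia 1.x, `readdlm` lives in the `DelimitedFiles` standard library, so loading the real file would additionally need `using DelimitedFiles`.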