Why is devectorization encouraged in Julia?

Question

It seems that writing devectorized code is encouraged in Julia. There is even a package that tries to do that for you.

My question is: why?

First of all, from a user-experience standpoint, vectorized code is more concise (less code, so fewer opportunities for bugs), clearer (hence easier to debug), and a more natural way of writing code (at least for someone with a scientific computing background, which is the audience Julia tries to cater to). Being able to write something like vector'vector or vector'Matrix*vector is very important, because it corresponds to the actual mathematical notation, and this is how people in scientific computing think about it (not as nested loops). I hate the fact that this is not the best way to write it, and that rewriting it as loops will be faster.

At the moment there seems to be a conflict between writing code that is fast and writing code that is concise and clear.

Secondly, what is the technical reason for this? OK, I understand that vectorized code creates extra temporaries and so on, but vectorized functions (for example, broadcast(), map(), etc.) could potentially be multithreaded, and I think the benefit of multithreading could outweigh the overhead of temporaries and the other disadvantages of vectorized functions, making them faster than regular for loops.
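To make the point about temporaries concrete, here is a minimal sketch. It assumes a recent Julia 1.x release, so the syntax may differ from the release that was current when this question was asked, and all function names are hypothetical. It computes the same result in vectorized style, as an explicit loop, and with a fused dot-broadcast that writes into a preallocated array.

```julia
# Hypothetical helper names; assumes Julia 1.x.

# Vectorized style: `2 * x` and `y .* y` each allocate a temporary array,
# and the final `+` allocates the result.
vectorized(x, y) = 2 * x + y .* y

# Devectorized style: one pass over the data, writing into a caller-supplied
# output array, so no temporaries are created.
function devectorized!(out, x, y)
    for i in eachindex(out, x, y)
        out[i] = 2 * x[i] + y[i] * y[i]
    end
    return out
end

# Fused broadcast: every operator is dotted, so the whole expression is
# evaluated in a single loop and assigned in place with `.=`.
fused!(out, x, y) = (out .= 2 .* x .+ y .* y)

x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]
out = similar(x)

println(vectorized(x, y))
println(devectorized!(out, x, y))
println(fused!(out, x, y))
```

All three forms produce the same values. The difference is in how many intermediate arrays are allocated along the way, which can be inspected with `@allocated` or `@time`.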

Do the current implementations of vectorized functions in Julia do implicit multithreading under the hood?

If not, is there any work or plan to add implicit concurrency to vectorized functions and make them faster than loops?

Recommended answer

For ease of reading, I decided to turn my comment marathon above into an answer.

The core development statement behind Julia is "we are greedy". The core devs want it to do everything, and to do it fast. In particular, note that the language is supposed to solve the "two-language problem", and at this stage it looks like it will accomplish this by the time v1.0 hits.

In the context of your question, this means that everything you are asking about is either already a part of Julia, or planned for v1.0.

In particular, this means that if your programming problem lends itself to vectorized code, then write vectorized code. If it is more natural to use loops, use loops.
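As an illustration of the two styles, here is a small sketch of the quadratic form from the question, written once in mathematical notation and once as nested loops. It assumes Julia 1.x and real-valued inputs, and the function names are hypothetical.

```julia
using LinearAlgebra

# Vectorized: reads like the mathematical expression x' A x.
quadform_vectorized(x, A) = x' * A * x

# Devectorized: the same sum written as explicit loops (real inputs assumed).
# The inner loop runs over rows because Julia arrays are column-major.
function quadform_loop(x, A)
    acc = zero(promote_type(eltype(x), eltype(A)))
    for j in axes(A, 2), i in axes(A, 1)
        acc += x[i] * A[i, j] * x[j]
    end
    return acc
end

x = [1.0, 2.0]
A = [2.0 0.0; 0.0 3.0]

println(quadform_vectorized(x, A))
println(quadform_loop(x, A))
```

Which form is faster depends on the problem size and the Julia version, so it is worth measuring both rather than assuming.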

By the time v1.0 hits, most vectorized code should be as fast as, or faster than, equivalent code in MATLAB. In many cases, this development goal has already been achieved, since many vector/matrix operations in Julia are handed off to the appropriate BLAS routines.
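The BLAS hand-off can be sketched as follows, assuming Julia 1.x with the standard LinearAlgebra library. For dense Float64 arrays, the high-level product A * x and a direct call to the BLAS matrix-vector wrapper give the same result.

```julia
using LinearAlgebra

A = [1.0 2.0; 3.0 4.0]
x = [1.0, 1.0]

# High-level notation; for dense Float64 arrays this ends up in BLAS.
y_highlevel = A * x

# Explicit call to the BLAS wrapper: computes 1.0 * A * x with no transpose ('N').
y_blas = BLAS.gemv('N', 1.0, A, x)

println(y_highlevel)
println(y_blas)
```

In practice there is rarely a reason to call the wrapper directly; the sketch only shows that the readable notation and the library routine arrive at the same answer.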

Regarding multithreading, native multithreading is currently being implemented for Julia, and I believe an experimental set of routines is already available on the master branch. The relevant issue page is here. Implicit multithreading for some vector/matrix operations is, in theory, already available in Julia, since Julia calls BLAS. I'm not sure whether it is switched on by default.
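For readers on a later release, the two kinds of threading mentioned above might look like the sketch below. It assumes a recent Julia 1.x release (the BLAS thread query in particular is only available in newer versions), and the function name threaded_square! is hypothetical. To the best of my knowledge, broadcast and map in Base still run on a single thread, so threading a loop remains an explicit choice.

```julia
using LinearAlgebra

# Explicit native multithreading: iterations are split across the threads
# Julia was started with (for example via the JULIA_NUM_THREADS variable).
function threaded_square!(out, x)
    Threads.@threads for i in eachindex(out, x)
        out[i] = x[i]^2
    end
    return out
end

x = collect(1.0:4.0)
out = threaded_square!(similar(x), x)
println(out)

# Number of Julia threads available to Threads.@threads.
println(Threads.nthreads())

# Number of threads the BLAS library may use for calls such as A * B.
println(BLAS.get_num_threads())
```

Each iteration writes to a distinct element of out, so the loop is safe to run in parallel; loops that accumulate into a shared variable need more care.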

Be aware, though, that many vectorized operations will (currently) still be much faster in MATLAB, since MathWorks has been writing specialised multithreaded C libraries for years and calling them under the hood. Once Julia has native multithreading, I expect Julia will overtake MATLAB, since at that point the entire dev community can scour the standard Julia packages and upgrade them to take advantage of native multithreading wherever possible.

In contrast, MATLAB does not have native multithreading, so you are relying on MathWorks to provide specialised multithreaded routines in the form of underlying C libraries.
