Why is devectorization encouraged in Julia?


Question

Writing devectorized code seems to be encouraged in Julia. There is even a package that tries to do this for you.

My question is: why?

First, from a user-experience standpoint, vectorized code is more concise (less code, so fewer opportunities for bugs), clearer (hence easier to debug), and a more natural way to write code (at least for someone from a scientific computing background, whom Julia tries to cater to). Being able to write something like vector'vector or vector'Matrix*vector is very important, because it corresponds to the actual mathematical notation, and this is how scientific computing people think about it (not as nested loops). I hate the fact that this is not the best way to write it, and that rewriting it as loops will be faster.
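To make the comparison concrete, here is a minimal sketch of the two styles for the vector'Matrix*vector case. The function names quadform_vectorized and quadform_loops are hypothetical, and the loop version assumes real-valued inputs (for complex vectors the adjoint would also conjugate).

```julia
using LinearAlgebra

# Vectorized form: reads like the mathematical notation v' M v.
quadform_vectorized(v, M) = v' * M * v

# Devectorized form: the same sum written as explicit nested loops.
# Assumes real-valued inputs (hypothetical helper, not from the question).
function quadform_loops(v, M)
    acc = zero(promote_type(eltype(v), eltype(M)))
    for j in axes(M, 2)        # outer loop over columns: Julia arrays are column-major
        for i in axes(M, 1)
            acc += v[i] * M[i, j] * v[j]
        end
    end
    return acc
end

v = [1.0, 2.0]
M = [1.0 2.0; 3.0 4.0]

r_vec  = quadform_vectorized(v, M)
r_loop = quadform_loops(v, M)
```

Both versions compute the same scalar; the question is only which one the language can execute faster.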

At the moment there seems to be a conflict between writing code that is fast and writing code that is concise and clear.

Second, what is the technical reason for this? I understand that vectorized code creates extra temporaries and so on, but vectorized functions (for example broadcast(), map(), etc.) have the potential to be multithreaded, and I would think that the benefit of multithreading could outweigh the overhead of temporaries and the other disadvantages of vectorized functions, making them faster than regular for loops.
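The "extra temporaries" point can be sketched as follows. This is an illustrative example only; scale_add_vectorized and scale_add_loop! are hypothetical names, and no timings are claimed.

```julia
# Vectorized: `2 * x` allocates a temporary array, then `+ y` allocates the result.
scale_add_vectorized(x, y) = 2 * x + y

# Devectorized: a single pass over the data, writing into a preallocated output,
# so no intermediate arrays are created. (Hypothetical helper for illustration.)
function scale_add_loop!(out, x, y)
    for i in eachindex(out, x, y)   # throws if the arrays do not share indices
        out[i] = 2 * x[i] + y[i]
    end
    return out
end

x = collect(1.0:5.0)
y = ones(5)
out = similar(x)
scale_add_loop!(out, x, y)
```

In Julia versions with fused dot syntax (0.6 and later), `out .= 2 .* x .+ y` should give the same single-pass, allocation-free behavior while keeping the vectorized look.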

Do the current implementations of vectorized functions in Julia do implicit multithreading under the hood?

If not, is there any work or plan to add implicit concurrency to vectorized functions and make them faster than loops?

Recommended answer

For easier reading, I decided to turn my comment marathon above into an answer.

The core development statement behind Julia is "we are greedy". The core devs want it to do everything, and do it fast. In particular, note that the language is supposed to solve the "two-language problem", and at this stage it looks like it will accomplish this by the time v1.0 hits.

In the context of your question, this means that everything you are asking about is either already part of Julia or planned for v1.0.

In particular, this means that if your programming problem lends itself to vectorized code, write vectorized code. If it is more natural to use loops, use loops.

By the time v1.0 hits, most vectorized code should be as fast as, or faster than, equivalent code in MATLAB. In many cases this development goal has already been achieved, since many vector/matrix operations in Julia are sent to the appropriate BLAS routines by the compiler.

Regarding multithreading, native multithreading is currently being implemented for Julia, and I believe an experimental set of routines is already available on the master branch. The relevant issue page is here. Implicit multithreading for some vector/matrix operations is, in theory, already available in Julia, since Julia calls BLAS. I'm not sure whether it is switched on by default.
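As a rough sketch of both kinds of threading mentioned above, the snippet below queries the BLAS thread count and runs an explicitly threaded loop. It assumes a recent Julia release: BLAS.get_num_threads is assumed to be available (Julia 1.6 or later), and threaded_square! is a hypothetical helper, not something from the answer.

```julia
using LinearAlgebra

# Threads the BLAS backend may use for calls such as A * B.
# Assumption: BLAS.get_num_threads exists (Julia 1.6 or later).
blas_threads = BLAS.get_num_threads()

# Threads available to Julia itself (set when starting Julia, e.g. `julia -t 4`).
julia_threads = Threads.nthreads()

# Explicit native multithreading over a plain loop (hypothetical helper).
# Each iteration writes to a distinct index, so there is no data race.
function threaded_square!(out, x)
    Threads.@threads for i in eachindex(x)
        out[i] = x[i]^2
    end
    return out
end

x = collect(1.0:8.0)
squares = threaded_square!(similar(x), x)
```

The BLAS thread count can be changed with BLAS.set_num_threads(n), which controls the implicit threading of matrix operations independently of Julia's own threads.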

Be aware, though, that many vectorized operations will (currently) still be much faster in MATLAB, since MathWorks has been writing specialised multithreaded C libraries for years and calling them under the hood. Once Julia has native multithreading, I expect Julia will overtake MATLAB, since at that point the entire dev community can scour the standard Julia packages and upgrade them to take advantage of native multithreading wherever possible.

In contrast, MATLAB does not have native multithreading, so you are relying on MathWorks to provide specialised multithreaded routines in the form of underlying C libraries.
