If a Julia script is run from the command line, does it need to be re-compiled every time?


Problem description


I've read through quite a lot of documentation and many questions, but I'm still confused about this.

In the Profiling section of the documentation, it's suggested to first run the target function in the REPL once, so that it's already compiled before being profiled. However, what if the script is fairly complicated and is intended to be run from the command line, taking arguments? When the julia process finishes and I run the script a second time, is the compilation performed again? Posts like https://stackoverflow.com/a/42040763/1460448 and "Julia compiles the script every time?" give conflicting answers. They also seem to be old, while Julia is constantly evolving.

In my experience, the second run takes exactly as much time as the first. The startup time is quite long. How should I optimize such a program? Adding __precompile__() doesn't seem to have changed the execution time at all.

Also, what should I do when I want to profile such a program? All resources on profiling talk about doing so in the REPL.

Solution

Please correct me if I am wrong, but it sounds like you have written some long script, say, myfile.jl, and then from your OS command line you are calling julia myfile.jl args.... Is this correct? Also, it sounds like myfile.jl does not define much in the way of functions, but is instead just a sequence of commands. Is this correct? If so, then as has been suggested in the comments on the question, this is not the typical workflow for Julia, for two reasons:

1) Calling julia from the command line, i.e. julia myfile.jl args..., is roughly equivalent to opening a REPL, running an include command on myfile.jl, and then closing the REPL. The initial call to include will compile any methods that are needed for the operations in myfile.jl, which takes time. But since you're running from the command line, once the include is finished, the process exits and all that compiled code is thrown away. This is what DNF means when he says the recommended workflow is to work within a single REPL session and not close it until you are done for the day, or until you deliberately want to recompile all the methods you are using.

2) Even if you are working within a single REPL session, it is extremely important to wrap pretty much everything you do in functions (this is a very different workflow from languages like Matlab). If you do this, Julia will compile methods for each function that are specialized on the types of the input arguments that you are using. This is essentially why Julia is fast. Once a method has been compiled, it remains available for the entire REPL session, but is disposed of when you close the REPL. Critically, if you do not wrap your operations in functions, then this specialized compilation does not occur, and so you can expect very slow code. In Julia, we call this "working in the global scope". Note that this feature of Julia encourages a coding style that breaks your tasks down into lots of small specialized functions rather than one behemoth of 1000 lines of code. This is a good idea for many reasons. (In my own codebase, many functions are single-liners, and most are 5 lines or fewer.)
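To see points 1) and 2) in practice, here is a minimal sketch. The function name sum_of_squares and the input data are made up for illustration; the only library feature used is the standard @elapsed macro. It times the same call twice within one Julia session.

```julia
# Hypothetical function, used only for illustration.
sum_of_squares(xs) = sum(abs2, xs)

data = rand(1_000)

# First call in this session: the measured time includes compiling
# a method specialized for Vector{Float64}.
t_first = @elapsed sum_of_squares(data)

# Second call in the same session: the already compiled method is reused.
t_second = @elapsed sum_of_squares(data)

println("first call:  ", t_first, " s")
println("second call: ", t_second, " s")
```

Within one session the second call should typically be much cheaper than the first. If the file is instead run twice as julia timing.jl (a hypothetical file name), each new process pays the first-call cost again, which matches the behaviour described in the question.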

The two points above are absolutely critical to understand if you are working in Julia. However, once you are comfortable with them, I would recommend that you actually put all your functions inside modules, and then call your module(s) from an active REPL session whenever you need them. This has the additional advantage that you can just add a __precompile__() statement at the top of your module, and then Julia will precompile some (but not necessarily all) of the code in that module. Once you do this, the precompiled code in your module doesn't disappear when you close the REPL, since it is stored on the hard drive in a .ji file. So you can start a new REPL session, type using MyModule, and your precompiled code is immediately available. It will only need to be recompiled if you alter the contents of the module (and this all happens automatically).
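As a rough sketch of that layout, the example below puts a few small functions into a module named MyModule, as in the text. The function names parse_numbers, sum_of_squares and run_script are assumptions made up for this example. Because the module is defined in the same file, it is loaded with a leading dot; loading it with using MyModule, and getting the on-disk .ji cache, assumes it is set up as a package on the load path. On Julia 1.0 and later, packages are precompiled by default, so the __precompile__() line is most likely unnecessary there and is left out.

```julia
# MyModule.jl (hypothetical file name and contents)
module MyModule

export run_script

# Parse every command-line argument as a Float64.
parse_numbers(args) = [parse(Float64, a) for a in args]

# Small, single-purpose function.
sum_of_squares(xs) = sum(abs2, xs)

# Entry point: takes the argument strings and returns the result.
function run_script(args)
    nums = parse_numbers(args)
    return sum_of_squares(nums)
end

end # module

# The module is defined in this same file, hence the leading dot.
# With MyModule available as a package, this would be `using MyModule`.
using .MyModule

println(run_script(["1", "2", "3"]))
```

Keeping the entry point as an ordinary function that takes the argument strings means the same code can be called from a long-lived REPL session or from a thin command-line script that passes ARGS.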
