If a Julia script is run from the command line, does it need to be re-compiled every time?

Problem description

I've read through quite some documentation and questions but I'm still confused about this.

In the Profiling section of the documentation it's suggested to first run the target function in the REPL once, so that it's already compiled before being profiled. However, what if the script is fairly complicated and is intended to be run from the command line, taking arguments? When the julia process finishes and I run the script a second time, is the compilation performed again? Posts such as https://stackoverflow.com/a/42040763/1460448 and "Julia compiles the script every time?" give conflicting answers. They also seem to be old, while Julia is constantly evolving.

In my experience, the second run takes just as long as the first, and the startup time is quite long. How should I optimize such a program? Adding __precompile__() doesn't seem to have changed the execution time at all.

Also, what should I do when I want to profile such a program? All resources on profiling talk about doing so in the REPL.

Solution

Please correct me if I am wrong, but it sounds like you have written some long script, say, myfile.jl, and then from your OS command line you are calling julia myfile.jl args... Is this correct? Also, it sounds like myfile.jl does not define much in the way of functions, but is instead just a sequence of commands. Is this correct? If so, then as has been suggested in the comments on the question, this is not the typical workflow for Julia, for two reasons:

1) Calling julia from the command line, i.e. julia myfile.jl args..., is equivalent to opening a REPL, running an include command on myfile.jl, and then closing the REPL. The initial call to include will compile any methods that are needed for the operations in myfile.jl, which takes time. But since you're running from the command line, once the include is finished, the REPL automatically closes, and all that compiled code is thrown away. This is what DNF means when he says the recommended workflow is to work within a single REPL session, and not to close it until you are done for the day, or unless you deliberately want to recompile all the methods you are using.

2) Even if you are working within a single REPL session, it is extremely important to wrap pretty much everything you do in functions (this is a very different workflow from languages like Matlab). If you do this, Julia will compile methods for each function that are specialized on the types of the input arguments that you are using. This is essentially why Julia is fast. Once a method has been compiled, it remains available for the entire REPL session, but is disposed of when you close the REPL. Critically, if you do not wrap your operations in functions, then this specialized compilation does not occur, and so you can expect very slow code. In Julia, we call this "working in the global scope". Note that this feature of Julia encourages a coding style consisting of breaking your tasks down into lots of small specialized functions rather than one behemoth consisting of 1000 lines of code. This is a good idea for many reasons. (In my own codebase, many functions are single-liners, and most are 5 lines or less.)
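As a rough sketch of the function-wrapped layout described in the two points above, a script could be organized like this. The file name myfile.jl comes from the discussion; the function names parse_args, sum_of_squares and main, and the default value of 1000, are hypothetical and chosen only for illustration.

```julia
# myfile.jl (hypothetical layout): all work lives in functions, not in global scope.

# Turn the raw command-line strings into a concrete Int once, up front.
# Falls back to an arbitrary default when no argument is given.
parse_args(args) = isempty(args) ? 1_000 : parse(Int, args[1])

# The loop sits inside a function, so Julia compiles a method
# specialized for the argument type it is called with (here Int).
function sum_of_squares(n::Integer)
    total = zero(n)
    for i in 1:n
        total += i^2
    end
    return total
end

# Single entry point: everything the script does is reachable from here.
function main(args)
    n = parse_args(args)
    println(sum_of_squares(n))
    return nothing
end

# Run main only when the file is executed as a script (julia myfile.jl args...),
# not when it is included from a REPL session.
if abspath(PROGRAM_FILE) == @__FILE__
    main(ARGS)
end
```

With this layout the same file can be run as julia myfile.jl 1000 or loaded once in a REPL with include("myfile.jl"). In the REPL case, timing main(["1000"]) twice with @time should show the first call paying the compilation cost and the second call reusing the compiled methods, as long as the session stays open.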

The two points above are absolutely critical to understand if you are working in Julia. However, once you are comfortable with them, I would recommend that you actually put all your functions inside modules, and then call your module(s) from an active REPL session whenever you need them. This has the additional advantage that you can just add a __precompile__() statement at the top of your module, and then julia will precompile some (but not necessarily all) of the code in that module. Once you do this, the precompiled code in your module doesn't disappear when you close the REPL, since it is stored on the hard drive in a .ji file. So you can start a new REPL session, type using MyModule, and your precompiled code is immediately available. It will only need to re-compile if you alter the contents of the module (and this all happens automatically).
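A minimal sketch of such a module might look like the following. MyModule is the name used above; the function sum_of_squares and the explicit precompile call are assumptions added for illustration and are not part of the original answer. Note also that since Julia 1.0, modules loaded as packages are precompiled by default, so the __precompile__() line is only needed on older versions.

```julia
# MyModule.jl (hypothetical): a module that Julia can cache in a .ji file
# when it is loaded as a package with `using MyModule`.
module MyModule

export sum_of_squares

# Small, type-generic function; Julia specializes it per argument type.
sum_of_squares(n::Integer) = sum(i -> i^2, 1:n)

# Optional hint (an addition for this sketch): request compilation of the
# Int method while the module is being precompiled.
precompile(sum_of_squares, (Int,))

end # module
```

The .ji cache is written when the module is found on the load path and loaded with using or import. A file that is merely pulled in with include is compiled again in every new session, which may explain why adding __precompile__() to a plain script appears to have no effect.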
