Current snapshot not loaded on every pipeline run


Question

Hello,

I have created a machine learning pipeline using the Azure Machine Learning service. I keep all hyperparameters and similar settings in one configuration file (config.py). However, the current snapshot (all of my Python files, including any changes to config.py) is not always loaded when the pipeline is created. It is important to me that the current snapshot is loaded every time, so that the changes in config.py are up to date.

Another important point is that config.py is used only by the main Python scripts defined in several PythonScriptSteps. As a quick workaround, I change the allow_reuse parameter of the PythonScriptStep every time before the pipeline is submitted. Is it possible to always load all files from the current folder when the pipeline is created?

I understand that the Azure Machine Learning service compares file caches to detect changes in the files, so that some pipeline steps can be reused. If I understand correctly, hash_paths can also be used to define additional file paths that are then tracked for changes. Is that the correct way to reload the files every time?

Thanks in advance!

Aid

Recommended answer

Hello Aid,

It looks like you have correctly set allow_reuse to False so that config.py is loaded every time. With that setting, the step should load the current files on every run instead of reusing the results of a previous run, which is the default.
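As a rough sketch of that setup with the Azure ML Python SDK v1 (`azureml-pipeline-steps`), the snippet below sets `allow_reuse=False` on every step in one place rather than editing it before each submission. The helper names (`no_reuse_step_kwargs`, `build_steps`, `submit_fresh`) and the script names in the usage note are hypothetical. The SDK imports are deferred into the functions so the first helper can be used without the SDK installed. To my understanding, `regenerate_outputs=True` at submit time also forces all steps to re-run, but treat that as an assumption to verify against your SDK version.

```python
from typing import Any, Dict, List


def no_reuse_step_kwargs(script_name: str, source_directory: str = ".") -> Dict[str, Any]:
    """Keyword arguments for a PythonScriptStep that is never reused."""
    return {
        "name": script_name.rsplit(".", 1)[0],  # e.g. "train.py" -> "train"
        "script_name": script_name,
        "source_directory": source_directory,
        # Always re-run the step, so the freshly uploaded snapshot
        # (including the current config.py) is what gets executed.
        "allow_reuse": False,
    }


def build_steps(script_names: List[str], compute_target: Any,
                source_directory: str = ".") -> list:
    """Create one PythonScriptStep per script, all with reuse disabled."""
    # Deferred import: requires the azureml-pipeline-steps package.
    from azureml.pipeline.steps import PythonScriptStep

    return [
        PythonScriptStep(compute_target=compute_target,
                         **no_reuse_step_kwargs(name, source_directory))
        for name in script_names
    ]


def submit_fresh(workspace: Any, experiment_name: str, steps: list) -> Any:
    """Submit the pipeline and ask the service to regenerate all outputs."""
    # Deferred imports: require azureml-core and azureml-pipeline-core.
    from azureml.core import Experiment
    from azureml.pipeline.core import Pipeline

    pipeline = Pipeline(workspace=workspace, steps=steps)
    # Assumption: regenerate_outputs=True disables reuse for this run.
    return Experiment(workspace, experiment_name).submit(
        pipeline, regenerate_outputs=True
    )
```

With a workspace `ws` and a compute target already available, this could be called as `submit_fresh(ws, "my-experiment", build_steps(["prepare.py", "train.py"], compute_target))`, where the experiment and script names are placeholders.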

With hash_paths, you can include all contents of source_directory by specifying hash_paths=['.'].
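To illustrate why hashing the whole directory picks up an edit to config.py, here is a small standard-library sketch of content-based change detection. It is only a conceptual illustration, not the algorithm the Azure Machine Learning service actually uses, and `fingerprint_directory` is a hypothetical name. In the SDK itself the value would simply be passed as `hash_paths=["."]` to PythonScriptStep. As far as I know, later SDK v1 releases document hash_paths as deprecated because the full source_directory is hashed by default, so it is worth checking the documentation for your version.

```python
import hashlib
from pathlib import Path


def fingerprint_directory(source_directory: str) -> str:
    """Return one SHA-256 digest covering every file under source_directory."""
    root = Path(source_directory)
    digest = hashlib.sha256()
    # Sort the files so the result does not depend on listing order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        # Mix in the relative path so that renames change the digest too.
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()
```

If the fingerprint of the folder is the same before and after an edit, a reuse mechanism built on it would consider the step unchanged. Any change to config.py, or to any other file in the folder, produces a different fingerprint and so would trigger a re-run.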

Could you please try this value for hash_paths and check whether all your files are loaded?

If there are still issues after setting this, please feel free to raise a GitHub issue on the documentation page.

Here is another issue with similar questions. I believe it will help you work out which parameters need to be set for your scenario.

