Synchronizing code between a Jupyter/IPython notebook script and class methods


Question

I'm trying to figure out the best way to keep code in a Jupyter/IPython notebook in sync with the same code inside a class method. Here's the use case:

I wrote a long script that uses pandas inside a notebook. It is split across multiple cells, which made development easy because I could check intermediate results within the notebook. This is very useful with pandas scripts. I downloaded that working code into a Python ".py" file and converted the script into a method of a Python class in my program. The class is instantiated with the input data, and the method returns the output. Everything works great. That Python class is used in a much larger application, so it is the real deliverable.

But then a bug showed up for a certain data set in the method's implementation, and the same bug was in my script. I could go back to my notebook and step through the various cells to find the issue. I fix the issue there, but then I have to carefully make the same change in the regular Python class method code. This is a bit painful.

Ideally, I'd like to be able to run a class method across cells so that I can check intermediate results. I can't figure out how to do this.

So what is the best practice for keeping script code and the same code embedded within a class method in sync?

Yes, I know that I can import the class into the notebook, but then I lose the ability to look at intermediate results inside the class method via individual cells, which is what I do when it is a pure script. With pandas, this is very useful.

Recommended answer

I have used the same development workflow and recognize the value of being able to step through code using the Jupyter notebook. I've developed several packages by first hashing out the details in a notebook and then eventually moving the polished product into separate .py files. I do not think there is a simple solution to the inconvenience you describe (I have run into the same issue), but I will describe my practice (I'm not so bold as to proclaim it the "best" practice), and maybe it will be helpful in your use case.

In my experience, once I have created a module/package from my Jupyter notebook, it is easier to maintain and develop the code outside of the notebook and import that module into the notebook for testing.

Keeping each method small is good practice in general, and it is very helpful for testing the logic at each step using the notebook. You can break larger "public" methods into smaller "private" methods named with a leading underscore (e.g. '_load_file'). You can call the "private" methods in your notebook for testing/debugging, but users of your module should know to ignore these methods.
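As a minimal sketch of that decomposition: the class name SalesReport, its method names, and the sample rows below are all hypothetical, and plain lists and dicts stand in for pandas DataFrames so the example runs without extra dependencies. The public run() method only chains the small "private" steps, and the last lines show how a notebook could call each step in its own cell to inspect the intermediate result.

```python
class SalesReport:
    """Hypothetical deliverable class; each step is a small "private" method."""

    def __init__(self, rows):
        self.rows = rows

    def run(self):
        # The public method only chains the small steps together.
        cleaned = self._drop_incomplete(self.rows)
        totals = self._total_by_region(cleaned)
        return self._rank(totals)

    def _drop_incomplete(self, rows):
        # Keep only rows that actually carry an amount.
        return [r for r in rows if r.get("amount") is not None]

    def _total_by_region(self, rows):
        # Sum the amounts per region.
        totals = {}
        for r in rows:
            totals[r["region"]] = totals.get(r["region"], 0) + r["amount"]
        return totals

    def _rank(self, totals):
        # Largest total first.
        return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


rows = [
    {"region": "north", "amount": 10},
    {"region": "south", "amount": 25},
    {"region": "north", "amount": None},
    {"region": "north", "amount": 5},
]
report = SalesReport(rows)

# In a notebook, each of these lines could live in its own cell so the
# intermediate value can be displayed and checked before moving on.
cleaned = report._drop_incomplete(report.rows)
totals = report._total_by_region(cleaned)
ranked = report._rank(totals)
```

Because the notebook calls the same methods that run() calls, a fix made in the .py file is the only copy of the logic that needs to change.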

You can use the reload function in the importlib module to quickly refresh your imported modules after you change their source.

import mymodule               # the module you moved out of the notebook
from importlib import reload

reload(mymodule)              # re-execute mymodule's source after editing the file

Calling import again will not actually update your namespace, because Python reuses the module object it already cached in sys.modules. You need the reload function (or similar) to force Python to recompile and re-execute the module code.
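The difference can be seen in a small self-contained sketch. The module name demo_module and its answer() function are hypothetical; the script writes the module to a temporary directory, edits it on disk, and compares a repeated import with importlib.reload. Bytecode caching is switched off only so that two writes within the same second cannot leave a stale compiled file behind.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Assumption for this demo only: skip .pyc files so a quick rewrite of the
# source is never masked by a cached compiled version.
sys.dont_write_bytecode = True

# Create a throwaway module on disk (hypothetical name: demo_module).
module_dir = Path(tempfile.mkdtemp())
module_path = module_dir / "demo_module.py"
module_path.write_text("def answer():\n    return 1\n")
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()

import demo_module

first = demo_module.answer()

# Simulate fixing the code in your editor.
module_path.write_text("def answer():\n    return 2\n")

# A second import is a no-op: the module is already in sys.modules.
import demo_module

after_import = demo_module.answer()

# reload re-executes the source and updates the existing module object.
importlib.reload(demo_module)
after_reload = demo_module.answer()

print(first, after_import, after_reload)
```

One caveat: objects created before the reload still refer to the old class definitions, so instances usually need to be re-created afterwards. IPython also ships an autoreload extension (%load_ext autoreload followed by %autoreload 2) that performs the reload automatically before each cell runs.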

Inevitably, you will still need to step through individual functions line by line from time to time, but if you've decomposed your code into small methods, the amount of code you need to "re-write" in the notebook is very small.
