Python: why pickle?

Question

I have been using pickle and was very happy, then I saw this article: Don't Pickle Your Data

Reading further, it seems that:

  • Pickle is slow
  • Pickle is unsafe
  • Pickle isn’t human readable
  • Pickle isn’t language-agnostic

I’ve switched to saving my data as JSON, but I wanted to know about best practice:

Given all these issues, when would you ever use pickle? What specific situations call for using it?

Recommended answer

Pickle is unsafe because it constructs arbitrary Python objects by invoking arbitrary functions. However, this also gives it the power to serialize almost any Python object, without any boilerplate or even white-/black-listing (in the common case). That's very desirable for some use cases:

  • Quick & easy serialization, for example for pausing and resuming a long-running but simple script. None of the concerns matter here: you just want to dump the program's state as-is and load it later.
  • Sending arbitrary Python data to other processes or computers, as in multiprocessing. The security concerns may apply (but mostly don't), the generality is absolutely necessary, and humans won't have to read it.
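
As a minimal sketch of the first use case, the snippet below pickles a made-up script state and loads it back. The `state` dictionary, its keys, and its values are assumptions invented for illustration. It contains a datetime, a set, tuple keys, and Decimal values, none of which the JSON model can express without extra conversion code.

```python
import json
import pickle
from datetime import datetime
from decimal import Decimal

# Hypothetical state of a long-running script; keys and values are made up.
state = {
    "started": datetime(2024, 1, 1, 12, 0),
    "processed": {"a.txt", "b.txt"},            # a set
    "totals": {("a.txt", 1): Decimal("0.5")},   # tuple keys, Decimal values
}

# Pausing: a single call serializes the whole object graph.
blob = pickle.dumps(state, protocol=pickle.HIGHEST_PROTOCOL)

# Resuming: only unpickle bytes you produced yourself or otherwise trust,
# because loading a pickle can invoke arbitrary callables.
restored = pickle.loads(blob)

# The same state is rejected by json unless you write conversion code.
try:
    json.dumps(state)
    json_accepts_state = True
except TypeError:
    json_accepts_state = False
```

In a real script the bytes would go to a file via `pickle.dump` and come back via `pickle.load`. The in-memory round trip is used here only to keep the sketch self-contained.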

In other cases, none of the drawbacks is quite enough to justify the work of mapping your stuff to JSON or another restrictive data model. Maybe you don't expect to need human readability/safety/cross-language compatibility or maybe you can do without. Remember, You Ain't Gonna Need It. Using JSON would be the right thing™ but right doesn't always equal good.

You'll notice that I completely ignored the "slow" downside. That's because it's partially misleading: Pickle is indeed slower for data that fits the JSON model (strings, numbers, arrays, maps) perfectly, but if your data's like that you should use JSON for other reasons anyway. If your data isn't like that (very likely), you also need to take into account the custom code you'll need to turn your objects into JSON data, and the custom code you'll need to turn JSON data back into your objects. It adds both engineering effort and run-time overhead, which must be quantified on a case-by-case basis.
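
To make that custom code concrete, here is a small sketch of what mapping one object type to JSON and back can look like. The `Reading` class and both helper functions are hypothetical names invented for this example. Real projects would need a mapping like this for every type they store.

```python
import json
from dataclasses import dataclass
from datetime import date


@dataclass
class Reading:
    """Hypothetical record type, invented for this illustration."""
    sensor: str
    day: date
    values: tuple


def reading_to_json(reading: Reading) -> str:
    # Hand-written mapping into the JSON model:
    # the date becomes an ISO string and the tuple becomes a list.
    return json.dumps({
        "sensor": reading.sensor,
        "day": reading.day.isoformat(),
        "values": list(reading.values),
    })


def reading_from_json(text: str) -> Reading:
    # The inverse mapping must be written, and kept in sync, by hand too.
    raw = json.loads(text)
    return Reading(
        sensor=raw["sensor"],
        day=date.fromisoformat(raw["day"]),
        values=tuple(raw["values"]),
    )


original = Reading("s1", date(2024, 1, 1), (1.5, 2.0))
round_tripped = reading_from_json(reading_to_json(original))
```

Both helpers run on every save and load, which is the run-time overhead mentioned above. Any speed comparison with pickle should time this conversion code as well, not only the `json.dumps` and `json.loads` calls.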
