Pickle serialization order mystery

Problem description

See the following sample:

import pickle
x = {'order_number': 'X', 'deal_url': 'J'}

pickle.dumps(x)
pickle.dumps(pickle.loads(pickle.dumps(x)))
pickle.dumps(pickle.loads(pickle.dumps(pickle.loads(pickle.dumps(x)))))

Results:

(dp0\nS'deal_url'\np1\nS'J'\np2\nsS'order_number'\np3\nS'X'\np4\ns.
(dp0\nS'order_number'\np1\nS'X'\np2\nsS'deal_url'\np3\nS'J'\np4\ns.
(dp0\nS'deal_url'\np1\nS'J'\np2\nsS'order_number'\np3\nS'X'\np4\ns.

Clearly, the serialized output changes with every dump. When I remove a character from either of the keys, this doesn't happen. I discovered this because Stream-Framework uses pickled output as the key for storing notifications in its k/v store. I will submit a pull request if we get a better understanding of what is going on here. I have found two workarounds that prevent it:

A - Convert to a dictionary after sorting (yes, this somehow provides the intended side effect)

import operator
sorted_x = dict(sorted(x.iteritems(), key=operator.itemgetter(1)))

B - Remove the underscores (but I am not sure this always works)

So what causes this mysterious dictionary ordering under pickle?

Update

Proof that sorting the items before rebuilding the dict makes the dump produce the same result:

import operator
x = dict(sorted(x.iteritems(), key=operator.itemgetter(1)))

pickle.dumps(x)
"(dp0\nS'order_number'\np1\nS'X'\np2\nsS'deal_url'\np3\nS'J'\np4\ns."

x = pickle.loads(pickle.dumps(x))
x = dict(sorted(x.iteritems(), key=operator.itemgetter(1)))

pickle.dumps(x)
"(dp0\nS'order_number'\np1\nS'X'\np2\nsS'deal_url'\np3\nS'J'\np4\ns."

Post-mortem

Stream-Framework might reconsider its design of using content as the key for notifications.

Issue #153 references this.

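Solution

Dictionaries are unordered data structures. This means that the order is arbitrary, and pickle will store the items in whatever order the dictionary yields them. You can use collections.OrderedDict if you want a dictionary with a predictable order; it remembers the order in which keys were inserted, so insert them in sorted order if you want them sorted.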
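As a minimal sketch of that suggestion (the variable names ordered_x and keys_in_order are only illustrative, and sorting by key is an assumption about what "sorted" should mean here), the items can be sorted first and then stored in an OrderedDict, which keeps them in that insertion order:

```python
from collections import OrderedDict

x = {'order_number': 'X', 'deal_url': 'J'}

# OrderedDict does not sort anything itself; it only remembers the
# order in which keys were inserted. Sorting the items by key first
# therefore gives an iteration order that no longer depends on how
# the original dict happened to be laid out.
ordered_x = OrderedDict(sorted(x.items()))

keys_in_order = list(ordered_x.keys())
print(keys_in_order)
```

Iterating over ordered_x should visit 'deal_url' before 'order_number', whatever order the plain dict x used.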

Any order you think you see when you're playing around in the interpreter is just the interpreter playing nice with you.

From the documentation of dict:

It is best to think of a dictionary as an unordered set of key: value pairs, with the requirement that the keys are unique (within one dictionary).

Remember that the methods dict.keys(), dict.values() and dict.items() also return their respective values in arbitrary order.
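Since the order of dict.items() cannot be relied on, one way to get output that does not depend on it is to pickle a key-sorted list of the items instead of the dict itself. The sketch below is only an illustration: stable_dumps and stable_loads are hypothetical helper names, not part of pickle or Stream-Framework, and it assumes the keys can be compared with each other. Pickle output is also not guaranteed to be byte-identical across Python versions or protocols, so this only removes the dependence on dictionary ordering.

```python
import pickle


def stable_dumps(d):
    """Hypothetical helper: pickle the key-sorted items, not the dict."""
    # sorted() orders the (key, value) pairs by key, so the pickled
    # bytes no longer depend on the dict's internal ordering.
    return pickle.dumps(sorted(d.items()), protocol=2)


def stable_loads(data):
    """Hypothetical helper: rebuild a plain dict from the pickled items."""
    return dict(pickle.loads(data))


a = {'order_number': 'X', 'deal_url': 'J'}
b = {'deal_url': 'J', 'order_number': 'X'}

# A dict that has been through one dump/load cycle.
round_tripped = stable_loads(stable_dumps(a))

print(stable_dumps(a) == stable_dumps(b))
print(stable_dumps(a) == stable_dumps(round_tripped))
```

If the serialized bytes are used as a storage key, as described in the question, a sorted representation like this avoids the same content mapping to different keys.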
