Convert numpy type to python
Problem description
I have a list of dicts in the following form that I generate from pandas. I want to convert it to a json format.
    list_val = [{1.0: 685}, {2.0: 8}]
    output = json.dumps(list_val)
However, json.dumps throws an error: TypeError: 685 is not JSON serializable
I am guessing it's a type conversion issue from numpy to python(?).
However, when I convert the values v of each dict in the array using np.int32(v) it still throws the error.
EDIT: Here's the full code:
    new = df[df[label] == label_new]
    ks_dict = json.loads(content)
    ks_list = ks_dict['variables']
    freq_counts = []
    for ks_var in ks_list:
        freq_var = dict()
        freq_var["name"] = ks_var["name"]
        ks_series = new[ks_var["name"]]
        temp_df = ks_series.value_counts().to_dict()
        freq_var["new"] = [{u: np.int32(v)} for (u, v) in temp_df.iteritems()]
        freq_counts.append(freq_var)
    out = json.dumps(freq_counts)
It looks like you're correct:
    >>> import numpy
    >>> import json
    >>> json.dumps(numpy.int32(685))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python2.7/json/__init__.py", line 243, in dumps
        return _default_encoder.encode(obj)
      File "/usr/lib/python2.7/json/encoder.py", line 207, in encode
        chunks = self.iterencode(o, _one_shot=True)
      File "/usr/lib/python2.7/json/encoder.py", line 270, in iterencode
        return _iterencode(o, 0)
      File "/usr/lib/python2.7/json/encoder.py", line 184, in default
        raise TypeError(repr(o) + " is not JSON serializable")
    TypeError: 685 is not JSON serializable
The unfortunate thing here is that numpy numbers' __repr__ doesn't give you any hint about what type they are. They're running around masquerading as ints when they aren't (gasp). Ultimately, it looks like json is telling you that an int isn't serializable, but really, it's telling you that this particular np.int32 (or whatever type you actually have) isn't serializable. (No real surprise there: no np.int32 is serializable.) This is also why the dict that you inevitably printed before passing it to json.dumps looks like it just has integers in it as well.
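To see the disguise for yourself, here is a minimal sketch (assuming Python 3 with numpy installed; is_json_serializable is a helper name made up for this illustration). Printing the value shows a bare number, while type() and isinstance() show that it is not a built-in int:

```python
import json

import numpy as np

value = np.int32(685)

# str() shows a bare number, which is why a printed dict looks harmless.
print(str(value))

# type() and isinstance() reveal what the value actually is.
print(type(value).__name__)
print(isinstance(value, int))


def is_json_serializable(obj):
    """Return True if json.dumps accepts obj without a custom encoder."""
    try:
        json.dumps(obj)
        return True
    except TypeError:
        return False


# The numpy scalar is rejected; the same number as a built-in int is fine.
print(is_json_serializable(value))
print(is_json_serializable(int(value)))
```

Checking type() on a suspicious value is usually the quickest way to confirm this kind of problem.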
The easiest workaround here is probably to write your own serializer [1]:
    class MyEncoder(json.JSONEncoder):
        def default(self, obj):
            if isinstance(obj, numpy.integer):
                return int(obj)
            elif isinstance(obj, numpy.floating):
                return float(obj)
            elif isinstance(obj, numpy.ndarray):
                return obj.tolist()
            else:
                return super(MyEncoder, self).default(obj)
You use it like this:
    json.dumps(numpy.float32(1.2), cls=MyEncoder)
    json.dumps(numpy.arange(12), cls=MyEncoder)
    json.dumps({'a': numpy.int32(42)}, cls=MyEncoder)
etc.
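If you build the values yourself, as the question's list comprehension does, another option is to convert them to built-in types at that point. Note that wrapping a value in np.int32(v) keeps it a numpy type, which is why that attempt still failed; int(v) is what produces a plain Python int. A rough sketch, assuming Python 3 and using a made-up counts dict as a stand-in for the question's value_counts().to_dict() result:

```python
import json

import numpy as np

# Hypothetical stand-in for the question's temp_df; the values are numpy
# integers, as in the failing case.
counts = {1.0: np.int64(685), 2.0: np.int64(8)}

# int() and float() return built-in Python numbers, unlike np.int32(v),
# which returns another numpy scalar.
freq = [{float(u): int(v)} for u, v in counts.items()]

out = json.dumps(freq)
print(out)
```

The float keys end up as strings in the output, because JSON object keys are always strings.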
[1] Or you could just write the default function and pass that as the default keyword argument to json.dumps. In this scenario, you'd replace the last line with raise TypeError, but ... meh. The class is more extensible :-)
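For completeness, the function-based variant from the footnote might look like the sketch below (assuming Python 3; numpy_default is a name chosen for this illustration, not part of json or numpy):

```python
import json

import numpy as np


def numpy_default(obj):
    """Fallback that json.dumps calls for objects it cannot serialize."""
    if isinstance(obj, np.integer):
        return int(obj)
    if isinstance(obj, np.floating):
        return float(obj)
    if isinstance(obj, np.ndarray):
        return obj.tolist()
    # There is no parent class to defer to here, so raise directly.
    raise TypeError(repr(obj) + " is not JSON serializable")


encoded = json.dumps({"a": np.int32(42), "b": np.arange(3)}, default=numpy_default)
print(encoded)
```

As with the class version, the fallback is only called for values; it does not convert dict keys.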