TypeError: Object of type 'bytes' is not JSON serializable
Question
I have just started programming in Python. I want to use Scrapy to create a bot, and when I run the project it fails with TypeError: Object of type 'bytes' is not JSON serializable.
import json
import codecs

class W3SchoolPipeline(object):
    def __init__(self):
        self.file = codecs.open('w3school_data_utf8.json', 'wb', encoding='utf-8')

    def process_item(self, item, spider):
        line = json.dumps(dict(item)) + '\n'
        # print line
        self.file.write(line.decode("unicode_escape"))
        return item

from scrapy.spiders import Spider
from scrapy.selector import Selector
from w3school.items import W3schoolItem

class W3schoolSpider(Spider):
    name = "w3school"
    allowed_domains = ["w3school.com.cn"]
    start_urls = [
        "http://www.w3school.com.cn/xml/xml_syntax.asp"
    ]

    def parse(self, response):
        sel = Selector(response)
        sites = sel.xpath('//div[@id="navsecond"]/div[@id="course"]/ul[1]/li')
        items = []
        for site in sites:
            item = W3schoolItem()
            title = site.xpath('a/text()').extract()
            link = site.xpath('a/@href').extract()
            desc = site.xpath('a/@title').extract()
            item['title'] = [t.encode('utf-8') for t in title]
            item['link'] = [l.encode('utf-8') for l in link]
            item['desc'] = [d.encode('utf-8') for d in desc]
            items.append(item)
        return items
Traceback:
TypeError: Object of type 'bytes' is not JSON serializable
2017-06-23 01:41:15 [scrapy.core.scraper] ERROR: Error processing {'desc': [b'\xe4\xbd\xbf\xe7\x94\xa8 XSLT \xe6\x98\xbe\xe7\xa4\xba XML'],
 'link': [b'/xml/xml_xsl.asp'],
 'title': [b'XML XSLT']}
Traceback (most recent call last):
  File "c:\users\administrator\appdata\local\programs\python\python36\lib\site-packages\twisted\internet\defer.py", line 653, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "D:\LZZZZB\w3school\w3school\pipelines.py", line 19, in process_item
    line = json.dumps(dict(item)) + '\n'
  File "c:\users\administrator\appdata\local\programs\python\python36\lib\json\__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "c:\users\administrator\appdata\local\programs\python\python36\lib\json\encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "c:\users\administrator\appdata\local\programs\python\python36\lib\json\encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "c:\users\administrator\appdata\local\programs\python\python36\lib\json\encoder.py", line 180, in default
    o.__class__.__name__)
TypeError: Object of type 'bytes' is not JSON serializable
Recommended answer
You are creating those bytes objects yourself:
item['title'] = [t.encode('utf-8') for t in title]
item['link'] = [l.encode('utf-8') for l in link]
item['desc'] = [d.encode('utf-8') for d in desc]
items.append(item)
Each of those t.encode(), l.encode() and d.encode() calls creates a bytes string. Do not do this; leave the values as str and let the json module serialise them.
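To see why the encode() calls trigger the error, here is a minimal sketch that uses only the standard json module. The sample values come from the error log in the question; the variable names are made up for illustration.

```python
import json

# str values, as returned by Scrapy's .extract(), serialise without trouble.
text_item = {"title": ["XML XSLT"], "link": ["/xml/xml_xsl.asp"]}
encoded_ok = json.dumps(text_item)

# The same data after .encode('utf-8') holds bytes, which json rejects.
bytes_item = {key: [v.encode("utf-8") for v in values]
              for key, values in text_item.items()}
try:
    json.dumps(bytes_item)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(encoded_ok)
print(error_message)
```

The exact wording of the message differs slightly between Python versions, but it is the same TypeError shown in the traceback.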
Next, you are making several other errors: you are encoding in places where there is no need to. Leave it to the json module and the standard file object returned by the open() call to handle encoding.
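As a rough illustration of that division of labour, the sketch below writes one record to a file opened in text mode. The temporary file path and the ensure_ascii=False argument are my own choices rather than part of the answer; ensure_ascii is a standard json.dumps() parameter that, when set to False, keeps non-ASCII characters as they are instead of emitting \uXXXX escapes.

```python
import json
import os
import tempfile

record = {"desc": ["使用 XSLT 显示 XML"]}

# By default the output is pure ASCII: non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(record)
# With ensure_ascii=False the characters are kept; the result is still a str.
readable = json.dumps(record, ensure_ascii=False)

path = os.path.join(tempfile.mkdtemp(), "demo.json")  # hypothetical file
# Text mode plus encoding='utf-8': the file object does the encoding on write.
with open(path, "w", encoding="utf-8") as f:
    f.write(readable + "\n")

# Reading in binary mode shows the UTF-8 bytes that actually landed on disk.
with open(path, "rb") as f:
    raw = f.read()
```

Either form is valid JSON and loads back to the same data; the only difference is how readable the file is in a text editor.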
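Do keep the dict(item) conversion in process_item(), though. A Scrapy Item behaves like a mapping but is not a dict subclass, so json.dumps() cannot encode it directly; once its values are plain str, the resulting dict serialises without any further encoding:
import json

class W3SchoolPipeline(object):
    def __init__(self):
        # Text mode with an explicit encoding; the file object encodes on write.
        self.file = open('w3school_data_utf8.json', 'w', encoding='utf-8')

    def process_item(self, item, spider):
        # dict(item) turns the Scrapy Item into a plain dict that json can encode.
        line = json.dumps(dict(item)) + '\n'
        self.file.write(line)
        return item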
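A slightly fuller version of this pipeline might also close the file when the crawl ends. The sketch below assumes Scrapy's open_spider() and close_spider() pipeline hooks; the class name and the path argument are invented for illustration and are not part of the answer.

```python
import json


class JsonLinesPipeline:  # hypothetical name
    def __init__(self, path="w3school_data_utf8.json"):
        self.path = path
        self.file = None

    def open_spider(self, spider):
        # Called by Scrapy when the spider starts.
        self.file = open(self.path, "w", encoding="utf-8")

    def close_spider(self, spider):
        # Called by Scrapy when the spider finishes; closing flushes to disk.
        self.file.close()

    def process_item(self, item, spider):
        # dict() accepts both plain dicts and Scrapy Item objects.
        line = json.dumps(dict(item)) + "\n"
        self.file.write(line)
        return item
```

Opening the file in open_spider() rather than __init__() means nothing is created on disk until a crawl actually starts.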
I'm guessing you followed a tutorial that assumed Python 2, while you are using Python 3. I strongly suggest you find a different tutorial: not only is it written for an outdated version of Python, but if it advocates line.decode('unicode_escape') it is also teaching some extremely bad habits that will lead to hard-to-track bugs. I can recommend Think Python, 2nd edition as a good, free book for learning Python 3.
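One way to see the kind of bug that unicode_escape invites is to apply it to json.dumps() output in Python 3. The sample record below is made up. Note that a Python 3 str has no decode() method at all, so the tutorial's line.decode(...) call would fail even if the bytes problem were fixed.

```python
import json

record = {"desc": "first line\nsecond line"}  # made-up sample data
# In the JSON text the newline is stored as the two characters \ and n.
line = json.dumps(record)

# Python 3 str objects have no .decode(); only bytes do.
has_decode = hasattr(line, "decode")

# Forcing the Python 2 idiom through bytes turns the JSON escape back into
# a raw newline, so the text is no longer valid JSON.
mangled = line.encode("ascii").decode("unicode_escape")
try:
    json.loads(mangled)
    still_valid = True
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    still_valid = False

print(has_decode, still_valid)
```

The damage only appears for records that happen to contain escaped characters, which is what makes this sort of bug hard to track down.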