TypeError: Object of type 'bytes' is not JSON serializable
Question
I just started programming in Python. I want to use Scrapy to create a bot, and when I run the project it fails with TypeError: Object of type 'bytes' is not JSON serializable.
# pipelines.py
import json
import codecs


class W3SchoolPipeline(object):
    def __init__(self):
        self.file = codecs.open('w3school_data_utf8.json', 'wb', encoding='utf-8')

    def process_item(self, item, spider):
        line = json.dumps(dict(item)) + '\n'
        # print line
        self.file.write(line.decode("unicode_escape"))
        return item
# Spider
from scrapy.spiders import Spider
from scrapy.selector import Selector
from w3school.items import W3schoolItem


class W3schoolSpider(Spider):
    name = "w3school"
    allowed_domains = ["w3school.com.cn"]
    start_urls = [
        "http://www.w3school.com.cn/xml/xml_syntax.asp"
    ]

    def parse(self, response):
        sel = Selector(response)
        sites = sel.xpath('//div[@id="navsecond"]/div[@id="course"]/ul[1]/li')
        items = []
        for site in sites:
            item = W3schoolItem()
            title = site.xpath('a/text()').extract()
            link = site.xpath('a/@href').extract()
            desc = site.xpath('a/@title').extract()
            item['title'] = [t.encode('utf-8') for t in title]
            item['link'] = [l.encode('utf-8') for l in link]
            item['desc'] = [d.encode('utf-8') for d in desc]
            items.append(item)
        return items
Traceback:
TypeError: Object of type 'bytes' is not JSON serializable
2017-06-23 01:41:15 [scrapy.core.scraper] ERROR: Error processing {'desc': [b'\xe4\xbd\xbf\xe7\x94\xa8 XSLT \xe6\x98\xbe\xe7\xa4\xba XML'],
 'link': [b'/xml/xml_xsl.asp'],
 'title': [b'XML XSLT']}
Traceback (most recent call last):
  File "c:\users\administrator\appdata\local\programs\python\python36\lib\site-packages\twisted\internet\defer.py", line 653, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "D:\LZZZZB\w3school\w3school\pipelines.py", line 19, in process_item
    line = json.dumps(dict(item)) + '\n'
  File "c:\users\administrator\appdata\local\programs\python\python36\lib\json\__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "c:\users\administrator\appdata\local\programs\python\python36\lib\json\encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "c:\users\administrator\appdata\local\programs\python\python36\lib\json\encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "c:\users\administrator\appdata\local\programs\python\python36\lib\json\encoder.py", line 180, in default
    o.__class__.__name__)
TypeError: Object of type 'bytes' is not JSON serializable
Recommended answer
You are creating those bytes objects yourself:
item['title'] = [t.encode('utf-8') for t in title]
item['link'] = [l.encode('utf-8') for l in link]
item['desc'] = [d.encode('utf-8') for d in desc]
items.append(item)
Each of those t.encode(), l.encode() and d.encode() calls creates a bytes string. Do not do this; JSON strings are text, so pass str values and let the json module serialise them.
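To see why the encode() calls cause the error, here is a minimal sketch using only the standard json module. The sample value is made up for illustration and is not taken from the spider's actual output.

```python
import json

# Illustrative value; in the spider this list would come from .extract().
title = ['XML XSLT']

# str values serialise without any manual encoding.
ok = json.dumps({'title': title})

# bytes values are rejected by the default JSON encoder.
encoded = [t.encode('utf-8') for t in title]
try:
    json.dumps({'title': encoded})
    failed = False
    message = ''
except TypeError as exc:
    failed = True
    message = str(exc)
```

Dropping the list comprehensions and assigning title, link and desc directly keeps every value a str.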
Next, you are making several other mistakes: you are encoding in places where there is no need to. Leave it to the json module and the standard file object returned by the open() call to handle encoding.
Keep the dict(item) conversion, though: a Scrapy item is a mapping but not a dict subclass, so json.dumps() cannot encode it directly. With the manual encoding and decoding removed, the pipeline becomes:
import json


class W3SchoolPipeline(object):
    def __init__(self):
        # open() in text mode encodes str to UTF-8 when writing
        self.file = open('w3school_data_utf8.json', 'w', encoding='utf-8')

    def process_item(self, item, spider):
        # dict(item) gives json.dumps() a plain dict of str values
        line = json.dumps(dict(item)) + '\n'
        self.file.write(line)
        return item
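A variation on the pipeline above, offered as a sketch rather than a drop-in replacement: it opens and closes the file in the open_spider() and close_spider() hooks that Scrapy calls on pipelines, and passes ensure_ascii=False so non-ASCII text is written as readable UTF-8 instead of \uXXXX escapes. The class name JsonLinesPipeline, the path argument, and the plain dict standing in for an item are assumptions for illustration; in a real project Scrapy instantiates the pipeline and passes Item objects.

```python
import json
import os
import tempfile


class JsonLinesPipeline:
    """Hypothetical pipeline writing one JSON object per line."""

    def __init__(self, path='w3school_data_utf8.json'):
        self.path = path  # assumed parameter, not part of the original code
        self.file = None

    def open_spider(self, spider):
        # Text mode: the file object encodes str to UTF-8 on write.
        self.file = open(self.path, 'w', encoding='utf-8')

    def close_spider(self, spider):
        self.file.close()

    def process_item(self, item, spider):
        # dict(item) hands json.dumps() a plain dict; values must be str, not bytes.
        line = json.dumps(dict(item), ensure_ascii=False) + '\n'
        self.file.write(line)
        return item


# Exercise the pipeline outside Scrapy with a plain dict standing in for an Item.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.json')
pipeline = JsonLinesPipeline(demo_path)
pipeline.open_spider(spider=None)
pipeline.process_item({'title': ['XML XSLT'], 'desc': ['使用 XSLT 显示 XML']}, spider=None)
pipeline.close_spider(spider=None)

with open(demo_path, encoding='utf-8') as f:
    written = f.read()
```

Closing the file in close_spider() ensures buffered lines are flushed when the crawl ends, which the constructor-only version leaves to interpreter shutdown.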
I'm guessing you followed a tutorial that assumed Python 2 while you are using Python 3. I strongly suggest you find a different tutorial: not only is it written for an outdated version of Python, but if it advocates line.decode('unicode_escape') it is teaching some extremely bad habits that will lead to hard-to-track bugs. I can recommend Think Python, 2nd edition, as a good, free book on learning Python 3.
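The two Python 2 habits mentioned here can be checked in a few lines. This is an illustrative sketch with made-up sample strings: it shows that in Python 3 encode() returns bytes and str has no decode() method, and that unicode_escape reads raw bytes as Latin-1 and interprets backslashes, so it corrupts UTF-8 text and any data containing a literal backslash.

```python
text = '使用 XSLT'

# In Python 3, encode() turns str into bytes ...
raw = text.encode('utf-8')
is_bytes = isinstance(raw, bytes)

# ... and str itself has no decode() method (it did in Python 2).
str_has_decode = hasattr(text, 'decode')

# unicode_escape reads each non-escape byte as Latin-1, so UTF-8 text is mangled.
mangled = raw.decode('unicode_escape')

# It also interprets backslash sequences that were meant literally.
path = b'C:\\temp'.decode('unicode_escape')  # the "\t" turns into a tab character

# Decoding with the codec the data was encoded in round-trips correctly.
restored = raw.decode('utf-8')
```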