TypeError: Object of type 'bytes' is not JSON serializable


Question

I have just started programming in Python. I want to use Scrapy to create a bot, and when I run the project it fails with TypeError: Object of type 'bytes' is not JSON serializable.

import json
import codecs

class W3SchoolPipeline(object):

  def __init__(self):
      self.file = codecs.open('w3school_data_utf8.json', 'wb', encoding='utf-8')

  def process_item(self, item, spider):
      line = json.dumps(dict(item)) + '\n'
      # print line

      self.file.write(line.decode("unicode_escape"))
      return item


from scrapy.spiders import Spider
from scrapy.selector import Selector
from w3school.items import W3schoolItem

class W3schoolSpider(Spider):

    name = "w3school"
    allowed_domains = ["w3school.com.cn"]

    start_urls = [
        "http://www.w3school.com.cn/xml/xml_syntax.asp"
    ]

    def parse(self, response):
        sel = Selector(response)
        sites = sel.xpath('//div[@id="navsecond"]/div[@id="course"]/ul[1]/li')

        items = []
        for site in sites:
            item = W3schoolItem()
            title = site.xpath('a/text()').extract()
            link = site.xpath('a/@href').extract()
            desc = site.xpath('a/@title').extract()

            item['title'] = [t.encode('utf-8') for t in title]
            item['link'] = [l.encode('utf-8') for l in link]
            item['desc'] = [d.encode('utf-8') for d in desc]
            items.append(item)
        return items

Traceback:

TypeError: Object of type 'bytes' is not JSON serializable
2017-06-23 01:41:15 [scrapy.core.scraper] ERROR: Error processing {'desc': [b'\xe4\xbd\xbf\xe7\x94\xa8 XSLT \xe6\x98\xbe\xe7\xa4\xba XML'],
 'link': [b'/xml/xml_xsl.asp'],
 'title': [b'XML XSLT']}

Traceback (most recent call last):
  File "c:\users\administrator\appdata\local\programs\python\python36\lib\site-packages\twisted\internet\defer.py", line 653, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "D:\LZZZZB\w3school\w3school\pipelines.py", line 19, in process_item
    line = json.dumps(dict(item)) + '\n'
  File "c:\users\administrator\appdata\local\programs\python\python36\lib\json\__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "c:\users\administrator\appdata\local\programs\python\python36\lib\json\encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "c:\users\administrator\appdata\local\programs\python\python36\lib\json\encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "c:\users\administrator\appdata\local\programs\python\python36\lib\json\encoder.py", line 180, in default
    o.__class__.__name__)
TypeError: Object of type 'bytes' is not JSON serializable

Recommended answer

You are creating those bytes objects yourself:

item['title'] = [t.encode('utf-8') for t in title]
item['link'] = [l.encode('utf-8') for l in link]
item['desc'] = [d.encode('utf-8') for d in desc]
items.append(item)

Each of those t.encode(), l.encode() and d.encode() calls creates a bytes object. Under Python 3, extract() already returns str values, so don't encode them; assign the lists as they are and leave serialisation to the json module.
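
A minimal sketch of why those calls trigger the error. The sample value is made up and stands in for what extract() returns; it is not the asker's real data. json.dumps() accepts str values but raises TypeError as soon as it meets bytes:

```python
import json

# Hypothetical sample standing in for the list of str that extract() returns.
title = ["XML XSLT"]

# Plain str values serialise without any manual encoding.
ok = json.dumps({"title": title})

# Encoding first produces bytes, which the json module refuses to serialise.
encoded = [t.encode("utf-8") for t in title]
try:
    json.dumps({"title": encoded})
    error = None
except TypeError as exc:
    error = str(exc)

print(ok)
print(error)
```

The exact wording of the message differs slightly between Python versions, but it always names the bytes type.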

You are also making several other mistakes: you are encoding and decoding in places where there is no need to. Leave the encoding to the json module and to the standard file object returned by the open() call.

Keep the dict(item) conversion in the pipeline, though. A Scrapy Item behaves like a mapping but is not a dict subclass, so json.dumps() cannot encode it directly:

import json

class W3SchoolPipeline(object):
    def __init__(self):
        # Text mode: the file object encodes str to UTF-8 on write
        self.file = open('w3school_data_utf8.json', 'w', encoding='utf-8')

    def process_item(self, item, spider):
        line = json.dumps(dict(item)) + '\n'
        self.file.write(line)
        return item
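
One detail the pipeline leaves at its default: json.dumps() writes non-ASCII characters as \uXXXX escapes unless ensure_ascii=False is passed. The sketch below is an illustration only; the record and the temporary file name are invented, with the text being the decoded form of the bytes shown in the traceback. Both outputs are valid JSON, and the second is simply readable in a text editor:

```python
import json
import os
import tempfile

# Invented sample record; not produced by the asker's spider.
record = {"desc": ["使用 XSLT 显示 XML"]}

escaped = json.dumps(record)                       # default: ASCII-only output
readable = json.dumps(record, ensure_ascii=False)  # keeps the characters as-is

# The text-mode file object does the UTF-8 encoding; no manual encode()/decode().
path = os.path.join(tempfile.mkdtemp(), "sample.json")
with open(path, "w", encoding="utf-8") as f:
    f.write(readable + "\n")

# Reading the line back yields the original record.
with open(path, encoding="utf-8") as f:
    round_tripped = json.loads(f.readline())

print(escaped)
print(readable)
```

Whether to pass ensure_ascii=False is a matter of taste; any JSON parser reads either form back to the same strings.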

I'm guessing you followed a tutorial that assumed Python 2, while you are using Python 3. I strongly suggest you find a different tutorial. Not only is it written for an outdated version of Python, but if it advocates line.decode('unicode_escape') it is teaching some extremely bad habits that will lead to hard-to-track bugs. I can recommend Think Python, 2nd edition, as a good, free book on learning Python 3.
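
To illustrate the kind of bug the unicode_escape habit can cause, here is a small sketch with an invented value that contains a backslash. The encode('ascii') step is only there to reproduce the Python 2 idiom under Python 3, where str has no decode() method:

```python
import json

# Invented sample value containing a backslash, e.g. part of a Windows path.
original = {"path": "a\\b"}

line = json.dumps(original)  # the backslash is escaped as \\ in the JSON text

# Python 3 equivalent of the Python 2 idiom line.decode("unicode_escape").
mangled = line.encode("ascii").decode("unicode_escape")

# The mangled text is still valid JSON, but \b now means "backspace".
restored = json.loads(mangled)

print(restored == original)
```

The file still parses, so nothing fails at write time; the data is just silently different when it is read back.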
