Unicode issue in Scrapy (Python)

Question

I have been searching on this topic for two hours and have tried a lot of solutions, but none worked in my case. Here is the code first:

import scrapy

class HamburgSpider(scrapy.Spider):
    name = 'hamburg'
    #allowed_domains = ['https://www.hamburg.de']
    start_urls = ['https://www.hamburg.de/branchenbuch/hamburg/10239785/n0/']
    custom_settings = {
        'FEED_EXPORT_FORMAT': 'utf-8'
    }

    def parse(self, response):
        #response=response.body.encode('utf-8')
        items = response.xpath("//div[starts-with(@class, 'item')]")
        for item in items:
            business_name = item.xpath(".//h3[@class='h3rb']/text()").get()
            address1 = item.xpath(".//div[@class='address']/p[@class='extra post']/text()[1]").get()
            address2 = item.xpath(".//div[@class='address']/p[@class='extra post']/text()[2]").get()
            phone = item.xpath(".//div[@class='address']/span[@class='extra phone']/text()").get()

            yield {
                'Business Name': business_name,
                'Address1': address1,
                'Address2': address2,
                'Phone Number': phone
            }

In the code I put this line:

custom_settings = {'FEED_EXPORT_FORMAT': 'utf-8'}

This line is supposed to deal with the encoding issue, but when exporting the results to CSV I found that the issue is still there. I simply need the text to appear as shown on the website, for example Poppenbütteler Bogen 29a. What I found is that the output is different.

Recommended answer

Your setting name is wrong.

FEED_EXPORT_FORMAT is not one of the settings Scrapy uses by default; you want FEED_EXPORT_ENCODING instead.
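
As a minimal sketch of that fix, the settings dictionary from the question could be rewritten as shown here. The dictionary is written on its own so the snippet runs without Scrapy installed; in the spider it would remain the custom_settings class attribute of HamburgSpider. The round-trip check at the end uses only the standard library and is an illustration added here, not something Scrapy requires.

```python
import codecs

# Corrected settings: FEED_EXPORT_ENCODING replaces the unrecognized
# FEED_EXPORT_FORMAT key used in the question.
custom_settings = {
    "FEED_EXPORT_ENCODING": "utf-8",
}

# Illustration only (not part of Scrapy): confirm that the encoding name
# is known to Python and that the sample address from the question
# survives an encode/decode round trip with it.
encoding = custom_settings["FEED_EXPORT_ENCODING"]
codec_name = codecs.lookup(encoding).name
sample = "Poppenbütteler Bogen 29a"
round_tripped = sample.encode(encoding).decode(encoding)

print(codec_name, round_tripped == sample)
```

In the spider itself, only the key inside custom_settings changes; the parse method from the question can stay as it is.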
