Getting AttributeError: 'str' object has no attribute 'get'


Problem description

I am getting an error while working with a JSON response:

Error: AttributeError: 'str' object has no attribute 'get'

What could be the issue?

I am also getting the following errors for the rest of the values:

TypeError: 'builtin_function_or_method' object is not subscriptable

'Phone': value['_source']['primaryPhone'], KeyError: 'primaryPhone'

# -*- coding: utf-8 -*-
import scrapy
import json


class MainSpider(scrapy.Spider):
    name = 'main'
    start_urls = ['https://experts.expcloud.com/api4/std?searchterms=AB&size=216&from=0']

    def parse(self, response):

        resp = json.loads(response.body)
        values = resp['hits']['hits']

        for value in values:

            yield {
                'Full Name': value['_source']['fullName'],
                'Phone': value['_source']['primaryPhone'],
                "Email": value['_source']['primaryEmail'],
                "City": value.get['_source']['city'],
                "Zip Code": value.get['_source']['zipcode'],
                "Website": value['_source']['websiteURL'],
                "Facebook": value['_source']['facebookURL'],
                "LinkedIn": value['_source']['LinkedIn_URL'],
                "Twitter": value['_source']['Twitter'],
                "BIO": value['_source']['Bio']
            }

Answer

It's nested deeper than what you think it is. That's why you're getting the error.

import scrapy
import json


class MainSpider(scrapy.Spider):
    name = 'test'
    start_urls = ['https://experts.expcloud.com/api4/std?searchterms=AB&size=216&from=0']

    def parse(self, response):
        resp = json.loads(response.body)
        values = resp['hits']['hits']

        for value in values:
            yield {
                'Full Name': value['_source']['fullName'],
                'Primary Phone':value['_source']['primaryPhone']
            }

Explanation

The resp variable is a Python dictionary, but there is no resp['hits']['hits']['fullName'] in this JSON data. The data you're looking for, fullName, is actually at resp['hits']['hits'][i]['_source']['fullName'], i being a number, because resp['hits']['hits'] is a list.

resp['hits'] is a dictionary, so the values variable is fine. But resp['hits']['hits'] is a list, so you can't use .get on it, and it only accepts numbers as values within [], not strings. Hence the errors.

  1. Use response.json() instead of json.loads(response.body): since Scrapy v2.2, Scrapy has internal support for JSON. Behind the scenes it already imports json.

Also check the JSON data. I used requests for ease, and just kept going down the nesting until I got to the data you needed.
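One way to check the nesting is to pretty-print the response. A sketch with a stand-in sample (the real data would come from requests.get(url).json(), or response.json() inside the spider):

```python
import json

# Stand-in for the API response; the real one is much larger.
sample = {'hits': {'hits': [{'_source': {'fullName': 'Jane Doe',
                                         'primaryPhone': '555-0100'}}]}}

# Pretty-printing makes the nesting obvious before you write the spider.
print(json.dumps(sample, indent=2))

# Then walk down level by level until you reach the field you need:
print(sample['hits']['hits'][0]['_source']['fullName'])  # Jane Doe
```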

Yielding a dictionary is fine for this type of data as it's well structured, but for any other data that needs modifying or changing, or that is wrong in places, use either the Items dictionary or an ItemLoader. Those two ways of yielding output give you a lot more flexibility than yielding a plain dictionary. I almost never yield a dictionary; the only time is when you have highly structured data.

Updated code

Looking at the JSON data, there is quite a lot of missing data. This is part of web scraping: you will find errors like this. Here we use a try/except block for when we get a KeyError, which means Python hasn't been able to recognise the key associated with a value. We have to handle that exception, which we do here by yielding a string 'No XXX'.
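As a sketch of an alternative (not part of the original answer): dict.get with a default collapses each try/except pair into one line, since .get returns the default instead of raising KeyError:

```python
# A record with 'primaryPhone' missing, like many records in this data.
value = {'_source': {'fullName': 'Jane Doe'}}

source = value.get('_source', {})
full_name = source.get('fullName', 'No Name')
phone = source.get('primaryPhone', 'No Phone number')

print(full_name)  # Jane Doe
print(phone)      # No Phone number
```

Note that .get only covers missing dictionary keys; indexing into a possibly-empty list (e.g. activeLocations[0]) can still raise IndexError, which the try/except form can also catch.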

Once you start getting gaps like this, it's better to consider the Items dictionary or ItemLoaders.

Now it's worth looking at the Scrapy docs about Items. Essentially Scrapy does two things: it extracts data from websites, and it provides a mechanism for storing this data. The way it does this is by storing it in a dictionary called an Item. The code isn't much different from yielding a dictionary, but the Items dictionary allows you to manipulate the extracted data more easily with the extra things Scrapy can do. You need to edit your items.py first with the fields you want. We create a class called TestItem and define each field using scrapy.Field(). We can then import this class in our spider script.

import scrapy


class TestItem(scrapy.Item):
    # define the fields for your item here like:
    # name = scrapy.Field()
    full_name = scrapy.Field()
    Phone = scrapy.Field()
    Email = scrapy.Field()
    City = scrapy.Field()
    Zip_code = scrapy.Field()
    Website = scrapy.Field()
    Facebook = scrapy.Field()
    Linkedin = scrapy.Field()
    Twitter = scrapy.Field()
    Bio = scrapy.Field()

Here we're specifying what we want the fields to be. Unfortunately you can't use a string with spaces, hence why full name is full_name. scrapy.Field() creates the field of the item dictionary for us.

We import this item dictionary into our spider script with from ..items import TestItem. from ..items means we're taking items.py from the parent folder of the spider script, and we're importing the class TestItem. That way our spider can populate the item dictionary with our JSON data.

Note that just before the for loop we instantiate the TestItem class with item = TestItem(). Instantiating means calling the class, which in this case makes a dictionary. This means we're creating the item dictionary and then populating it with keys and values. You have to do this before you add your keys and values, as you can see within the for loop.

import scrapy
from ..items import TestItem


class MainSpider(scrapy.Spider):
    name = 'test'
    start_urls = ['https://experts.expcloud.com/api4/std?searchterms=AB&size=216&from=0']

    def parse(self, response):
        values = response.json()['hits']['hits']
        item = TestItem()
        for value in values:
            try:
                item['full_name'] = value['_source']['fullName']
            except KeyError:
                item['full_name'] = 'No Name'
            try:
                item['Phone'] = value['_source']['primaryPhone']
            except KeyError:
                item['Phone'] = 'No Phone number'
            try:
                item['Email'] = value['_source']['primaryEmail']
            except KeyError:
                item['Email'] = 'No Email'
            try:
                item['City'] = value['_source']['activeLocations'][0]['city']
            except KeyError:
                item['City'] = 'No City'
            try:
                item['Zip_code'] = value['_source']['activeLocations'][0]['zipcode']
            except KeyError:
                item['Zip_code'] = 'No Zip code'
            try:
                item['Website'] = value['_source']['AgentMarketingCenter'][0]['Website']
            except KeyError:
                item['Website'] = 'No Website'
            try:
                item['Facebook'] = value['_source']['AgentMarketingCenter'][0]['Facebook_URL']
            except KeyError:
                item['Facebook'] = 'No Facebook'
            try:
                item['Linkedin'] = value['_source']['AgentMarketingCenter'][0]['LinkedIn_URL']
            except KeyError:
                item['Linkedin'] = 'No Linkedin'
            try:
                item['Twitter'] = value['_source']['AgentMarketingCenter'][0]['Twitter']
            except KeyError:
                item['Twitter'] = 'No Twitter'
            try:
                item['Bio'] = value['_source']['AgentMarketingCenter'][0]['Bio']
            except KeyError:
                item['Bio'] = 'No Bio'

            yield item
