Scraping a JSON response with Scrapy

Question

How do you use Scrapy to scrape web requests that return JSON? For example, the JSON would look like this:

{
    "firstName": "John",
    "lastName": "Smith",
    "age": 25,
    "address": {
        "streetAddress": "21 2nd Street",
        "city": "New York",
        "state": "NY",
        "postalCode": "10021"
    },
    "phoneNumber": [
        {
            "type": "home",
            "number": "212 555-1234"
        },
        {
            "type": "fax",
            "number": "646 555-4567"
        }
    ]
}

I want to scrape specific items (e.g. the name and the fax number in the JSON above) and save them to CSV.

Recommended answer

It works the same way as using Scrapy's HtmlXPathSelector for HTML responses. The only difference is that you parse the response body with the json module instead of a selector:

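import json

import scrapy


class MySpider(scrapy.Spider):
    ...

    def parse(self, response):
        # The body is JSON, so decode it with the json module.
        jsonresponse = json.loads(response.text)

        # MyItem is the spider's own Item class, defined elsewhere
        # with a "firstName" field.
        item = MyItem()
        item["firstName"] = jsonresponse["firstName"]

        return item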

Hope that helps.
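The question also asks for the name and the fax number saved to CSV, which the snippet above does not cover. The following is a minimal sketch of that step using only the standard library, so it can be tried without a running spider. The helper names `extract_row` and `rows_to_csv` are hypothetical, and joining `firstName` and `lastName` into one `name` column is an assumption about what "name" means in the question.

```python
import csv
import io
import json

# Sample payload copied from the question.
SAMPLE = """
{
    "firstName": "John",
    "lastName": "Smith",
    "age": 25,
    "address": {
        "streetAddress": "21 2nd Street",
        "city": "New York",
        "state": "NY",
        "postalCode": "10021"
    },
    "phoneNumber": [
        {"type": "home", "number": "212 555-1234"},
        {"type": "fax", "number": "646 555-4567"}
    ]
}
"""


def extract_row(payload):
    """Decode the JSON text and return the name and fax number as a flat dict."""
    data = json.loads(payload)
    # phoneNumber is a list of {"type", "number"} objects; take the first
    # entry whose type is "fax", or an empty string if there is none.
    fax = next(
        (p["number"] for p in data.get("phoneNumber", []) if p.get("type") == "fax"),
        "",
    )
    return {
        # Assumption: "name" means first and last name joined by a space.
        "name": f'{data["firstName"]} {data["lastName"]}',
        "fax": fax,
    }


def rows_to_csv(rows):
    """Serialize a list of row dicts to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "fax"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(rows_to_csv([extract_row(SAMPLE)]))
```

Inside a spider, `parse` could simply `yield extract_row(response.text)`, and running `scrapy crawl <spider name> -o output.csv` would then let Scrapy's feed exports write the CSV file, so `rows_to_csv` is only needed outside Scrapy. On Scrapy 2.2 or later, `response.json()` can stand in for `json.loads(response.text)`.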
