Scraping a JSON response with Scrapy
Question
How do you use Scrapy to scrape web requests that return JSON? For example, the JSON would look like this:
{
    "firstName": "John",
    "lastName": "Smith",
    "age": 25,
    "address": {
        "streetAddress": "21 2nd Street",
        "city": "New York",
        "state": "NY",
        "postalCode": "10021"
    },
    "phoneNumber": [
        {
            "type": "home",
            "number": "212 555-1234"
        },
        {
            "type": "fax",
            "number": "646 555-4567"
        }
    ]
}
I would like to scrape specific items (e.g. name and fax in the above) and save them to CSV.
Recommended answer
It works the same way as using Scrapy's HtmlXPathSelector for HTML responses. The only difference is that you should use the json module to parse the response:
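import json
import scrapy

class MyItem(scrapy.Item):
    firstName = scrapy.Field()

class MySpider(scrapy.Spider):  # called BaseSpider in older Scrapy releases
    ...
    def parse(self, response):
        # Decode the JSON body into a plain Python dict
        jsonresponse = json.loads(response.text)
        item = MyItem()
        item["firstName"] = jsonresponse["firstName"]
        return item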
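The answer only pulls out firstName, while the question also asks for the fax number and CSV output. The following is a minimal sketch of that part using only the standard library, so it can be tried without a running spider. The helper names extract_row and rows_to_csv are hypothetical, not part of Scrapy, and the sketch assumes the JSON always has the shape shown in the question.

```python
import csv
import io
import json


def extract_row(data):
    """Pull the name fields and the fax number out of the decoded JSON dict."""
    # phoneNumber is a list of {"type": ..., "number": ...} objects;
    # take the first entry whose type is "fax", or None if there is none.
    fax = next(
        (p.get("number") for p in data.get("phoneNumber", []) if p.get("type") == "fax"),
        None,
    )
    return {
        "firstName": data.get("firstName"),
        "lastName": data.get("lastName"),
        "fax": fax,
    }


def rows_to_csv(rows, fieldnames=("firstName", "lastName", "fax")):
    """Serialize a list of row dicts to CSV text, header included."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(fieldnames))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Sample payload copied from the question (address omitted for brevity).
sample = json.loads("""
{
    "firstName": "John",
    "lastName": "Smith",
    "age": 25,
    "phoneNumber": [
        {"type": "home", "number": "212 555-1234"},
        {"type": "fax", "number": "646 555-4567"}
    ]
}
""")

row = extract_row(sample)
print(rows_to_csv([row]))
```

Inside a spider, parse could return extract_row(json.loads(response.text)) instead of filling the item by hand. In that case Scrapy's built-in feed export can write the CSV for you, for example by running scrapy crawl with -o output.csv, so rows_to_csv is only needed outside of Scrapy.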
Hope that helps.