Scraping an AJAX page with Scrapy?
Question
I'm using Scrapy to scrape data from this page:
https://www.bricoetloisirs.ch/magasins/gardena
The product list is loaded dynamically. I found the URL that the page requests to get the products (it is the one in start_urls in my code).
But when I request that URL with Scrapy, the response is an empty page that contains only this:
<span class="pageSizeInformation" id="page0" data-page="0" data-pagesize="12">Page: 0 / Size: 12</span>
Here is my code:
# -*- coding: utf-8 -*-
import scrapy

from v4.items import Product


class GardenaCoopBricoLoisirsSpider(scrapy.Spider):
    name = "Gardena_Coop_Brico_Loisirs_py"
    # Requests the AJAX "nextPage" endpoint directly.
    start_urls = [
        'https://www.bricoetloisirs.ch/coop/ajax/nextPage/(cpgnum=1&layout=7.01-14_180_69_164_182&uiarea=2&carea=%24ROOT&fwrd=frwd0&cpgsize=12)/.do?page=2&_=1473841539272'
    ]

    def parse(self, response):
        # Dump the raw response to see what the server returned.
        print(response.body)
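Recommended answer
I solved the problem by starting from the listing page itself and building the paginated URLs from response.url: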
# -*- coding: utf-8 -*-
import scrapy

from v4.items import Product


class GardenaCoopBricoLoisirsSpider(scrapy.Spider):
    name = "Gardena_Coop_Brico_Loisirs_py"
    start_urls = [
        'https://www.bricoetloisirs.ch/magasins/gardena'
    ]

    def parse(self, response):
        # Request pages 1 to 49 relative to the listing page URL.
        for page in range(1, 50):
            url = response.url + '/.do?page=%s&_=1473841539272' % page
            yield scrapy.Request(url, callback=self.parse_page)

    def parse_page(self, response):
        # Each response should contain the product markup for one page.
        print(response.body)
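The URL-building step in parse can also be pulled out into a small helper, which makes it easier to check the generated URLs without running a crawl. The following is only a sketch: the function name build_page_urls and its parameters are hypothetical, and it assumes that the `_` query parameter is a millisecond timestamp used purely to defeat caching, which the answer does not confirm. The `/.do?page=N&_=T` pattern and the 1 to 49 page range are taken from the code above.

```python
import time


def build_page_urls(base_url, first_page=1, last_page=49, cache_buster=None):
    """Return the paginated AJAX URLs for a listing page.

    The '/.do?page=N&_=T' pattern is copied from the answer above; the
    function name and its parameters are hypothetical.
    """
    if cache_buster is None:
        # Assumption: '_' is just a millisecond timestamp that defeats caching,
        # so a fresh value should work as well as the hard-coded one.
        cache_buster = int(time.time() * 1000)
    base = base_url.rstrip('/')
    return [
        '%s/.do?page=%d&_=%d' % (base, page, cache_buster)
        for page in range(first_page, last_page + 1)
    ]


# Reproduce the first three URLs the answer's spider would request.
urls = build_page_urls('https://www.bricoetloisirs.ch/magasins/gardena',
                       last_page=3, cache_buster=1473841539272)
for url in urls:
    print(url)
```

Inside the spider, parse could then loop over build_page_urls(response.url) and yield one scrapy.Request per URL with callback=self.parse_page, as in the answer. The upper bound of 49 pages is a guess carried over from the answer, so pages beyond the real last page may simply come back empty.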