Not able to follow link using Scrapy
Question
I am not able to follow the link and get back the values.

With the code below I am able to crawl the first link, but after that it does not redirect to the second follow-link callback (the function never runs).
from scrapy.spider import BaseSpider
from scrapy.selector import HtmlXPathSelector
from scrapy.http.request import Request


class ScrapyOrgSpider(BaseSpider):
    name = "scrapy"
    allowed_domains = ["example.com"]
    start_urls = ["http://www.example.com/abcd"]

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        res1 = Request("http://www.example.com/follow", self.a_1)
        print res1

    def a_1(self, response1):
        hxs2 = HtmlXPathSelector(response1)
        print hxs2.select("//a[@class='channel-link']").extract()[0]
        return response1
Answer
The parse function must return the request, not just print it.
    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        res1 = Request("http://www.example.com/follow", callback=self.a_1)
        print res1  # if you want
        return res1
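To see why the missing return matters: Scrapy's engine only schedules requests that parse hands back to it, so a Request that is merely printed never reaches the scheduler and its callback never fires. The sketch below models that behavior in plain Python — the `Request` class and `engine` function here are illustrative mocks, not Scrapy's real API:

```python
# Toy model of Scrapy's callback chain (no Scrapy required).
# "Request" and "engine" are illustrative stand-ins, not Scrapy's API.

class Request:
    def __init__(self, url, callback=None):
        self.url = url
        self.callback = callback


def engine(spider, start_url):
    """Follow the chain of Requests a spider returns, as a scheduler would."""
    crawled = []
    result = spider.parse(start_url)
    while isinstance(result, Request):
        crawled.append(result.url)
        # Invoke the callback; whatever it returns drives the next step.
        result = result.callback(result.url) if result.callback else None
    return crawled


class BrokenSpider:
    def parse(self, response):
        req = Request("http://www.example.com/follow", callback=self.a_1)
        print(req.url)  # printing alone never hands the request to the engine

    def a_1(self, response):
        return None


class FixedSpider(BrokenSpider):
    def parse(self, response):
        req = Request("http://www.example.com/follow", callback=self.a_1)
        return req  # a returned request gets scheduled, so a_1 is called


print(engine(BrokenSpider(), "http://www.example.com/abcd"))  # []
print(engine(FixedSpider(), "http://www.example.com/abcd"))
# ['http://www.example.com/follow']
```

In current Scrapy versions you would typically `yield` the request from `parse` instead of returning it; either way, the point is the same: the request must be handed back to the engine, not just printed.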