How to handle 302 redirect in Scrapy

Views: 181 Published: 2021/6/25 20:31:24 Tags: python scrapy http-status-code-302

Question

I am receiving a 302 response from a server while scraping a website:

2014-04-01 21:31:51+0200 [ahrefs-h] DEBUG: Redirecting (302) to <GET http://www.domain.com/Site_Abuse/DeadEnd.htm> from <GET http://domain.com/wps/showmodel.asp?Type=15&make=damc&a=664&b=51&c=0>

I want to send the request to the original GET URL instead of being redirected. I found this middleware:

https://github.com/scrapy/scrapy/blob/master/scrapy/contrib/downloadermiddleware/redirect.py#L31

I added this redirect code to my middleware.py file and added the following to settings.py:

DOWNLOADER_MIDDLEWARES = {
 'street.middlewares.RandomUserAgentMiddleware': 400,
 'street.middlewares.RedirectMiddleware': 100,
 'scrapy.contrib.downloadermiddleware.useragent.UserAgentMiddleware': None,
}

But I am still getting redirected. Is that all I have to do to get this middleware working? Am I missing something?

Recommended answer

Forget about middlewares in this scenario; this will do the trick:

meta = {'dont_redirect': True,'handle_httpstatus_list': [302]}

That said, you will need to include the meta parameter when you yield your request:

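# Inside a spider callback; needs `from scrapy import Request`.
yield Request(item['link'], meta={
                  'dont_redirect': True,            # do not follow the redirect
                  'handle_httpstatus_list': [302],  # pass 302 responses to the callback
              }, callback=self.your_callback)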
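
Once the 302 is no longer followed, the callback receives the 302 response itself, so it has to decide what to do with it. The following is only a minimal sketch of that step, not part of the original answer: `NO_REDIRECT_META` and `describe_response` are hypothetical names, and the helper assumes the response object looks like a Scrapy response (a `.status`, a `.url`, and a `.headers` mapping whose values are bytes). It uses only the standard library so it can be tried without a running crawl.

```python
from urllib.parse import urljoin

# Meta keys from the answer: keep the 302 instead of following it.
NO_REDIRECT_META = {
    "dont_redirect": True,
    "handle_httpstatus_list": [302],
}


def describe_response(response):
    """Summarise a response received by the callback (hypothetical helper).

    `response` is assumed to expose `.status`, `.url` and a `.headers`
    mapping with bytes values, as a Scrapy response does.
    """
    if response.status == 302:
        location = response.headers.get("Location")
        if isinstance(location, bytes):
            location = location.decode("latin-1")
        # Resolve a relative Location header against the requested URL.
        target = urljoin(response.url, location) if location else None
        return {"url": response.url, "redirected": True, "target": target}
    return {"url": response.url, "redirected": False, "target": None}
```

In a spider this might be used by passing `meta=dict(NO_REDIRECT_META)` to `Request` and yielding `describe_response(response)` from `your_callback`, so that redirected URLs are recorded rather than silently replaced by the redirect target.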
