Scrapy response different from browser response
Question
I am trying to scrape this page with Scrapy:
http://www.barnesandnoble.com/s?dref=4815&sort=SA&startat=7391
and the response I get is different from what I see in the browser. The browser shows the correct page, while the Scrapy response is the
http://www.barnesandnoble.com/s?dref=4815&sort=SA&startat=1
page. I have also tried urllib2, but I get the same result. Any help is much appreciated.
Recommended answer
I don't really understand the issue, but a different response for a browser and Scrapy is usually caused by one of these:
- the server analyzes your User-Agent header and returns a specially crafted page for mobile clients or bots;
- the server analyzes the cookies and does something special when it looks like you are visiting for the first time;
- you are trying to make a POST request via Scrapy like the browser does, but you forgot some form fields or put in wrong values;
- etc.
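The first two causes can be tested by sending browser-like headers and keeping cookies between requests. The following is a minimal sketch using only the Python 3 standard library (`urllib.request` replaces the `urllib2` mentioned in the question). The User-Agent string and the helper names `build_request`, `build_opener_with_cookies`, and `fetch` are assumptions chosen for illustration, and there is no guarantee that this particular site responds to them.

```python
import http.cookiejar
import urllib.request

TARGET_URL = "http://www.barnesandnoble.com/s?dref=4815&sort=SA&startat=7391"

# Assumed example values: copy the real ones from your browser's network tab.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
    ),
    "Accept": "text/html,application/xhtml+xml",
    "Accept-Language": "en-US,en;q=0.9",
}


def build_request(url, headers=BROWSER_HEADERS):
    """Create a request that carries browser-like headers (hypothetical helper)."""
    return urllib.request.Request(url, headers=dict(headers))


def build_opener_with_cookies():
    """Create an opener that stores cookies and sends them on later requests."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def fetch(url, opener):
    """Return (final_url, body); final_url differs from url after a redirect."""
    with opener.open(build_request(url), timeout=30) as response:
        return response.geturl(), response.read()


# Usage (performs real network requests, so it is left commented out):
# opener, jar = build_opener_with_cookies()
# fetch("http://www.barnesandnoble.com/", opener)   # first visit collects cookies
# final_url, body = fetch(TARGET_URL, opener)       # second visit sends them back
# print(final_url)
```

In Scrapy itself, the same idea can be expressed with the `headers` argument of `scrapy.Request` or the `USER_AGENT` setting, with cookies handled by the built-in cookies middleware while `COOKIES_ENABLED` is on.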
There is no universal way to determine what is wrong, because it depends on the server-side logic, which you cannot see. If you are lucky, checking and fixing each of the issues mentioned above will make it work.
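A reasonable first step in that analysis is to pin down exactly how the returned URL differs from the requested one. The sketch below compares the query strings of two URLs; the helper name `changed_query_params` is hypothetical, and the two URLs are the ones from the question.

```python
from urllib.parse import parse_qsl, urlsplit


def changed_query_params(requested_url, final_url):
    """Map each query parameter that differs to (requested_value, final_value)."""
    requested = dict(parse_qsl(urlsplit(requested_url).query))
    final = dict(parse_qsl(urlsplit(final_url).query))
    changed = {}
    for key in sorted(set(requested) | set(final)):
        if requested.get(key) != final.get(key):
            # A value of None means the parameter is absent on that side.
            changed[key] = (requested.get(key), final.get(key))
    return changed


requested = "http://www.barnesandnoble.com/s?dref=4815&sort=SA&startat=7391"
returned = "http://www.barnesandnoble.com/s?dref=4815&sort=SA&startat=1"
print(changed_query_params(requested, returned))
# → {'startat': ('7391', '1')}
```

Here only `startat` changes, which suggests the server is resetting the pagination offset rather than serving an entirely different page.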