Scrapy response different than browser response

Question

I am trying to scrape this page with Scrapy:

http://www.barnesandnoble.com/s?dref=4815&sort=SA&startat=7391

and the response I get is different from what I see in the browser. The browser shows the correct page, while the Scrapy response is the:

http://www.barnesandnoble.com/s?dref=4815&sort=SA&startat=1

page. I have tried with urllib2 but still have the same issue. Any help is much appreciated.

Recommended answer

I don't really understand the issue, but usually a different response for a browser and Scrapy is caused by one of these:


  • the server analyzes your User-Agent header, and returns a specially crafted page for mobile clients or bots;
  • the server analyzes the cookies, and does something special when it looks like you are visiting for the first time;
  • you are trying to make a POST request via Scrapy like the browser does, but you forgot some form fields or put in wrong values;
  • etc.
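
As a minimal sketch of how the first two causes might be ruled out, the snippet below builds an HTTP client that sends a browser-style User-Agent and keeps cookies between requests. It uses Python 3's standard `urllib.request` (the successor of the `urllib2` module mentioned in the question) rather than Scrapy itself. The User-Agent string and the helper names `build_browser_like_opener` and `fetch` are assumptions made for illustration; whether this changes what the server returns depends entirely on the server. In Scrapy, the corresponding knobs are the `USER_AGENT` and `COOKIES_ENABLED` settings.

```python
import http.cookiejar
import urllib.request

# Assumption: an example desktop User-Agent string; substitute the one
# your own browser sends (visible in its developer tools).
BROWSER_USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)


def build_browser_like_opener(user_agent=BROWSER_USER_AGENT):
    """Return (opener, cookie_jar): the opener sends a browser-style
    User-Agent and stores cookies in cookie_jar across requests."""
    cookie_jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(cookie_jar)
    )
    # Replace the default "Python-urllib/x.y" User-Agent header.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener, cookie_jar


def fetch(opener, url, timeout=30):
    """Fetch url and return (final_url, body_bytes); final_url differs
    from url if the server redirected the request."""
    with opener.open(url, timeout=timeout) as response:
        return response.geturl(), response.read()
```

One way to use it is to call `fetch` on the site's first results page so that any first-visit cookies land in the jar, then call `fetch` again with the same opener on the page you actually want and compare the returned `final_url` with the one you requested.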

There is no universal way to determine what's wrong, because it depends on the server logic, which you don't know. If you are lucky, you will analyze and fix all the mentioned issues and will make it work.
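
A first diagnostic step can at least show what the server changed. The sketch below compares the query parameters of the URL you requested with those of the URL the response actually came from (in Scrapy this would be `response.url`). The function name `changed_query_params` is hypothetical, and the comparison only reveals that a parameter was rewritten, not why.

```python
from urllib.parse import parse_qsl, urlsplit


def changed_query_params(requested_url, final_url):
    """Return {name: (requested_value, final_value)} for every query
    parameter whose value differs between the two URLs. A parameter
    missing from one side is reported with None on that side."""
    requested = dict(parse_qsl(urlsplit(requested_url).query))
    final = dict(parse_qsl(urlsplit(final_url).query))
    changed = {}
    for name in sorted(set(requested) | set(final)):
        if requested.get(name) != final.get(name):
            changed[name] = (requested.get(name), final.get(name))
    return changed


# The two URLs from the question: only "startat" was reset by the server.
print(changed_query_params(
    "http://www.barnesandnoble.com/s?dref=4815&sort=SA&startat=7391",
    "http://www.barnesandnoble.com/s?dref=4815&sort=SA&startat=1",
))
# prints {'startat': ('7391', '1')}
```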
