Scrapy response different than browser response

Question

I am trying to scrape this page with Scrapy:

http://www.barnesandnoble.com/s?dref=4815&sort=SA&startat=7391

The response I get is different from what I see in the browser. The browser shows the correct page, while the Scrapy response is the page at:

http://www.barnesandnoble.com/s?dref=4815&sort=SA&startat=1

I have also tried urllib2 and still have the same issue. Any help is much appreciated.

Recommended answer

I don't fully understand the issue, but a different response for a browser and for Scrapy is usually caused by one of these:

  • the server inspects your User-Agent header and returns a specially crafted page for mobile clients or bots;
  • the server inspects the cookies and does something special when it looks like you are visiting for the first time;
  • you are trying to make a POST request via Scrapy like the browser does, but you forgot some form fields or supplied wrong values;
  • etc.
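
As a minimal sketch of how the three causes above map onto a request, the snippet below builds (but does not send) a request that carries a browser-like User-Agent, optional cookies, and optional form fields. It uses only the Python 3 standard library. The helper name `build_browser_like_request`, the `BROWSER_UA` string, and the example cookie and form values are assumptions for illustration, not values known to work for this site.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: an example browser-like User-Agent string; copy the real one
# from your own browser's developer tools.
BROWSER_UA = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
)


def build_browser_like_request(url, cookies=None, form_data=None):
    """Build a request object that mimics a browser; nothing is sent here."""
    headers = {
        "User-Agent": BROWSER_UA,
        "Accept": "text/html,application/xhtml+xml",
    }
    if cookies:
        # Cookies are sent as a single "name=value; name=value" header.
        headers["Cookie"] = "; ".join(f"{k}={v}" for k, v in cookies.items())
    # Supplying a body turns the request into a POST with form-encoded fields.
    data = urlencode(form_data).encode("ascii") if form_data else None
    return Request(url, data=data, headers=headers)


if __name__ == "__main__":
    url = "http://www.barnesandnoble.com/s?dref=4815&sort=SA&startat=7391"
    req = build_browser_like_request(url, cookies={"session": "abc"})
    print(req.get_method(), req.header_items())
```

In Scrapy itself, the corresponding knobs are the `USER_AGENT` setting, the `headers` and `cookies` arguments of `scrapy.Request`, and `scrapy.FormRequest` with `formdata` for POST forms. Which of them matters here depends on the server's behavior.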

There is no universal way to determine what's wrong, because it depends on the server logic, which you don't know. If you are lucky, you will be able to analyze and fix each of the issues mentioned above and make it work.
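
One small diagnostic step that can help, sketched here under the assumption that you know the URL your scraper finally ended up on (for example `response.url` in Scrapy), is to compare the query parameters of the requested and final URLs. The helper name `changed_query_params` is hypothetical.

```python
from urllib.parse import parse_qsl, urlsplit


def changed_query_params(requested_url, final_url):
    """Return {param: (requested_value, final_value)} for params that differ."""
    requested = dict(parse_qsl(urlsplit(requested_url).query))
    final = dict(parse_qsl(urlsplit(final_url).query))
    keys = sorted(set(requested) | set(final))
    return {
        key: (requested.get(key), final.get(key))
        for key in keys
        if requested.get(key) != final.get(key)
    }


if __name__ == "__main__":
    wanted = "http://www.barnesandnoble.com/s?dref=4815&sort=SA&startat=7391"
    got = "http://www.barnesandnoble.com/s?dref=4815&sort=SA&startat=1"
    print(changed_query_params(wanted, got))
```

For the two URLs in the question, only `startat` differs, which suggests the server reset the paging offset rather than rejecting the request outright. That narrows the search but does not by itself say which header, cookie, or form field triggered it.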
