Scraping Ajax with Python

Question

I have been practicing my scraping skills in Python. I have gotten pretty good, but I came across a few sites that have me stumped. They use Ajax to find nearby locations, and several of them are designed the same way. One of them is www.applebees.com. Even using Firebug, I cannot find the answer.

How can Python request the locations via the Ajax call? I am completely stumped.

The page is www.applebees.com. There is a form on the right-hand side where you enter a zip code, and it pulls up the locations closest to that zip code. However, if I view the source after entering the zip code, the locations still don't show up in it. The request and response happen entirely over Ajax and never appear in the HTML source; I have never seen anything like it. I am trying to research a solution now.

Recommended answer

Scraping programmatically with an HTTP library can be difficult for some sites. If you are trying to simulate user interaction on a JavaScript-heavy site (Ajax or otherwise), you might consider driving a real browser using something like Selenium. There are Python client bindings, and you will get access to the page DOM.

http://pypi.python.org/pypi/selenium
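
As a rough illustration of the Selenium approach, here is a minimal sketch using the Selenium 4 Python bindings. The CSS selectors (`input[name='zip']` and `.location-result`) and the function names are assumptions made up for this example, not taken from the actual site; you would need to inspect the live page to find the real ones. It also assumes Firefox is installed, and depending on your Selenium version a driver such as geckodriver may need to be on your PATH.

```python
def clean_location_texts(texts):
    """Strip surrounding whitespace and drop empty entries from scraped text."""
    return [t.strip() for t in texts if t and t.strip()]


def scrape_locations(zipcode,
                     url="http://www.applebees.com",
                     input_selector="input[name='zip']",   # hypothetical selector
                     result_selector=".location-result",   # hypothetical selector
                     timeout=10):
    """Drive a real browser, submit a zip code, and return the result text.

    The selectors above are placeholders; replace them with the ones the
    page actually uses.
    """
    # Imported here so the helper above can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.keys import Keys
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Firefox()
    try:
        driver.get(url)

        # Type the zip code into the search box and press Enter.
        box = driver.find_element(By.CSS_SELECTOR, input_selector)
        box.send_keys(zipcode + Keys.RETURN)

        # Wait until the Ajax response has been rendered into the DOM.
        results = WebDriverWait(driver, timeout).until(
            EC.presence_of_all_elements_located(
                (By.CSS_SELECTOR, result_selector)
            )
        )
        return clean_location_texts(el.text for el in results)
    finally:
        # Always close the browser, even if a lookup or wait fails.
        driver.quit()
```

Calling `scrape_locations("10001")` would then return the text of each matching element, provided the selectors match the page. The explicit wait is the important part: the locations only exist in the DOM after the Ajax call completes, which is why they never show up in the static HTML source.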
