PHP crawler for a website with Ajax content and HTTPS


Question

I'm trying to grab the content of a website that loads its data via Ajax and is served over HTTPS, but with no luck.

Is this possible?

The website I'm trying to crawl is:

https://www.bet3000.com/en/html/home.html#!https://www.bet3000.com/html/en/eventssportsbook.html?category_id=2117

Thanks.

Recommended answer

If you take a look at the HTTP requests this page makes (using, for example, Firebug for Firefox), you'll notice that it issues several Ajax requests.

Instead of trying to execute the JavaScript code, a possible solution is to request one of those URLs directly and get the data from it; that way you would not have to parse the HTML either.

In this specific case, one of those requests is made to the following URL:

https://www.bet3000.com/ajax/en/sportsbook.json.html?category_id=2117&offset=&live=&sportsbook_id=0
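Requesting such a URL from PHP could look roughly like the sketch below, which uses the cURL extension. The helper name `fetchUrl()`, the timeout, the user-agent string, and the `X-Requested-With` header are all assumptions made for illustration; nothing in the answer says this particular site requires them.

```php
<?php
// Minimal sketch: fetch the raw body of an HTTPS URL with PHP's cURL extension.
// fetchUrl() is a hypothetical helper name, not part of the original answer.

function fetchUrl(string $url, int $timeoutSeconds = 10): string
{
    $ch = curl_init($url);
    if ($ch === false) {
        throw new RuntimeException('Could not initialise cURL');
    }

    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,   // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,   // follow HTTP redirects
        CURLOPT_CONNECTTIMEOUT => $timeoutSeconds,
        CURLOPT_TIMEOUT        => $timeoutSeconds,
        CURLOPT_SSL_VERIFYPEER => true,   // keep certificate checks on for HTTPS
        CURLOPT_SSL_VERIFYHOST => 2,
        // Assumption: a browser-like user agent and the usual Ajax header.
        // Whether this site checks either of them is not known.
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; ExampleCrawler/1.0)',
        CURLOPT_HTTPHEADER     => ['X-Requested-With: XMLHttpRequest'],
    ]);

    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('Request failed: ' . curl_error($ch));
    }

    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    if ($status >= 400) {
        throw new RuntimeException("Server answered with HTTP status $status");
    }

    return $body;
}
```

Calling `fetchUrl()` with the URL above should hand back the raw response body as a string, provided the endpoint still answers the way it did when the answer was written.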

This URL seems to return some JSON data, which should interest you quite a bit ;-)
(There are a few characters before and after the JSON that will need to be removed but, aside from that, everything looks fine.)
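One way to strip the surrounding characters and decode the payload is sketched below. The answer does not say which characters wrap the JSON, so the helper `extractJson()` is hypothetical and simply keeps everything from the first `{` or `[` to the last matching closing character; the sample string is made up for illustration.

```php
<?php
// Minimal sketch: pull the JSON payload out of a response that has a few
// extra characters before and after it, then decode it.
// extractJson() is a hypothetical helper name.

function extractJson(string $raw): array
{
    // strcspn() gives the length of the leading part that contains
    // neither '{' nor '[', i.e. the offset of the first such character.
    $start = strcspn($raw, '{[');
    if ($start === strlen($raw)) {
        throw new InvalidArgumentException('No JSON object or array found');
    }

    // The payload ends at the last closing character of the same kind.
    $close = $raw[$start] === '{' ? '}' : ']';
    $end = strrpos($raw, $close);
    if ($end === false || $end < $start) {
        throw new InvalidArgumentException('Unterminated JSON payload');
    }

    $json = substr($raw, $start, $end - $start + 1);

    // Decode objects as associative arrays.
    $data = json_decode($json, true);
    if (!is_array($data)) {
        throw new InvalidArgumentException('Invalid JSON: ' . json_last_error_msg());
    }

    return $data;
}

// Example with a made-up wrapper; the real response may look different.
$sample = '({"events":[{"id":1,"name":"Example match"}],"total":1});';
$data = extractJson($sample);
echo $data['events'][0]['name'], "\n";
```

This trims by position rather than by a fixed prefix and suffix, so it keeps working if the wrapper changes slightly. It would fail if the trailing characters themselves contained a `}` or `]`.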
