cURL request on a page requiring JavaScript support


Problem description

I need to get the HTML source of pinnaclesports.com. The problem is that it detects whether cookies and JS are enabled, and if they are not, it just returns a page saying:

This site requires JavaScript and Cookies to be enabled. Please change your browser settings or upgrade your browser.

Is there any way to spoof JS support when using cURL?

EDIT: I can use a headless browser that either runs as a Perl/Ruby module or is written in PHP.

Solution

I figured out that if you make a cookie-less request, the site returns a page that uses JavaScript to set a cookie. That is the page you are getting with curl.

Then make another curl call that sends that cookie, like this:

curl https://www.pinnaclesports.com/ --cookie "YPF8827340282Jdskjhfiw_928937459182JAX666=122.167.231.139"

In other words, you have to make two calls:
1) Make a cookie-less call, read the response, and use a regex to find the cookie name.
2) Make a second request with that cookie set.

That will solve your problem.
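The two calls above can be sketched as a small shell script. This is only an illustration under stated assumptions: the exact JavaScript the site returns is not shown in this thread, so the sketch assumes the cookie is set with a literal string assignment such as `document.cookie = "NAME=VALUE; path=/";`. The function names `extract_cookie` and `fetch_with_js_cookie` are hypothetical helpers, not part of curl. If the page computes the cookie value in JavaScript rather than embedding it literally, a regex will not be enough.

```shell
# Hypothetical helper: read HTML on stdin and print the first NAME=VALUE
# pair found in a document.cookie assignment.
# ASSUMPTION: the page contains a literal assignment such as
#   document.cookie = "NAME=VALUE; path=/";
extract_cookie() {
  sed -n "s/.*document\.cookie[[:space:]]*=[[:space:]]*[\"']\([^;\"']*\).*/\1/p" | head -n 1
}

# Hypothetical helper: perform the two calls described above.
fetch_with_js_cookie() {
  url=$1
  # Call 1: cookie-less request; extract the cookie the page would set.
  cookie=$(curl -s "$url" | extract_cookie)
  if [ -z "$cookie" ]; then
    echo "no document.cookie assignment found" >&2
    return 1
  fi
  # Call 2: repeat the request, this time sending the cookie.
  curl -s --cookie "$cookie" "$url"
}
```

It could then be called as `fetch_with_js_cookie https://www.pinnaclesports.com/ > page.html`.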

Or just use YQL, with a query such as

select * from html where url="https://www.pinnaclesports.com/"

and point curl at that YQL query instead of at the site itself.
