page.open() function doesn't work properly for some URLs


Question

I am new to Node. I have written some code using Node and Phantom to scrape a website. It works for google.com but not for Facebook, which I assume is because Facebook internally makes ajax requests to other files to get its data.

var phantom = require('phantom');

phantom.create(function(ph) {
    return ph.createPage(function(page) {
        return page.open("https://facebook.com/", function(status) {
            if (status !== 'success') {
                console.log('Unable to load the url!');
                ph.exit();
            } else {
                setTimeout(function() {
                    return page.evaluate(function() {
                        return document.getElementsByTagName('body')[0].innerHTML;
                    }, function(result) {
                        console.log(result); // Log out the data.
                        ph.exit();
                    });
                }, 5000);
            }
        });
    });
});

So when I run my code against Facebook it reports that the URL could not be loaded, but against Google it prints the body of the response.

Can anybody tell me what changes I should make to get the result?

PhantomJS version: 1.9.0

Recommended answer

Pass command-line options to PhantomJS so that it uses TLSv1 instead of SSLv3, and optionally ignores SSL errors (--web-security=false might also help):

phantom.create('--ssl-protocol=tlsv1', '--ignore-ssl-errors=true', function(ph) {
    ...

This can be an issue because many sites removed SSLv3 support in response to the POODLE vulnerability, while PhantomJS versions before 1.9.8 default to SSLv3. The TLS handshake then fails and page.open() reports a status other than 'success'.
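The flags above can be combined with the code from the question roughly as follows. This is only a sketch: buildPhantomArgs and fetchBody are hypothetical helper names, not part of the phantom module, and it assumes the callback-style bridge API used in the question, where phantom.create accepts flag strings followed by a callback.

```javascript
// Hypothetical helper (not part of the phantom module): builds the
// PhantomJS command-line flags discussed in the answer.
function buildPhantomArgs(opts) {
    opts = opts || {};
    var args = ['--ssl-protocol=' + (opts.sslProtocol || 'tlsv1')];
    if (opts.ignoreSslErrors) {
        args.push('--ignore-ssl-errors=true');
    }
    if (opts.disableWebSecurity) {
        args.push('--web-security=false');
    }
    return args;
}

// Hypothetical wrapper: opens `url` through the given phantom module and
// passes either an Error or the body HTML to `done`.
// Assumes the callback-style API from the question:
//   phantom.create(flag1, flag2, ..., function (ph) { ... })
function fetchBody(phantom, url, args, done) {
    var createArgs = args.concat(function (ph) {
        ph.createPage(function (page) {
            page.open(url, function (status) {
                if (status !== 'success') {
                    ph.exit();
                    return done(new Error('Unable to load ' + url));
                }
                page.evaluate(function () {
                    // Runs inside the page context.
                    return document.body.innerHTML;
                }, function (result) {
                    ph.exit();
                    done(null, result);
                });
            });
        });
    });
    // Spread the flags plus the callback into phantom.create.
    phantom.create.apply(phantom, createArgs);
}

// Usage (requires the phantom module and a PhantomJS binary):
// var phantom = require('phantom');
// var args = buildPhantomArgs({ ignoreSslErrors: true });
// fetchBody(phantom, 'https://facebook.com/', args, function (err, html) {
//     if (err) { return console.log(err.message); }
//     console.log(html);
// });
```

The sketch leaves out the 5-second setTimeout from the question. If the page fills in its content through ajax after loading, the page.evaluate call can be wrapped in the same delay.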

This answer provides the solution for plain PhantomJS. My answer here elaborates on that issue in more detail for CasperJS.
