Pull content from an external website generated by JavaScript


Question

I know how to pull HTML content from an external website with PHP and parse it, but the problem is that the content I want to extract is generated by a JavaScript function.

The relevant markup on the page looks like this:

  <div align="left">
   <div id="divCotizaciones"></div>
   <script type="text/javascript">
           getCotizaciones("cotizaciones_busca.dat");
   </script>
  </div>

I would like to extract all the content generated by that function. This is the webpage I'm trying to pull the content from: http://www.bvl.com.pe/neg_rv_alfa.html#

I tried this, but it's not working:

$html = new DOMDocument();
$html->loadHtmlFile('http://www.bvl.com.pe/neg_rv_alfa.html#');
$xpath = new DOMXPath($html);
$nodelist = $xpath->query('//*[@id="div"]/div[4]');
echo $output = $nodelist->item(0)->nodeValue;

// and this is the output I get: getCotizaciones("cotizaciones_busca.dat");

Recommended answer

Unfortunately, you cannot execute JavaScript code using DOM or any other PHP function that loads external sources, e.g. file_get_contents, curl, etc. These only download the page source; they never run the scripts inside it, which is why your output is the literal getCotizaciones(...) call. To run the script you would need a JavaScript engine, or a language with a plugin that embeds one (e.g. WebKit in C++). PHP doesn't have that support.

However, what you can do is look at how the data is generated in a browser and how the browser displays it. I did that for you and found that the grid is generated by making a request to a different URL. The page 'http://www.bvl.com.pe/neg_rv_alfa.html#' calls the JavaScript function getCotizaciones("cotizaciones_busca.dat"), which in turn fetches the following URL via Ajax. So, instead of loading the page, request that URL directly:

http://www.bvl.com.pe/includes/cotizaciones_busca.dat

This URL returns the data you need, and you can load it via DOM or whatever you prefer.
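As a rough illustration of loading that URL via DOM, here is a minimal sketch. It assumes the endpoint returns an HTML fragment made of table rows (the actual format of cotizaciones_busca.dat is not shown above, so the XPath may need adjusting). The function names fetchBody and extractRows are hypothetical helpers introduced only for this example.

```php
<?php
// Sketch only. Assumption: the data URL returns HTML containing <tr> rows
// with <td>/<th> cells. fetchBody() and extractRows() are hypothetical names.

// Download the raw response body, failing loudly if the request fails.
function fetchBody(string $url): string
{
    $body = @file_get_contents($url);
    if ($body === false) {
        throw new RuntimeException("Could not fetch $url");
    }
    return $body;
}

// Parse an HTML string and return each table row as an array of cell texts.
function extractRows(string $html): array
{
    if (trim($html) === '') {
        return [];
    }

    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // tolerate sloppy markup
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $rows = [];
    foreach ($xpath->query('//tr') as $tr) {
        $cells = [];
        foreach ($xpath->query('./td|./th', $tr) as $cell) {
            $cells[] = trim($cell->textContent);
        }
        if ($cells !== []) {
            $rows[] = $cells;
        }
    }
    return $rows;
}
```

It could be used as extractRows(fetchBody('http://www.bvl.com.pe/includes/cotizaciones_busca.dat')), then looped over to print or store each row. Keeping the download and the parsing in separate functions makes it easy to check the parser against a saved copy of the response first.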

Protip: Use Firebug or the developer tools console of whichever browser you prefer. Whenever you see an Ajax request, look at what it does, where it sends the request, and which parameters it passes. Check the source of the JS file where the function is defined and see what it does. In your case that is http://www.bvl.com.pe/js/cabecera_pie.js, where you'll see it makes an Ajax request depending on what the user has clicked. Replicate that request in PHP before loading the result into DOM, etc.
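Replicating an Ajax request in PHP usually means sending the same URL, query parameters and headers that the browser's network tab shows. The sketch below uses PHP's cURL extension. The helper names buildAjaxUrl and fetchLikeBrowser are hypothetical, and the X-Requested-With and Referer headers are assumptions about what a server might check, not something confirmed for this site.

```php
<?php
// Sketch only. buildAjaxUrl() and fetchLikeBrowser() are hypothetical helpers.
// Copy the real endpoint and parameter names from the browser's network tab.

// Append query parameters to an endpoint, as the page's script would.
function buildAjaxUrl(string $endpoint, array $params = []): string
{
    if ($params === []) {
        return $endpoint;
    }
    $separator = (strpos($endpoint, '?') === false) ? '?' : '&';
    return $endpoint . $separator . http_build_query($params);
}

// Request a URL with headers similar to those of a browser Ajax call.
function fetchLikeBrowser(string $url, string $referer): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects
        CURLOPT_TIMEOUT        => 15,
        CURLOPT_REFERER        => $referer,
        CURLOPT_HTTPHEADER     => ['X-Requested-With: XMLHttpRequest'],
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('Request failed: ' . curl_error($ch));
    }
    return $body;
}
```

For the page in the question, a call might look like fetchLikeBrowser(buildAjaxUrl('http://www.bvl.com.pe/includes/cotizaciones_busca.dat'), 'http://www.bvl.com.pe/neg_rv_alfa.html'). If a plain request already returns the data, the extra headers can be dropped.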
