Selenium jSoup get data from Javascript Webpage

Question

I have asked a few questions about this recently, but haven't really found what I'm looking for.

I am trying to get all of the matches from http://www.futbol24.com/Live/?__igp=1&LiveDate=20141106 and print them out with the time, home team and away team. I understand that the content is loaded after the page itself has loaded.

I have been told to use Selenium and then use jSoup on the result to get the data I want. Does anybody have a tutorial or some sample code showing how to do this for the website above?

Any examples would be greatly appreciated, thanks.

Recommended answer

If you are going to scrape / data-mine someone's site, here are some considerations:

  1. Get permission from the site's owner! If you do not, you will piss off the owner and get blacklisted in the best case, or be served with a lawsuit in the worst case.
  2. Find out if the site exposes an API. Using an API is always better than scraping the rendered pages.
  3. Research tools / libraries that are more appropriate for this task. Some of these include curl, wget, httpbuilder, ... Depending on your level of comfort / knowledge, you may need to research the underlying technologies: HTTP, REST, ...
  4. Selenium is a functional test library for browser applications, which makes it a poor choice for this task.
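
As a rough illustration of item 3, here is a minimal sketch that uses Java's built-in java.net.http client (Java 11+) instead of a browser. The class name PlainHttpFetch, the helpers buildRequest and fetch, and the User-Agent string are made up for this example; only the URL comes from the question. Since the question notes that the content is loaded after the page, the response to this request may well not contain the match data. In that case you would need to find the request the page's JavaScript makes and point the same code at it.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

class PlainHttpFetch {

    // Base URL taken from the question; everything else here is illustrative.
    static final String BASE_URL = "http://www.futbol24.com/Live/";

    /** Builds a GET request for one day, using the yyyyMMdd format seen in the question's URL. */
    static HttpRequest buildRequest(String liveDate) {
        URI uri = URI.create(BASE_URL + "?__igp=1&LiveDate=" + liveDate);
        return HttpRequest.newBuilder(uri)
                .timeout(Duration.ofSeconds(10))
                // Hypothetical User-Agent: identify yourself so the site owner can reach you.
                .header("User-Agent", "example-client/0.1")
                .GET()
                .build();
    }

    /** Sends the request and returns the body as text; throws on any non-200 status. */
    static String fetch(HttpClient client, HttpRequest request)
            throws IOException, InterruptedException {
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest("20141106");
        // Only prints what would be sent; no network traffic happens here.
        System.out.println(request.method() + " " + request.uri());
        // To perform the real call (only with the owner's permission):
        // String body = fetch(HttpClient.newHttpClient(), request);
    }
}
```

The request is built separately from the code that sends it so that you can inspect or log exactly what will go out before touching someone else's server.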

PS: I fully expect this to get downvoted / closed, because discussions / opinions are off-topic for SO.
