Some help scraping a page in Java

Views: 87 Published: 2018/6/15 12:55:42 Tags: java html xhtml screen-scraping

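Question

I need to scrape a web page using Java. I've read that regex is a pretty inefficient way of doing it, and that one should load the page into a DOM Document and navigate that instead.

I've tried reading the documentation, but it seems too extensive and I don't know where to begin.

Could you show me how to scrape this table (http://www.cs.grinnell.edu/~walker/fluency-book/labs/sample-table.html) into an array? I can try figuring out my way from there. A snippet/example would do just fine too.

Thanks.

Answer

You can try jsoup: Java HTML Parser. It is an excellent library with good sample code.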
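
To make the "load it into a DOM and navigate it" idea concrete, here is a minimal sketch. It is not jsoup: it uses the JDK's built-in DOM parser so that it runs without extra dependencies, which means it only accepts well-formed XHTML (jsoup is more forgiving of messy real-world HTML). The class name `TableScraper`, the method `parseTable`, and the `SAMPLE` markup are all hypothetical and stand in for the real page, whose table will look different.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TableScraper {

    // Hypothetical sample markup (an assumption); the real page's table differs.
    static final String SAMPLE =
        "<table>"
        + "<tr><th>Name</th><th>Score</th></tr>"
        + "<tr><td>Ann</td><td>90</td></tr>"
        + "<tr><td>Bob</td><td>85</td></tr>"
        + "</table>";

    /**
     * Parses well-formed XHTML and returns one String[] per <tr> element
     * found anywhere in the document, holding the text of its th/td children.
     */
    static String[][] parseTable(String xhtml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xhtml)));

        NodeList rows = doc.getElementsByTagName("tr");
        String[][] table = new String[rows.getLength()][];
        for (int r = 0; r < rows.getLength(); r++) {
            NodeList children = rows.item(r).getChildNodes();
            List<String> cells = new ArrayList<>();
            for (int c = 0; c < children.getLength(); c++) {
                Node child = children.item(c);
                if (child instanceof Element) {
                    String tag = ((Element) child).getTagName();
                    // Keep header and data cells, skip anything else.
                    if (tag.equals("td") || tag.equals("th")) {
                        cells.add(child.getTextContent().trim());
                    }
                }
            }
            table[r] = cells.toArray(new String[0]);
        }
        return table;
    }

    public static void main(String[] args) throws Exception {
        // Print each row with its cells separated by " | ".
        for (String[] row : parseTable(SAMPLE)) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

The sketch collects every `tr` in the document, so a page with several or nested tables would need the rows narrowed down to the right table first. With jsoup the overall shape should be similar: parse the page, select the rows, then read each cell's text.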
