Parse meta tags in Java

Views: 87 Published: 2018/12/22 18:40:51 Tags: java html xml parsing

Problem description

I have a collection of HTML documents for which I need to parse the contents of the <meta> tags in the <head> section. These are the only HTML tags whose values I'm interested in, i.e. I don't need to parse anything in the <body> section.

I've attempted to parse these values using the XPath support provided by JDom. However, this isn't working out too well because a lot of the HTML in the <body> section is not valid XML.

Does anyone have any suggestions for how I might go about parsing these tag values in a manner that can deal with malformed HTML?

Cheers, Don
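
One way to sidestep the malformed <body> is to never parse it at all: cut the document off where the head ends and pull the <meta> tags out of what is left. The following is only a minimal sketch of that idea using java.util.regex from the standard library, not a full HTML parser. The class name MetaTagExtractor and its method names are hypothetical, and it assumes attribute values contain no ">" character and that the interesting tags carry a name, http-equiv, or property attribute plus a content attribute. A lenient HTML parser library would likely be more robust on unusual markup.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts <meta> name/content pairs from the head only.
public class MetaTagExtractor {

    // End of the head: a closing </head> or an opening <body>, whichever comes first.
    private static final Pattern HEAD_END =
            Pattern.compile("</head\\s*>|<body\\b", Pattern.CASE_INSENSITIVE);

    // One <meta ...> tag; group 1 is the raw attribute text.
    // Assumption: no attribute value contains '>'.
    private static final Pattern META_TAG =
            Pattern.compile("<meta\\b([^>]*)>", Pattern.CASE_INSENSITIVE);

    // One attribute: name="value", name='value', or name=value (unquoted).
    private static final Pattern ATTRIBUTE =
            Pattern.compile("([\\w:-]+)\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s\"'>]+))");

    // Returns meta keys (lower-cased) mapped to their content, in document order.
    // If the same key appears twice, the later tag wins.
    public static Map<String, String> extractMeta(String html) {
        // Discard everything from the end of the head onward, so the body is never inspected.
        // If no head end is found, the whole input is scanned.
        Matcher end = HEAD_END.matcher(html);
        String head = end.find() ? html.substring(0, end.start()) : html;

        Map<String, String> result = new LinkedHashMap<>();
        Matcher tag = META_TAG.matcher(head);
        while (tag.find()) {
            Map<String, String> attrs = parseAttributes(tag.group(1));
            String key = attrs.get("name");
            if (key == null) key = attrs.get("http-equiv");
            if (key == null) key = attrs.get("property");
            String content = attrs.get("content");
            if (key != null && content != null) {
                result.put(key.toLowerCase(Locale.ROOT), content);
            }
        }
        return result;
    }

    // Parses the attribute text of a single tag into a map with lower-cased attribute names.
    private static Map<String, String> parseAttributes(String raw) {
        Map<String, String> attrs = new LinkedHashMap<>();
        Matcher m = ATTRIBUTE.matcher(raw);
        while (m.find()) {
            String value = m.group(2) != null ? m.group(2)
                         : m.group(3) != null ? m.group(3)
                         : m.group(4);
            attrs.put(m.group(1).toLowerCase(Locale.ROOT), value);
        }
        return attrs;
    }

    public static void main(String[] args) {
        // The body below is deliberately malformed and contains a <meta> that is skipped.
        String html = "<html><head><title>Demo</title>"
                + "<meta name=\"description\" content=\"A test page\">"
                + "<META NAME='keywords' CONTENT='java, html'>"
                + "</head><body><p>unclosed <b>tags <meta name=\"ignored\" content=\"x\"></body>";
        extractMeta(html).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Because the input is truncated before the body, broken markup there cannot affect the result. The trade-off is that regular expressions do not understand comments, scripts, or exotic quoting inside the head, so this is best treated as a starting point.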
