Regex to strip HTML tags

Views: 94 Published: 2018/6/13 16:23:41 Tags: java html regex


Question

I have this HTML input:

some text another text

I'd like to use regex to remove the HTML tags so that the output is:

some text another text

Can anyone suggest how to do this with regex?

Recommended answer

Instead of regex, you can use an HTML parser such as Jericho HTML Parser.

You can download it from here: http://jericho.htmlparser.net/docs/index.html

Jericho HTML Parser is a Java library for analysing and manipulating parts of an HTML document, including server-side tags, while reproducing verbatim any unrecognized or invalid HTML. It also provides high-level HTML form manipulation functions.

Badly formatted HTML does not interfere with the parsing.
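For comparison, here is a minimal sketch of the pure-regex approach the question asks about, using only the Java standard library. The class name `TagStripper`, the method `stripTags`, and the sample input are all hypothetical, since the markup in the original question did not survive. The pattern is a naive one, not a general HTML solution.

```java
import java.util.regex.Pattern;

class TagStripper {
    // Naive pattern: "<", then any run of characters other than ">", then ">".
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    /** Removes anything that looks like a tag; text between tags is kept unchanged. */
    static String stripTags(String html) {
        return TAG.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        // Hypothetical input; the original question's markup is unknown.
        String html = "<b>some text</b> <i>another text</i>";
        System.out.println(stripTags(html));
    }
}
```

This is likely to be good enough only for simple, trusted fragments. It does not decode entities such as `&amp;`, it keeps the contents of `<script>` and `<style>` elements, and it can cut text incorrectly when a `>` appears inside an attribute value or a comment. Those cases are where a parser, as recommended above, tends to be the safer choice.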
