Stripping HTML tags in Java
Question
Is there an existing Java library which provides a method to strip all HTML tags from a String? I'm looking for something equivalent to the strip_tags function in PHP.
I know that I can use a regex as described in this Stack Overflow question, but I was curious whether there is already a stripTags() method somewhere in the Apache Commons library that can be used.
After having this question open for almost a week, I can say with some certainty that there is no method available in the Java API or Apache libraries that strips HTML tags from a String. You would either have to use an HTML parser, as described in the previous answers, or write a simple regular expression to strip out the tags.
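For the regular expression route, a minimal sketch could look like the following. The class name TagStripper and the method stripTags are hypothetical names chosen for illustration, not part of the Java API or Apache Commons. The pattern assumes reasonably simple markup: it removes anything between '<' and the next '>', so it does not decode entities such as &amp;amp;, and it can be fooled by a '>' inside an attribute value, by comments, or by script content.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of any library.
final class TagStripper {

    // '<', then any run of characters other than '>', then '>'.
    // Assumption: no '>' appears inside attribute values or comments.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    private TagStripper() {
    }

    // Returns the input with everything that looks like a tag removed.
    // Text between tags is kept as-is; entities are not decoded.
    static String stripTags(String html) {
        if (html == null) {
            return null;
        }
        return TAG.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        // prints: Hello, world!
        System.out.println(stripTags("<p>Hello, <b>world</b>!</p>"));
    }
}
```

The pattern is compiled once and reused, which avoids recompiling it on every call as String.replaceAll would. For untrusted or malformed input, the HTML parser route is likely the safer of the two options.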