PHP SAX parser for HTML?
Question
I need an HTML SAX (not DOM!) parser for PHP that can process even invalid HTML code. I need it to filter user-entered HTML (removing all attributes and tags except allowed ones) and then truncate the HTML content to a specified length.
Any ideas?
Recommended answer
SAX was designed to process well-formed XML and to fail on invalid markup. Processing invalid HTML requires keeping more state than SAX parsers typically keep.
I'm not aware of any SAX-like parser for HTML. Your best option is to pass the HTML through tidy first and then use an XML parser, but this may defeat the purpose of using a SAX parser in the first place.
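The tidy-then-XML-parser approach could look roughly like the sketch below. It assumes PHP's tidy and xml extensions are both loaded; the function name filterHtml() and the $allowed whitelist are hypothetical names chosen for illustration, not part of any library. Tidy repairs the markup into XHTML, and PHP's event-based XML parser then streams it through callbacks that drop all attributes and any tag not on the whitelist.

```php
<?php
// Sketch only: repair HTML with tidy, then filter it with PHP's event-based XML parser.
// Assumes ext-tidy and ext-xml are available. filterHtml() and $allowed are hypothetical.

function filterHtml(string $html, array $allowed): string
{
    // Step 1: let tidy turn tag soup into well-formed XHTML (body contents only).
    $xhtml = tidy_repair_string($html, [
        'output-xhtml'     => true,
        'show-body-only'   => true,
        'numeric-entities' => true, // named entities such as &nbsp; are not valid XML
        'wrap'             => 0,
    ], 'utf8');
    if ($xhtml === false) {
        throw new RuntimeException('tidy could not repair the input');
    }

    // Step 2: stream the repaired markup through SAX-style callbacks.
    $out = '';
    $parser = xml_parser_create('UTF-8');
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0); // keep tag names as written

    xml_set_element_handler(
        $parser,
        // Start tag: emit allowed tags without any of their attributes.
        function ($p, string $name, array $attrs) use (&$out, $allowed) {
            if (in_array($name, $allowed, true)) {
                $out .= "<$name>";
            }
        },
        // End tag: emit the matching close for allowed tags.
        function ($p, string $name) use (&$out, $allowed) {
            if (in_array($name, $allowed, true)) {
                $out .= "</$name>";
            }
        }
    );
    // Text arrives decoded, so escape it again before appending.
    xml_set_character_data_handler($parser, function ($p, string $text) use (&$out) {
        $out .= htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    });

    // Wrap in a single root element so the fragment is a complete XML document.
    if (xml_parse($parser, "<root>$xhtml</root>", true) !== 1) {
        throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
    }
    return $out;
}

// Usage: attributes are dropped and unclosed tags come back closed.
echo filterHtml('<p onclick="x()">Hi <b>bold <i>text</p>', ['p', 'b', 'i']);
```

Note that this sketch keeps the text inside dropped elements and writes void elements such as br as an open/close pair, so both cases would need extra handling in real use. Truncation to a given length could be added by counting characters in the character-data handler.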