How to strip ALL HTML tags using MSHTML Parser in VB6?
Question
How to strip ALL HTML tags using MSHTML Parser in VB6?
Recommended answer
This is adapted from code over at CodeGuru. Many thanks to the original author: http://www.codeguru.com/vb/vb_internet/html/article.php/c4815
Check the original source if you need to download your HTML from the web. E.g.:
Set objDocument = objMSHTML.createDocumentFromUrl("http://google.com", vbNullString)
I didn't need to download the HTML stub from the web - I already had my stub in memory. So the original source didn't quite apply to me. My main goal was just to have a proper DOM parser strip the HTML from user-generated content for me. Some would say, "Why not just use some RegEx to strip the HTML?" Good luck with that!
Add a reference to: Microsoft HTML Object Library
This is the same HTML parser that Internet Explorer (IE) uses. Let the heckling begin. Well, heckle away...
Here is the code I used:
Dim objDocument As MSHTML.HTMLDocument
Set objDocument = New MSHTML.HTMLDocument
'NOTE: txtSource is an instance of a simple TextBox object
objDocument.body.innerHTML = "<p>Hello World!</p> <p>Hello Jason!</p> <br/>Hello Bob!"
txtSource.Text = objDocument.body.innerText
The resulting text in txtSource.Text is my User's Content stripped of all HTML. Clean and maintainable - No Cthulhu Way for me.
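If you need to do this in more than one place, the same two calls can be wrapped in a small helper. This is only a sketch based on the code above: StripHtml and strHtml are names I made up for illustration, and it assumes the project already references the Microsoft HTML Object Library.

```vb
' Hypothetical helper (the name StripHtml is not part of MSHTML).
' Assumes a project reference to "Microsoft HTML Object Library".
Public Function StripHtml(ByVal strHtml As String) As String
    Dim objDocument As MSHTML.HTMLDocument
    Set objDocument = New MSHTML.HTMLDocument

    ' Let the MSHTML parser build a DOM from the markup...
    objDocument.body.innerHTML = strHtml

    ' ...then read back only the text content of that DOM.
    StripHtml = objDocument.body.innerText

    Set objDocument = Nothing
End Function
```

Usage would then look like txtSource.Text = StripHtml(strUserContent), where strUserContent is whatever String holds your in-memory HTML.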