How to strip ALL HTML tags using MSHTML Parser in VB6?

Question

How to strip ALL HTML tags using MSHTML Parser in VB6?

Recommended answer

This is adapted from code over at CodeGuru. Many thanks to the original author: http://www.codeguru.com/vb/vb_internet/html/article.php/c4815

Check the original source if you need to download your HTML from the web, e.g.:

Set objDocument = objMSHTML.createDocumentFromUrl("http://google.com", vbNullString)

I didn't need to download the HTML stub from the web; I already had it in memory, so the original source didn't quite apply to me. My main goal was just to have a proper DOM parser strip the HTML from user-generated content for me. Some would say, "Why not just use a RegEx to strip the HTML?" Good luck with that!

Add a reference to: Microsoft HTML Object Library

This is the same HTML parser that runs Internet Explorer (IE). Let the heckling begin. Well, heckle away...

Here's the code I used:

Dim objDocument As MSHTML.HTMLDocument
Set objDocument = New MSHTML.HTMLDocument

'NOTE: txtSource is an instance of a simple TextBox object
objDocument.body.innerHTML = "<p>Hello World!</p> <p>Hello Jason!</p> <br/>Hello Bob!"
txtSource.Text = objDocument.body.innerText

The resulting text in txtSource.Text is my user's content stripped of all HTML. Clean and maintainable - no Cthulhu Way for me.
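
If the same stripping is needed in more than one place, the snippet above could be wrapped in a small function. This is only a sketch: the name StripHtml and its parameter are hypothetical and not part of the original answer. It assumes the project already references the Microsoft HTML Object Library and uses only the calls shown above.

```vb
' Hypothetical helper (the name StripHtml is an assumption, not from the
' original answer). Requires a project reference to the
' Microsoft HTML Object Library.
Public Function StripHtml(ByVal strHtml As String) As String
    Dim objDocument As MSHTML.HTMLDocument
    Set objDocument = New MSHTML.HTMLDocument

    ' Let the MSHTML parser build a DOM from the markup...
    objDocument.body.innerHTML = strHtml

    ' ...then read back only the text content of the body.
    StripHtml = objDocument.body.innerText

    Set objDocument = Nothing
End Function
```

It would then be called as txtSource.Text = StripHtml(strUserContent), where strUserContent is a placeholder for whatever string holds the user's HTML.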
