Detecting external content with TEmbeddedWB or TWebBrowser


Question


I am trying to block anything external loaded by TEmbeddedWB or TWebBrowser (or TCppWebBrowser). I would like to block everything that is loaded from the Internet, including images, JavaScript, external CSS, external [embed], [object], [applet], [frame] or [iframe] content, and JavaScript that can itself load external content.

This problem consists of 2 parts:

  • putting the web browser into "restrict all" mode (basic HTML only, without images) and detecting whether external content exists
  • if no external content is present, nothing more is needed; if it is, showing a "download bar" that, when clicked, puts the web browser into "download all" mode and fetches all content.

The first item has issues. In TEmbeddedWB you can block almost anything using the DownloadOptions switches, the most important being the ForceOffline switch, but even with everything turned off it still lets some things through, such as [object] or [iframe] tags. I know this because I implemented the OnBeforeNavigate2 event: it fires for the URLs contained in these tags, and the requests also show up in the log of my local server. Setting OfflineMode and ForceOfflineMode in TEmbeddedWB doesn't help for these items.

So how can I really block everything? The page needs to start as basic HTML with all external elements blocked, including scripts and CSS. Is there a way to trigger an event every time the browser wants to download something so that the download can be blocked, or to avoid such events in the first place by blocking all external downloads? Do I need to fiddle with Internet Explorer zones and security? Any pointer in the right direction would be helpful.

The second item is also tricky because I need to detect whether problematic tags are present (such as "applet", "script", "link" etc.). This detection doesn't need to be perfect, but it must at least be good enough to cover most such tags. I've done it like this:

//----------------------------------------------------------------------
// Check for external content (images, scripts, ActiveX, frames...)
//----------------------------------------------------------------------
try
    {    
    bool                                HasExternalContent = false;
    DelphiInterface<IHTMLDocument2>     diDoc;                              // Smart pointer wrapper - should automatically call release() and do reference counting
    diDoc = TEmbeddedWB->Document;

    DelphiInterface<IHTMLElementCollection>     diColApplets;           DelphiInterface<IDispatch>          diDispApplets;      DelphiInterface<IHTMLObjectElement> diObj;
    DelphiInterface<IHTMLElementCollection>     diColEmbeds;            DelphiInterface<IDispatch>          diDispEmbeds;
    DelphiInterface<IHTMLFramesCollection2>     diColFrames;            DelphiInterface<IDispatch>          diDispFrames;
    DelphiInterface<IHTMLElementCollection>     diColImages;            DelphiInterface<IDispatch>          diDispImages;       DelphiInterface<IHTMLImgElement>    diImg;
    DelphiInterface<IHTMLElementCollection>     diColLinks;             DelphiInterface<IDispatch>          diDispLinks;
    DelphiInterface<IHTMLElementCollection>     diColPlugins;           DelphiInterface<IDispatch>          diDispPlugins;
    DelphiInterface<IHTMLElementCollection>     diColScripts;           DelphiInterface<IDispatch>          diDispScripts;
    DelphiInterface<IHTMLStyleSheetsCollection> diColStyleSheets;       DelphiInterface<IDispatch>          diDispStyleSheets;

    OleCheck(diDoc->Get_applets     (diColApplets));
    OleCheck(diDoc->Get_embeds      (diColEmbeds));
    OleCheck(diDoc->Get_frames      (diColFrames));
    OleCheck(diDoc->Get_images      (diColImages));
    OleCheck(diDoc->Get_links       (diColLinks));
    OleCheck(diDoc->Get_plugins     (diColPlugins));
    OleCheck(diDoc->Get_scripts     (diColScripts));
    OleCheck(diDoc->Get_styleSheets (diColStyleSheets));

    // Scan for applets external links
    for (int i = 0; i < diColApplets->length; i++)
        {
        OleCheck(diColApplets->item(i,i,diDispApplets));
        if (diDispApplets != NULL)
            {
            diDispApplets->QueryInterface(IID_IHTMLObjectElement, (void**)&diObj);
            if (diObj != NULL)
                {
                UnicodeString s1 = Sysutils::Trim(diObj->data),
                              s2 = Sysutils::Trim(diObj->codeBase),
                              s3 = Sysutils::Trim(diObj->classid);

                if (StartsText("http", s1) || StartsText("http", s2) || StartsText("http", s3))
                    {
                    HasExternalContent = true;
                    break;                                                  // At least 1 found, bar will be shown, no further search needed
                    }
                }
            }
        }

    // Scan for images external links
    for (int i = 0; i < diColImages->length; i++)
        {
        OleCheck(diColImages->item(i,i,diDispImages));
        if (diDispImages != NULL)                                           // Unnecessary? OleCheck throws exception if this applies?
            {
            diDispImages->QueryInterface(IID_IHTMLImgElement, (void**)&diImg);
            if (diImg != NULL)
                {
                UnicodeString s1 = Sysutils::Trim(diImg->src);

                // Case insensitive check
                if (StartsText("http", s1))
                    {
                    HasExternalContent = true;
                    break;                                                  // At least 1 found, bar will be shown, no further search needed
                    }
                }
            }
        }
    }
catch (Exception &e)
    {
    // triggered by OleCheck
    ShowMessage(e.Message);
    }

Is there an easier way to scan for this, or is the only option to run several loops using other interface functions such as Get_applets, Get_embeds, Get_styleSheets etc., similar to the code above? So far I have found that I would have to call the following functions to cover all of this:

    OleCheck(diDoc->Get_applets     (diColApplets));
    OleCheck(diDoc->Get_embeds      (diColEmbeds));
    OleCheck(diDoc->Get_frames      (diColFrames));
    OleCheck(diDoc->Get_images      (diColImages));
    OleCheck(diDoc->Get_links       (diColLinks));
    OleCheck(diDoc->Get_plugins     (diColPlugins));
    OleCheck(diDoc->Get_scripts     (diColScripts));
    OleCheck(diDoc->Get_styleSheets (diColStyleSheets));

But I'd rather not implement that many loops if this can be handled more easily. Can it?

Solution

I suggest this solution:

#include "html.h"
THTMLDocument doc;
void __fastcall TForm1::CppWebBrowser1DocumentComplete(TObject *Sender, LPDISPATCH pDisp,
          Variant *URL)
{
    doc.documentFromVariant(CppWebBrowser1->Document);

    bool HasExternalContent = false;
    for (int i=0; i<doc.images.length; i++) {
        if(doc.images[i].src.SubString(1, 4) == "http")
        {
            HasExternalContent = true;
            break;
        }
    }
    for (int i=0; i<doc.applets.length; i++) {
        THTMLObjectElement obj = doc.applets[i];
        if(obj.data.SubString(1, 4) == "http")
            HasExternalContent = true;
        if(obj.codeBase.SubString(1, 4) == "http")
            HasExternalContent = true;
        if(obj.classid.SubString(1, 4) == "http")
            HasExternalContent = true;
    }
}
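
A note on the prefix test: SubString(1, 4) == "http" is case-sensitive, also matches local names such as "httpd-manual.html", and misses protocol-relative URLs that begin with "//". The sketch below is one possible stricter check in standard C++. The names isExternalUrl, toLowerAscii and trimLeft are hypothetical helpers invented for this illustration (they are not part of TEmbeddedWB, MSHTML or html.h), and the list of prefixes is an assumption that may need extending for a given application.

```cpp
#include <cassert>   // only needed for the usage checks shown after this block
#include <cctype>
#include <cstring>
#include <string>

// Hypothetical helper: returns an ASCII lower-cased copy of s.
static std::string toLowerAscii(std::string s) {
    for (char &c : s)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    return s;
}

// Hypothetical helper: drops leading whitespace from s.
static std::string trimLeft(const std::string &s) {
    std::size_t pos = 0;
    while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos])))
        ++pos;
    return s.substr(pos);
}

// Returns true when an attribute value looks like a reference to a remote
// resource. The prefix list is an assumption, not an exhaustive rule.
bool isExternalUrl(const std::string &rawValue) {
    const std::string v = toLowerAscii(trimLeft(rawValue));
    static const char *const kPrefixes[] = {"http://", "https://", "ftp://", "//"};
    for (const char *prefix : kPrefixes) {
        // compare() checks only the first strlen(prefix) characters of v.
        if (v.compare(0, std::strlen(prefix), prefix) == 0)
            return true;
    }
    return false;
}
```

For example, assert(isExternalUrl("HTTP://example.com/a.png")) and assert(!isExternalUrl("images/logo.png")) both hold. In C++Builder code the UnicodeString value would first have to be converted to a std::string, or the same logic rewritten with StartsText.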

These great wrapper classes (html.h, which provides THTMLDocument) can be downloaded from here.
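
On the question of avoiding one loop per collection: a common alternative is a single pass over all elements, driven by a table that says which attributes of which tags can reference a URL. The sketch below models that idea in standard C++ only. Element, kUrlAttrs and hasExternalContent are hypothetical names made up for this illustration, the tag/attribute table is deliberately incomplete, and the "is this external?" decision is passed in as a predicate. With MSHTML, the elements would presumably come from the document's all collection (IHTMLDocument2) and the values from IHTMLElement::getAttribute, but that wiring is an assumption and is not shown here.

```cpp
#include <cassert>   // only needed for the usage checks shown after this block
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one DOM element: tag name (assumed upper-case)
// plus its attributes, keyed by lower-case attribute name.
struct Element {
    std::string tag;
    std::map<std::string, std::string> attrs;
};

// Which attributes may reference a URL, per tag. Illustrative, not exhaustive.
static const std::map<std::string, std::vector<std::string>> kUrlAttrs = {
    {"IMG",    {"src"}},
    {"SCRIPT", {"src"}},
    {"LINK",   {"href"}},
    {"FRAME",  {"src"}},
    {"IFRAME", {"src"}},
    {"EMBED",  {"src"}},
    {"OBJECT", {"data", "codebase"}},
    {"APPLET", {"codebase", "archive"}},
};

// Single pass over all elements; stops at the first external reference.
bool hasExternalContent(const std::vector<Element> &elements,
                        const std::function<bool(const std::string &)> &isExternal) {
    for (const Element &el : elements) {
        const auto rule = kUrlAttrs.find(el.tag);
        if (rule == kUrlAttrs.end())
            continue;  // this tag is not in the table, so it is ignored
        for (const std::string &name : rule->second) {
            const auto attr = el.attrs.find(name);
            if (attr != el.attrs.end() && isExternal(attr->second))
                return true;
        }
    }
    return false;
}
```

Supporting another tag then means adding one row to the table rather than writing another loop. The trade-off is that anything the table does not list (for example url(...) references inside inline styles) goes unnoticed, which fits the stated goal of a check that is good enough rather than perfect.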
