C# and Internet Explorer automation, accessing the cache


Question

I have an Internet Explorer automation script in C#. It works fine, but I want to access a captcha image. The captcha link returns a refreshed image every time it is visited, and since the browser has already visited it once, visiting it again would mess things up. So I tried to find the image in the browser's cache on disk with the following code:

// Locate IE's cache folder (Temporary Internet Files).
tempDir = Environment.GetFolderPath(Environment.SpecialFolder.InternetCache);
System.Console.WriteLine(tempDir);
// Find the captcha path inside the element's markup.
string html = element.innerHTML.ToString();
int start = html.IndexOf("/sorry/image?id=");
supstra = html.Substring(start);
int hlIndex = supstra.IndexOf("&hl=");
Console.WriteLine("http://www.goolge.com/sorry/image?id=" + html.Substring(start, hlIndex));
captchas = client.Decode(tempDir + "\\" + html.Substring(start + 7, hlIndex).Replace("amp;", "") + "=en", 0);

However, the item in the cache directory is not an image but a command or something similar with the name image?id=...., and all it does is revisit the link and get a new image. It seems that what I have to do is somehow access the image the browser is showing, which might exist only in memory. How can I do that?

Recommended answer

Specifically, from the question:

Since the Internet Explorer is already displaying the webpage, the images in the webpage must already be stored somewhere in local cache

And the answer (emphasis mine):

You want to use GetUrlCacheEntryInfo().

Use the lpszLocalFileName of the INTERNET_CACHE_ENTRY_INFO structure upon return from the function.
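
As a rough illustration of that advice, the following is a minimal P/Invoke sketch, not code from the original answer. The class name IeCacheLookup and the method TryGetLocalFileName are hypothetical names chosen here. The sketch assumes the Unicode entry point GetUrlCacheEntryInfoW in wininet.dll and uses the usual two-call pattern: the first call reports the required buffer size, the second fills the buffer.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;
using FILETIME = System.Runtime.InteropServices.ComTypes.FILETIME;

// Hypothetical helper (not from the original answer): asks WinINet where
// the cached copy of a URL lives on disk.
static class IeCacheLookup
{
    const int ERROR_FILE_NOT_FOUND = 2;
    const int ERROR_INSUFFICIENT_BUFFER = 122;

    // Managed mirror of the native INTERNET_CACHE_ENTRY_INFO structure.
    [StructLayout(LayoutKind.Sequential)]
    struct INTERNET_CACHE_ENTRY_INFO
    {
        public uint dwStructSize;
        public IntPtr lpszSourceUrlName;
        public IntPtr lpszLocalFileName;
        public uint CacheEntryType;
        public uint dwUseCount;
        public uint dwHitRate;
        public uint dwSizeLow;
        public uint dwSizeHigh;
        public FILETIME LastModifiedTime;
        public FILETIME ExpireTime;
        public FILETIME LastAccessTime;
        public FILETIME LastSyncTime;
        public IntPtr lpHeaderInfo;
        public uint dwHeaderInfoSize;
        public IntPtr lpszFileExtension;
        public uint dwReserved;
    }

    [DllImport("wininet.dll", EntryPoint = "GetUrlCacheEntryInfoW",
               CharSet = CharSet.Unicode, SetLastError = true)]
    static extern bool GetUrlCacheEntryInfo(
        string lpszUrlName, IntPtr lpCacheEntryInfo, ref uint lpcbCacheEntryInfo);

    // Returns the local file path of the cache entry, or null when the URL
    // has no cache entry (for example, it was served with no-cache).
    public static string TryGetLocalFileName(string url)
    {
        uint size = 0;
        // First call with no buffer: expected to fail and report the size needed.
        if (GetUrlCacheEntryInfo(url, IntPtr.Zero, ref size))
            return null; // unexpected success without a buffer; nothing to read
        int error = Marshal.GetLastWin32Error();
        if (error == ERROR_FILE_NOT_FOUND) return null;
        if (error != ERROR_INSUFFICIENT_BUFFER) throw new Win32Exception(error);

        IntPtr buffer = Marshal.AllocHGlobal((int)size);
        try
        {
            // Second call fills the buffer with the structure and its strings.
            if (!GetUrlCacheEntryInfo(url, buffer, ref size))
                throw new Win32Exception(Marshal.GetLastWin32Error());
            var info = (INTERNET_CACHE_ENTRY_INFO)Marshal.PtrToStructure(
                buffer, typeof(INTERNET_CACHE_ENTRY_INFO));
            // Copy the string out before the buffer is freed.
            return Marshal.PtrToStringUni(info.lpszLocalFileName);
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
    }
}
```

Calling IeCacheLookup.TryGetLocalFileName with the exact URL the browser requested (including the query string) should return either a file path that can be passed to the decoder or null. A null result means the caller has to fall back to another approach.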

Furthermore, one of your premises is flawed. Sometimes IE only has an in-memory representation of the image and the item on disk has been deleted. This is the case if, for example, the no-cache directive has been set. Or the user has cleared their cache but not navigated from the page. Or the scavenger has deleted it but the user hasn't navigated. There are probably 5 to 7 other scenarios as well.

In the past when I've had to do something similar, I force the web browser (IE in this case) to use something like Fiddler2 as a proxy. In Fiddler2, I can then intercept the image requests for a particular URL and use C# to save them to disk in a known location. The automation program can then grab them from there.
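
The proxy idea could be sketched roughly as follows. This is an assumption-heavy illustration, not the answerer's actual setup: it swaps in the FiddlerCore library (the embeddable engine behind Fiddler) instead of the Fiddler2 desktop application. The port, the output path, and the class name CaptchaCapture are hypothetical. The calls shown follow the classic FiddlerCore API, and newer releases may expose a different startup API.

```csharp
using System;
using System.IO;
using Fiddler; // FiddlerCore assembly; assumed to be referenced by the project

static class CaptchaCapture
{
    // Hypothetical values chosen for illustration only.
    const string OutputPath = @"C:\temp\captcha.img";
    const string UrlFragment = "/sorry/image?id=";
    const int ProxyPort = 8877;

    static void Main()
    {
        // Runs after each HTTP session finishes passing through the proxy.
        FiddlerApplication.AfterSessionComplete += session =>
        {
            if (session.fullUrl.IndexOf(UrlFragment, StringComparison.OrdinalIgnoreCase) < 0)
                return; // not the captcha request

            // Remove any transfer or content encoding before saving the bytes.
            session.utilDecodeResponse();
            if (session.responseBodyBytes == null) return;

            File.WriteAllBytes(OutputPath, session.responseBodyBytes);
            Console.WriteLine("Saved " + session.fullUrl);
        };

        // The Default flags register the proxy as the system proxy,
        // which is the setting IE picks up.
        FiddlerApplication.Startup(ProxyPort, FiddlerCoreStartupFlags.Default);
        Console.WriteLine("Proxy running; press Enter to stop.");
        Console.ReadLine();
        FiddlerApplication.Shutdown();
    }
}
```

With something like this running before the automation script loads the page, the image bytes the browser actually received end up at a known location. The script can then read them from there without requesting the captcha URL a second time.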
