Using VB.NET to Detect Changes in a Web Page

Question

Again I come to you guys for your expertise and advice on an issue that I am having. I was wondering if any of you would know how to detect if a web page has been modified using VB.NET. I need to be able to set up a task which periodically (like once a week) scans the user inputted web pages and if the web page content has changed, I need to fire off an email to an individual that it has changed (not the exact location on the page itself). I'll be storing the HTTP status and of course the page data itself as well as the date of when it was last modified. Of course this needs to be very fault tolerant since it could be another week before the check runs again. Any help would be great. Thank you.

EDIT

New twist on this question sorry. I had more time to think about what we wanted. So... Detecting ANY change on a web page would be kind of silly since time dependent elements of the page would change every so often. Instead, what I would like to do is be able to detect the documents in the page. For instance if there are excel, word docs, or pdfs that get changed on that page. So, I'd run the hash on these documents then on some sort of schedule do a check to see if new documents have been added or if the old documents have been modified. Any suggestions on how to detect the documents embedded on the page and running the hash? Thanks again!

Recommended Answer

As I mentioned in a comment, this sort of job is what checksums (also known as hash functions) were designed for.

Your code will look something like this:

- for each webpage of interest
  - pull webpage
  - calculate checksum of contents
  - is current checksum different to last checksum?
    - if yes, send email
  - store new checksum and other appropriate data
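
The steps above could be sketched in VB.NET roughly as follows. This is only a minimal sketch under stated assumptions, not a definitive implementation: the module name `PageWatcher`, the in-memory `lastChecksums` dictionary (a real weekly task would persist checksums to a database or file between runs), and the `NotifyChanged` stub standing in for the email step are all hypothetical. It assumes a project that references `System.Net.Http`, and it uses SHA1 purely as one example checksum.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Net.Http
Imports System.Security.Cryptography

Module PageWatcher
    ' Hypothetical in-memory store: URL -> checksum from the previous run.
    ' A scheduled task would load and save this from persistent storage.
    Private ReadOnly lastChecksums As New Dictionary(Of String, String)()

    ' Hex-encoded SHA1 checksum of the raw page bytes.
    Function ComputeChecksum(content As Byte()) As String
        Using sha As SHA1 = SHA1.Create()
            Return BitConverter.ToString(sha.ComputeHash(content)).Replace("-", "")
        End Using
    End Function

    ' Returns True when the checksum differs from the stored one, then
    ' stores the new checksum. A URL seen for the first time is not
    ' reported as changed.
    Function RecordAndCompare(url As String, content As Byte()) As Boolean
        Dim current As String = ComputeChecksum(content)
        Dim previous As String = Nothing
        Dim changed As Boolean =
            lastChecksums.TryGetValue(url, previous) AndAlso previous <> current
        lastChecksums(url) = current
        Return changed
    End Function

    ' Placeholder for the email step (for example via System.Net.Mail).
    Sub NotifyChanged(url As String)
        Console.WriteLine("Changed: " & url)
    End Sub

    Sub CheckPages(urls As IEnumerable(Of String))
        Using client As New HttpClient()
            For Each url As String In urls
                Try
                    ' Pull the page; blocking on .Result keeps the sketch short.
                    Dim content As Byte() = client.GetByteArrayAsync(url).Result
                    If RecordAndCompare(url, content) Then NotifyChanged(url)
                Catch ex As Exception
                    ' One failing page should not stop the rest of the run.
                    Console.WriteLine("Failed to check " & url & ": " & ex.Message)
                End Try
            Next
        End Using
    End Sub
End Module
```

A scheduled job could then call something like `PageWatcher.CheckPages({"http://example.com/"})`. Catching exceptions per URL is one way to get the fault tolerance the question asks for, since a single unreachable page then does not abort the whole run.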

The .NET Framework has a number of checksums available. Two of the most popular are MD5 and SHA1 (http://msdn.microsoft.com/en-us/library/system.security.cryptography.sha1.aspx).
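
As an illustration of those two classes, the sketch below hashes the same bytes with both `MD5` and `SHA1` from `System.Security.Cryptography`. The module name `ChecksumDemo` and the helper `HexDigest` are hypothetical names chosen for this example. The helper takes a `Stream`, on the assumption that the same routine might later be pointed at downloaded documents (PDF, Word, Excel) rather than page HTML.

```vbnet
Imports System
Imports System.IO
Imports System.Security.Cryptography
Imports System.Text

Module ChecksumDemo
    ' Hashes a stream with the given algorithm and returns lowercase hex.
    Function HexDigest(algorithm As HashAlgorithm, data As Stream) As String
        Dim hash As Byte() = algorithm.ComputeHash(data)
        Return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant()
    End Function

    Sub Main()
        ' Sample input; in practice this would be the downloaded content.
        Dim bytes As Byte() = Encoding.UTF8.GetBytes("abc")
        Using md5Alg As MD5 = MD5.Create(), sha1Alg As SHA1 = SHA1.Create()
            Console.WriteLine("MD5:  " & HexDigest(md5Alg, New MemoryStream(bytes)))
            Console.WriteLine("SHA1: " & HexDigest(sha1Alg, New MemoryStream(bytes)))
        End Using
    End Sub
End Module
```

For the input `abc` this should print the standard test-vector digests `900150983cd24fb0d6963f7d28e17f72` (MD5) and `a9993e364706816aba3e25717850c26c9cd0d89d` (SHA1). Either algorithm is adequate for spotting accidental changes between runs. Neither is collision-resistant against a deliberate attacker, so a stronger hash such as SHA256 may be preferable if tampering is a concern.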
