How Can I Write An Http Aggregator In C#


Problem description

For example, I want to store website URLs in a hard-coded array.

For example A[1] = "www.msn.com";
A[2] = "www.gmail.com"; ... etc.

This program is always running and waiting for a certain condition (in this case a time) to occur.

For example, when the time is 6:00 AM the program hits all the websites listed in the array and fetches the HTTP content (the final HTML output) into an output.html file.

For example, if www.msn.com returns the following HTTP content
<html> <body> This is Msn.com </body> </html>

and www.hotmail.com returns the following HTTP content
<html> <body> This is hotmail.com </body> </html>

If these are the only two websites (in the array), then the final output in output.html would be:
<html> <body> This is Msn.com </body> </html>
<html> <body> This is hotmail.com </body> </html>

In the first phase it will be just a simple C# application.

What would be configurable via app.config or equivalent?

- The time (6:00 AM) in hour:minute AM/PM format, which says when the file should be generated.
- The file name of the generated file.
- The folder where the file should be saved.

Condition: I will NOT restart the application. Changes to app.config must take effect immediately.

Solution

Hi, it's much clearer now.

Here is, broadly, how you can achieve your goal:

I) Running an app at a specific time.
- The best solution: use Windows scheduled tasks (Task Scheduler). Windows allows you to run a specific program at a specific time (daily, weekly or monthly).
=> So you will no longer have to worry about your app running at a specific time each day.

- For the file location, you will probably specify the location in the code.
Specifying the location in app.config is possible (I guess), but I don't know how to do it.
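On the app.config point, here is a minimal sketch of one way it could be done, assuming a .NET Framework console app with a reference to System.Configuration.dll. The key names ("RunTime", "OutputFolder", "OutputFileName") and the class name AggregatorSettings are made up for illustration. ConfigurationManager.RefreshSection makes the next read go back to the file on disk, which is one way to pick up edits without restarting the application.

```csharp
using System;
using System.Configuration; // needs a reference to System.Configuration.dll
using System.Globalization;
using System.IO;

// Hypothetical helper; the appSettings keys used with it are assumptions.
public static class AggregatorSettings
{
    // Parses a time such as "6:00 AM" into a time of day.
    public static TimeSpan ParseRunTime(string text)
    {
        DateTime parsed = DateTime.ParseExact(
            text.Trim(), "h:mm tt", CultureInfo.InvariantCulture);
        return parsed.TimeOfDay;
    }

    // Combines the configured folder and file name into one path.
    public static string BuildOutputPath(string folder, string fileName)
    {
        return Path.Combine(folder, fileName);
    }

    // Invalidates the cached appSettings section, then reads the value,
    // so an edit to the config file is seen without a restart.
    public static string ReadSetting(string key)
    {
        ConfigurationManager.RefreshSection("appSettings");
        return ConfigurationManager.AppSettings[key];
    }
}
```

Usage would look something like AggregatorSettings.ParseRunTime(AggregatorSettings.ReadSetting("RunTime")). Note that the file read at run time is the deployed YourApp.exe.config next to the executable, not the app.config in the project folder, so that is the file to edit while the program is running.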

II) General idea for capturing a web page's HTML code.
I work a lot with this kind of app. You have two solutions: use either WebBrowser or HttpWebRequest.
The easier one is the latter. HttpWebRequest allows you to get the HTML code of a page synchronously. It's a perfect technique to start with.

Here is a sample that gets the HTML code of a URL:

// Requires: using System.IO; using System.Net;
public string CapturePage(string url)
{
    // WebRequest.Create needs an absolute URI, e.g. "http://www.msn.com"
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
    request.Accept = "*/*";
    request.AllowAutoRedirect = true;
    request.Timeout = 60000; // milliseconds
    request.UserAgent = "http_requester/0.1";
    request.Method = "GET";
    request.Credentials = CredentialCache.DefaultNetworkCredentials;

    // The using blocks close the response and the reader even if reading throws
    using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
    using (StreamReader sr = new StreamReader(response.GetResponseStream()))
    {
        // Copy the data from the stream into a string
        return sr.ReadToEnd();
    }
}


I don't think you will have to use threads for now. On the other hand, it's much better to develop the app as a console application, so that it closes automatically when the task is finished.
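To go from one page to the combined output.html described in the question, the fetch can be run over the whole array and the results appended one after another. The following is only a sketch: the class name Aggregator and its two methods are invented here, and the fetch function is passed in as a parameter so that CapturePage (or any other downloader) can be plugged in. Recording a failed download as an HTML comment instead of aborting is my own assumption, not something the question asks for.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper; names are for illustration only.
public static class Aggregator
{
    // Fetches every URL with the supplied function and concatenates the
    // results in array order, one page per line block.
    public static string Aggregate(string[] urls, Func<string, string> fetch)
    {
        StringBuilder output = new StringBuilder();
        foreach (string url in urls)
        {
            try
            {
                output.AppendLine(fetch(url));
            }
            catch (Exception)
            {
                // Assumption: one unreachable site should not stop the run.
                output.AppendLine("<!-- failed to fetch " + url + " -->");
            }
        }
        return output.ToString();
    }

    // Writes the combined HTML to the file, replacing any previous content.
    public static void WriteOutput(string path, string html)
    {
        File.WriteAllText(path, html, Encoding.UTF8);
    }
}
```

A call could then look like Aggregator.WriteOutput("output.html", Aggregator.Aggregate(urls, CapturePage)). The array entries need a scheme ("http://www.msn.com" rather than "www.msn.com"), otherwise WebRequest.Create rejects them.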


III) Timers:
With timers you don't have to create or manage threads yourself. The idea is to instantiate one or more timers.
Once a timer's interval elapses, an event is raised.

Here is a link to understand the logic (MSDN):
http://msdn.microsoft.com/fr-fr/library/system.timers.timer(v=vs.110).aspx
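For the "always running, fire at 6:00 AM" variant, a rough sketch with System.Timers.Timer could look like the following. DailySchedule, TimeUntilNextRun and Start are names I made up. The approach (a one-shot timer that is re-armed after each run) is just one option among several. Daylight-saving changes are ignored here.

```csharp
using System;
using System.Timers;

// Hypothetical scheduler sketch; not a definitive implementation.
public static class DailySchedule
{
    // Returns how long to wait from 'now' until the next occurrence of
    // 'runAt'. If that time has already passed today (or is exactly now),
    // tomorrow's occurrence is used, so the result is always positive.
    public static TimeSpan TimeUntilNextRun(DateTime now, TimeSpan runAt)
    {
        DateTime next = now.Date + runAt;
        if (next <= now)
        {
            next = next.AddDays(1);
        }
        return next - now;
    }

    // Starts a one-shot timer that runs 'job' at the next 'runAt' and then
    // re-arms itself for the following day.
    public static Timer Start(TimeSpan runAt, Action job)
    {
        Timer timer = new Timer();
        timer.AutoReset = false; // fire once, then reschedule manually
        timer.Elapsed += (sender, e) =>
        {
            try
            {
                job();
            }
            finally
            {
                // Re-arm even if the job failed, so the next day still runs.
                timer.Interval = TimeUntilNextRun(DateTime.Now, runAt).TotalMilliseconds;
                timer.Start();
            }
        };
        timer.Interval = TimeUntilNextRun(DateTime.Now, runAt).TotalMilliseconds;
        timer.Start();
        return timer;
    }
}
```

The Elapsed handler runs on a thread-pool thread by default, so the main thread only has to stay alive (for example by blocking on Console.ReadLine()). If the run time must be changeable without a restart, it could be re-read from configuration inside the handler before the timer is re-armed.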

Pretty easy :) No threads needed.

Don't hesitate to ask more questions.

Hope it helps.

