C#: crawler project

Views: 120 | Published: 2016/6/7 21:05:07 | Tags: c#, asp.net


Question

Could I get some easy-to-follow code examples for the following:

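Use a browser control to launch a request to a target website.

Capture the response from the target website.

Convert the response into a DOM object.

Iterate through the DOM object and capture things like "FirstName", "LastName", etc. if they are part of the response.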

Thanks.

Recommended answer

Here is code that uses a WebRequest object to retrieve data and captures the response as a stream.

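    // Namespaces needed: System.IO, System.Net, System.Net.Security,
    // System.Security.Cryptography.X509Certificates
    public static Stream GetExternalData( string url, string postData, int timeout )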
    {
        ServicePointManager.ServerCertificateValidationCallback += delegate( object sender,
                                                                                X509Certificate certificate,
                                                                                X509Chain chain,
                                                                                SslPolicyErrors sslPolicyErrors )
        {
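            // if we trust the callee implicitly, return true...otherwise, perform validation logic.
            // Placeholder default: accept only certificates that validated without errors.
            return sslPolicyErrors == SslPolicyErrors.None;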
        };

        WebRequest request = null;
        HttpWebResponse response = null;

        try
        {
            request = WebRequest.Create( url );
            request.Timeout = timeout; // force a quick timeout

            if( postData != null )
            {
                request.Method = "POST";
                request.ContentType = "application/x-www-form-urlencoded";
                request.ContentLength = postData.Length;

                using( StreamWriter requestStream = new StreamWriter( request.GetRequestStream(), System.Text.Encoding.ASCII ) )
                {
                    requestStream.Write( postData );
                    requestStream.Close();
                }
            }

            response = (HttpWebResponse)request.GetResponse();
        }
        catch( WebException ex )
        {
            Log.LogException( ex );
        }
        finally
        {
            request = null;
        }

        if( response == null || response.StatusCode != HttpStatusCode.OK )
        {
            if( response != null )
            {
                response.Close();
                response = null;
            }

            return null;
        }

        return response.GetResponseStream();
    }
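
The stream returned above still has to be read before it can be parsed. A minimal sketch of that step follows; `ResponseReader` and `ReadToString` are hypothetical names that are not part of the original answer, and UTF-8 is only an assumed default encoding. The null check matters because `GetExternalData` returns null when the request fails or the status is not OK.

```csharp
using System.IO;
using System.Text;

public static class ResponseReader
{
    // Hypothetical helper: reads the whole stream into a string and disposes it.
    // Returns null when the stream is null (for example, after a failed request).
    public static string ReadToString( Stream stream, Encoding encoding = null )
    {
        if( stream == null )
        {
            return null;
        }

        // Assumption: UTF-8 unless the caller knows the response's real encoding.
        using( StreamReader reader = new StreamReader( stream, encoding ?? Encoding.UTF8 ) )
        {
            return reader.ReadToEnd();
        }
    }
}
```

A call such as `ResponseReader.ReadToString( GetExternalData( url, null, 5000 ) )` would then yield the page markup as a string, or null on failure.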

For managing the response, I use a custom XHTML parser, but it is thousands of lines of code. There are several publicly available parsers (see Darin's comment).
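
For the simple case where the response is well-formed XHTML, the framework's own `XmlDocument` can stand in for a dedicated parser. The sketch below is not the custom parser mentioned above; `FormFieldExtractor` and `Extract` are hypothetical names, and it assumes the fields of interest are `<input>` elements identified by their `name` attribute. Real-world HTML is often not well-formed, in which case `LoadXml` throws and a more tolerant parser is needed.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class FormFieldExtractor
{
    // Hypothetical helper: returns the value attribute of every <input> element
    // whose name attribute matches one of wantedNames (case-insensitive).
    public static Dictionary<string, string> Extract( string xhtml, params string[] wantedNames )
    {
        var wanted = new HashSet<string>( wantedNames, StringComparer.OrdinalIgnoreCase );
        var found = new Dictionary<string, string>( StringComparer.OrdinalIgnoreCase );

        var doc = new XmlDocument();
        doc.LoadXml( xhtml ); // throws XmlException if the markup is not well-formed

        // Walk every <input> element in document order.
        foreach( XmlElement input in doc.GetElementsByTagName( "input" ) )
        {
            string name = input.GetAttribute( "name" ); // empty string when absent
            if( wanted.Contains( name ) )
            {
                found[name] = input.GetAttribute( "value" );
            }
        }

        return found;
    }
}
```

Calling `FormFieldExtractor.Extract( markup, "FirstName", "LastName" )` returns only the requested fields that are actually present in the response. To parse a response stream directly instead of a string, `XmlDocument.Load` also accepts a `Stream`.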

Edit: per the OP's question, headers can be added to the request to emulate a user agent. For example:

    // request must be declared as HttpWebRequest for Accept and UserAgent to be available
    request = (HttpWebRequest)WebRequest.Create( url );
    request.Accept = "application/x-ms-application, image/jpeg, application/xaml+xml, image/gif, image/pjpeg, application/x-ms-xbap, application/x-shockwave-flash, */*";
    request.Timeout = timeout;
    request.Headers.Add( "Cookie", cookies );

    // manifest as a standard user agent
    request.UserAgent = "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US)";
