How to programmatically log in to a website and parse the HTML of the response using ASP.NET

Problem description

Hello everyone,
I have a database table that contains usernames and passwords. These credentials are to be used one at a time to log in to a website. After each login, I would like to access the user's account page and read (parse) its HTML.

Improvements:
By parsing, I mean that I want the response as a string, which I can then manipulate.
This is a web-based application that resides on a server; clients access it via a browser.
I am using a SQL Server 2005 database to store the information I retrieve after logging in.

How can I do this using ASP.NET? Is it even possible?

[Moved from OP's answer]
I'm not able to code this by myself. I have been searching the net for a working example for almost 3 weeks but haven't found anything. Everyone says it can be done, but nobody provides working code. No one gives me the answer. Believe me, my hair is turning gray over this.

Recommended answer

This has nothing to do with ASP.NET specifically. You need the class System.Net.HttpWebRequest; to obtain an instance, call the method Create of the class System.Net.WebRequest, because the run-time type of the returned object is determined by the URI.

See:
http://msdn.microsoft.com/en-us/library/system.net.httpwebrequest.aspx
http://msdn.microsoft.com/en-us/library/system.net.webrequest.aspx
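
To make the factory call concrete, here is a minimal sketch of creating a request through WebRequest.Create and reading the whole response into a string, which is what the question means by "parsing". The class name PageFetcher and its method names are made up for illustration, and UTF-8 is an assumption; a real page may declare a different encoding. On current .NET versions these types are marked obsolete in favor of HttpClient, but they still work.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical helper class; the names are illustrative, not from the thread.
public static class PageFetcher
{
    // WebRequest.Create picks the concrete type from the URI scheme.
    // For http and https URIs the result is an HttpWebRequest, so the cast succeeds.
    public static HttpWebRequest CreateGet(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";
        return request;
    }

    // Reads an entire stream into a string using the given encoding.
    public static string ReadAll(Stream stream, Encoding encoding)
    {
        using (var reader = new StreamReader(stream, encoding))
        {
            return reader.ReadToEnd();
        }
    }

    // Sends the GET request and returns the response body as a string.
    // Assumption: the page is UTF-8 encoded.
    public static string DownloadString(string url)
    {
        HttpWebRequest request = CreateGet(url);
        using (var response = (HttpWebResponse)request.GetResponse())
        using (Stream body = response.GetResponseStream())
        {
            return ReadAll(body, Encoding.UTF8);
        }
    }
}
```

Calling PageFetcher.DownloadString with a page URL returns the raw HTML, which can then be handled with ordinary string operations or passed to a parser.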

A code sample for login is here:
http://stackoverflow.com/questions/450380/login-to-the-page-with-httpwebrequest
HTTPS is used the same way with the same code, but the root certificate should be registered on the system; see certmgr.msc.

See also:
http://en.wikipedia.org/wiki/Public_key_certificate
http://en.wikipedia.org/wiki/Certificate_authority
http://en.wikipedia.org/wiki/Self-signed_certificate
and the links from these articles.

Now, parsing of the response…

The ideal case is a response that is well-formed XML, which the .NET parsers support well. If it is not, look for an HTML parser that does not assume well-formed XML. Try this one:
http://www.majestic12.co.uk/projects/html_parser.php
or search Google for others.
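
For the well-formed case, a small sketch with LINQ to XML might look like the following. It assumes the page is valid XML and that the value of interest sits in an element with a known id attribute; the class AccountPageParser and the id "balance" are hypothetical. Real-world HTML often fails this assumption, in which case XDocument.Parse throws and an HTML parser is the better tool.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper class; the names are illustrative, not from the thread.
public static class AccountPageParser
{
    // Returns the text of the first element whose id attribute equals the given id,
    // or null when no such element exists.
    // Throws System.Xml.XmlException when the markup is not well-formed XML.
    public static string GetTextById(string xhtml, string id)
    {
        XDocument doc = XDocument.Parse(xhtml);
        XElement match = doc.Descendants()
            .FirstOrDefault(e => (string)e.Attribute("id") == id);
        return match == null ? null : match.Value;
    }
}
```

For example, GetTextById("<html><body><span id=\"balance\">42.00</span></body></html>", "balance") returns "42.00".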

—SA


Below are the steps to take for a successful login.

Overview:
This requires a program that simulates what a person does when viewing a site in a browser. When you enter the URL of a site's login page into the browser's address bar and hit Enter, you make a GET request to the server. The server then typically uses cookies to start a session with your browser, and from then on every interaction is tracked via those cookies.

So:
1) Create an HttpWebRequest with the method set to GET, and capture the cookie.
2) Submit the form using the HTTP method (GET or POST) defined by the HTML form, and capture the cookie.
3) Receive the response as an HttpWebResponse and work with its stream as needed.

This is the basic theory. Each step requires certain settings that you need to determine, and they differ from case to case.
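
The three steps can be sketched roughly as follows, using a single CookieContainer shared by all requests so that the session cookie set in step 1 is sent back in the later steps. This is only a sketch under several assumptions: the form is submitted by POST to the same URL as the login page, the form fields are plain name/value pairs, and the pages are UTF-8. The class name LoginSketch, the method names, and any field names you pass in are hypothetical. Many sites also require hidden form fields, which you would need to read from the login page first.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net;
using System.Text;

// Hypothetical helper class; the names are illustrative, not from the thread.
public static class LoginSketch
{
    // Encodes form fields as an application/x-www-form-urlencoded body.
    public static string BuildFormBody(IEnumerable<KeyValuePair<string, string>> fields)
    {
        return string.Join("&", fields.Select(f =>
            Uri.EscapeDataString(f.Key) + "=" + Uri.EscapeDataString(f.Value)));
    }

    // Logs in and returns the HTML of the account page as a string.
    public static string LoginAndFetch(string loginUrl, string accountUrl,
        IEnumerable<KeyValuePair<string, string>> fields)
    {
        // One container for all requests, so cookies carry over between them.
        var cookies = new CookieContainer();

        // Step 1: GET the login page so the server can set its session cookie.
        var get = (HttpWebRequest)WebRequest.Create(loginUrl);
        get.CookieContainer = cookies;
        using (get.GetResponse()) { }

        // Step 2: POST the form fields (assumption: the form posts to loginUrl).
        var post = (HttpWebRequest)WebRequest.Create(loginUrl);
        post.Method = "POST";
        post.ContentType = "application/x-www-form-urlencoded";
        post.CookieContainer = cookies;
        byte[] body = Encoding.UTF8.GetBytes(BuildFormBody(fields));
        post.ContentLength = body.Length;
        using (Stream requestStream = post.GetRequestStream())
        {
            requestStream.Write(body, 0, body.Length);
        }
        using (post.GetResponse()) { }

        // Step 3: request the account page with the same cookies and read the stream.
        var account = (HttpWebRequest)WebRequest.Create(accountUrl);
        account.CookieContainer = cookies;
        using (var response = (HttpWebResponse)account.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream(), Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

The credentials read from the database table would be passed in as the fields argument, one login at a time, and the returned string stored or processed as needed.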


There is a very nice how-to video, along with code and everything:
http://youtu.be/UV5_b5oaUQk

