How to get Web Session from cookie?
I'm trying to scrape a web page, but in order to POST the data I need a web session ID like
web_session=HQJ3G1GPAAHRZGFR
How can I get that ID?
My code so far is:
Private Sub test()
    Dim postData As String = "web_session=HQJ3G1GPAAHRZGFR&intext=O&term_code=201210&search_type=A&keyword=&kw_scope=all&kw_opt=all&subj_code=BIO&crse_numb=205&campus=*&instructor=*&instr_session=*&attr_type=*&mon=on&tue=on&wed=on&thu=on&fri=on&sat=on&sun=on&avail_flag=on" '/BANPROD/pkgyc_yccsweb.P_Results
    Dim tempCookie As New CookieContainer
    Dim encoding As New UTF8Encoding
    Dim byteData As Byte() = encoding.GetBytes(postData)
    System.Net.ServicePointManager.SecurityProtocol = Net.SecurityProtocolType.Ssl3
    Try
        tempCookie.GetCookies(New Uri("https://taylor.yc.edu/BANPROD/pkgyc_yccsweb.P_Results"))
        'postData="web_session=" & tempCookie.
        Dim postReq As HttpWebRequest = DirectCast(WebRequest.Create("https://taylor.yc.edu/BANPROD/pkgyc_yccsweb.P_Results"), HttpWebRequest)
        postReq.Method = "POST"
        postReq.KeepAlive = True
        postReq.CookieContainer = tempCookie
        postReq.ContentType = "application/x-www-form-urlencoded"
        postReq.UserAgent = "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.0.3705; Media Center PC 4.0; .NET CLR 3.0.04506.648; .NET CLR 3.5.21022; .NET4.0C; .NET4.0E; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)"
        postReq.ContentLength = byteData.Length
        Dim postreqstream As Stream = postReq.GetRequestStream
        postreqstream.Write(byteData, 0, byteData.Length)
        postreqstream.Close()
        Dim postresponse As HttpWebResponse
        postresponse = DirectCast(postReq.GetResponse, HttpWebResponse)
        tempCookie.Add(postresponse.Cookies)
        Dim postresreader As New StreamReader(postresponse.GetResponseStream)
        Dim thepage As String = postresreader.ReadToEnd
        MsgBox(thepage)
    Catch ex As WebException
        MsgBox(ex.Status.ToString & vbNewLine & ex.Message.ToString)
    End Try
End Sub
The problem is that tempCookie.GetCookies() isn't doing what you think it's doing. It never contacts the server; it essentially filters the cookies the CookieContainer already holds down to those that apply to the supplied URL and returns them as a CookieCollection. On a brand-new container that collection is empty. Instead, first make a request to a page that will give you the session token, then make the actual request for your data. So first request the page at P_Search, then re-use the CookieContainer that was bound to that request when you post to P_Results.
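To make the filtering behaviour concrete, here is a minimal console sketch that needs no network access. The cookie name web_session, its value, and the second cookie on example.com are placeholders added by hand purely for illustration; whether the real site delivers the session ID as a cookie with that name is an assumption.

```vbnet
Imports System
Imports System.Net

Module GetCookiesDemo
    Sub Main()
        Dim jar As New CookieContainer()
        Dim target As New Uri("https://taylor.yc.edu/BANPROD/pkgyc_yccsweb.P_Results")

        ' A new container holds nothing, so there is nothing for GetCookies to filter.
        Console.WriteLine("Cookies before any response: " & jar.GetCookies(target).Count)

        ' Simulate what a server response would store (hypothetical names and values).
        jar.Add(New Cookie("web_session", "EXAMPLEVALUE", "/", "taylor.yc.edu"))
        jar.Add(New Cookie("unrelated", "x", "/", "example.com"))

        ' GetCookies returns only the cookies that apply to the supplied URL.
        For Each c As Cookie In jar.GetCookies(target)
            Console.WriteLine(c.Name & "=" & c.Value)
        Next
    End Sub
End Module
```

In real use the container would be filled by the response to the first request rather than by hand, which is why that first request has to happen before anything can be read back.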
Instead of the HttpWebRequest object, however, let me point you to the WebClient class and my post here (http://stackoverflow.com/questions/2825377/how-can-i-get-the-webclient-to-use-cookies) about extending it to support cookies. You'll find that you can simplify your code a lot. Below is a full working VB2010 WinForms app that shows this. If you still want to use the HttpWebRequest object, this should at least give you an idea of what needs to be done, too:
Option Strict On
Option Explicit On

Imports System.Net

Public Class Form1
    Private Sub Form1_Load(sender As System.Object, e As System.EventArgs) Handles MyBase.Load
        'Create our webclient
        Using WC As New CookieAwareWebClient()
            'Set SSLv3
            System.Net.ServicePointManager.SecurityProtocol = Net.SecurityProtocolType.Ssl3
            'Create a session, ignore what is returned
            WC.DownloadString("https://taylor.yc.edu/BANPROD/pkgyc_yccsweb.P_Search")
            'POST our actual data and get the results
            Dim S = WC.UploadString("https://taylor.yc.edu/BANPROD/pkgyc_yccsweb.P_Results", "POST", "term_code=201130&search_type=K&keyword=math")
            Trace.WriteLine(S)
        End Using
    End Sub
End Class

Public Class CookieAwareWebClient
    Inherits WebClient

    'Shared by every request this client makes, so cookies carry over
    Private cc As New CookieContainer()
    'Address of the previous request, sent as the Referer of the next one
    Private lastPage As String

    Protected Overrides Function GetWebRequest(ByVal address As System.Uri) As System.Net.WebRequest
        Dim R = MyBase.GetWebRequest(address)
        If TypeOf R Is HttpWebRequest Then
            With DirectCast(R, HttpWebRequest)
                .CookieContainer = cc
                If Not lastPage Is Nothing Then
                    .Referer = lastPage
                End If
            End With
        End If
        lastPage = address.ToString()
        Return R
    End Function
End Class
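For anyone who prefers to stay with HttpWebRequest, the same two-step flow might look roughly like the sketch below. It is an illustration, not tested against the site: it assumes the server hands out the session through cookies on P_Search, it reuses the shortened form data from the example above, and it leaves out the ServicePointManager.SecurityProtocol line, which may need to be set to whatever the server actually accepts.

```vbnet
Imports System
Imports System.IO
Imports System.Net
Imports System.Text

Module TwoStepRequest
    Sub Main()
        ' One container shared by both requests carries the session cookies across.
        Dim jar As New CookieContainer()
        Const baseUrl As String = "https://taylor.yc.edu/BANPROD/pkgyc_yccsweb."

        ' Step 1: GET P_Search so the server can issue its session cookies.
        Dim getReq = DirectCast(WebRequest.Create(baseUrl & "P_Search"), HttpWebRequest)
        getReq.CookieContainer = jar
        Using getResp = DirectCast(getReq.GetResponse(), HttpWebResponse)
            ' The body is ignored; any cookies are already stored in jar.
        End Using

        ' Step 2: POST the form data to P_Results with the same container.
        Dim body As Byte() = Encoding.UTF8.GetBytes("term_code=201130&search_type=K&keyword=math")
        Dim postReq = DirectCast(WebRequest.Create(baseUrl & "P_Results"), HttpWebRequest)
        postReq.Method = "POST"
        postReq.ContentType = "application/x-www-form-urlencoded"
        postReq.CookieContainer = jar
        postReq.Referer = baseUrl & "P_Search"
        postReq.ContentLength = body.Length
        Using reqStream As Stream = postReq.GetRequestStream()
            reqStream.Write(body, 0, body.Length)
        End Using

        ' Read and print the result page.
        Using postResp = DirectCast(postReq.GetResponse(), HttpWebResponse)
            Using reader As New StreamReader(postResp.GetResponseStream())
                Console.WriteLine(reader.ReadToEnd())
            End Using
        End Using
    End Sub
End Module
```

The key point is the same as in the WebClient version: a new HttpWebRequest is created for each step, and only the CookieContainer is shared between them.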