How do I retrieve text data from a URL?
Question
We are using the following code to try and parse some text data from the URL
below:
' Requires: Imports System.IO, System.Net, System.Text
Dim strURL As String = "http://pictures.sprintpcs.com/share.do?invite=VEL42hPQYYk34YgLaQPo&shareName=MMS&messageState=RETRIEVED"
' *** Establish the request
Dim loHttp As HttpWebRequest = DirectCast(WebRequest.Create(strURL), HttpWebRequest)
' *** Set properties
loHttp.Timeout = 10000 ' 10 secs
loHttp.UserAgent = "Code Sample Web Client"
' *** Retrieve the response
Dim loWebResponse As HttpWebResponse = DirectCast(loHttp.GetResponse(), HttpWebResponse)
Dim enc As Encoding = Encoding.GetEncoding(1252) ' Windows default code page
Dim loResponseStream As New StreamReader(loWebResponse.GetResponseStream(), enc)
Dim lcHtml As String = loResponseStream.ReadToEnd()
' *** Close the reader first, then the response it reads from
loResponseStream.Close()
loWebResponse.Close()
LogResponseStream.WriteLine(lcHtml)
The problem is that the response we get in code is incomplete compared with
what actually renders in the browser. The HTML we receive contains a JavaScript
function in the body, whereas what we really want is the result of that
function, which includes the data we need to capture. We confirmed this by
loading the page in Google Chrome, clicking on the text "100360" and choosing
"Inspect element", which shows the full page with the data we need,
specifically the following line:
<pre class="pre-longText-wrap">100360</pre>
Can anyone help us figure out how to get this "raw" page response? It may be
that the script takes a few seconds to respond and that we are only seeing
the initial page response. Thank you in advance for your input.
Recommended answer
Hello tbStrat,
Are you saying that the data you need to capture is generated by the JavaScript function inside the HTML?
If so, you are going about this the wrong way. The JavaScript has to be allowed to run, and that needs a browser. You are just making an HTTP request; nothing you are using runs the JavaScript function.
Can I suggest that you use the WebBrowser control? (http://msdn.microsoft.com/en-us/library/system.windows.forms.webbrowser.aspx) This is a WinForms control that encapsulates a browser. Drop a WebBrowser control on your form, navigate it to your page, then read the data once the page has finished downloading. You can even invoke scripts inside the downloaded page through the Document property (an HtmlDocument: http://msdn.microsoft.com/en-us/library/system.windows.forms.htmldocument.aspx).
The Document property of the WebBrowser has an InvokeScript method.
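As a rough illustration of the approach described above, here is a minimal sketch, not a tested solution for this particular page. It assumes a plain WinForms program; the names ScrapeForm, PageUrl, pollTimer and FindPreText are made up for the example, and the URL is the one from the question with its line wrap joined. Because the question suggests the script may take a few seconds, the sketch polls the document after DocumentCompleted instead of reading it once, and gives up after about 10 seconds.

```vbnet
Imports System
Imports System.Windows.Forms

' Hypothetical form that hosts a WebBrowser, lets the page's scripts run,
' and then looks for the <pre class="pre-longText-wrap"> element.
Public Class ScrapeForm
    Inherits Form

    ' URL from the question (assumed to be one continuous string).
    Private Const PageUrl As String = "http://pictures.sprintpcs.com/share.do?invite=VEL42hPQYYk34YgLaQPo&shareName=MMS&messageState=RETRIEVED"

    Private ReadOnly browser As New WebBrowser()
    Private ReadOnly pollTimer As New System.Windows.Forms.Timer()
    Private attempts As Integer = 0

    Public Sub New()
        browser.Dock = DockStyle.Fill
        browser.ScriptErrorsSuppressed = True
        Controls.Add(browser)

        AddHandler browser.DocumentCompleted, AddressOf OnDocumentCompleted
        AddHandler pollTimer.Tick, AddressOf OnPollTick
        pollTimer.Interval = 500 ' check twice per second

        browser.Navigate(PageUrl)
    End Sub

    Private Sub OnDocumentCompleted(sender As Object, e As WebBrowserDocumentCompletedEventArgs)
        ' The event can fire more than once (for example, once per frame).
        ' Starting a timer that is already running has no effect.
        pollTimer.Start()
    End Sub

    Private Sub OnPollTick(sender As Object, e As EventArgs)
        attempts += 1
        Dim found As String = FindPreText()
        ' Stop when the element appears or after 20 attempts (about 10 seconds).
        If found IsNot Nothing OrElse attempts >= 20 Then
            pollTimer.Stop()
            MessageBox.Show(If(found, "Element not found"))
        End If
    End Sub

    Private Function FindPreText() As String
        If browser.Document Is Nothing Then Return Nothing
        For Each el As HtmlElement In browser.Document.GetElementsByTagName("pre")
            ' The control exposes the HTML class attribute as "className".
            If el.GetAttribute("className") = "pre-longText-wrap" Then
                Return el.InnerText
            End If
        Next
        Return Nothing
    End Function

    <STAThread()>
    Public Shared Sub Main()
        ' The WebBrowser control requires a single-threaded apartment.
        Application.Run(New ScrapeForm())
    End Sub
End Class
```

If the page exposes a script function that produces the data, browser.Document.InvokeScript("functionName") could be called once the document has loaded. The function name depends on the page's own script, which is not shown in the question.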
Let me know if I've misunderstood.
Cheers,
Matt