How do I retrieve text data from a URL?


Question

We are using the following code to try and parse some text data from the URL
below:
 
    Imports System.IO
    Imports System.Net
    Imports System.Text

    Dim strURL As String = "http://pictures.sprintpcs.com/share.do?invite=VEL42hPQYYk34YgLaQPo&shareName=MMS&messageState=RETRIEVED"

    ' *** Establish the request
    Dim loHttp As HttpWebRequest = DirectCast(WebRequest.Create(strURL), HttpWebRequest)

    ' *** Set properties
    loHttp.Timeout = 10000 ' 10 secs
    loHttp.UserAgent = "Code Sample Web Client"

    ' *** Retrieve request info headers
    Dim loWebResponse As HttpWebResponse = DirectCast(loHttp.GetResponse(), HttpWebResponse)

    ' Windows default code page
    Dim enc As Encoding = Encoding.GetEncoding(1252)
    Dim loResponseStream As New StreamReader(loWebResponse.GetResponseStream(), enc)

    ' Read the whole response body as text
    Dim lcHtml As String = loResponseStream.ReadToEnd()

    loResponseStream.Close()
    loWebResponse.Close()

    LogResponseStream.WriteLine(lcHtml)
 
The problem is that the response we get in code is incomplete compared with
what actually renders in the browser. The HTML we receive contains a JavaScript
function in the body, whereas what we really want is the result of that
function, which includes the data we need to capture. We confirmed this by
loading the page in Google Chrome, clicking on the text "100360" and choosing
"Inspect element", which shows the full rendered page with the data we need,
specifically the following line:
 
<pre class="pre-longText-wrap">100360</pre>
 
Can anyone help us figure out how to get this rendered page content rather than
the "raw" response? It may be that the script takes a few seconds to respond
and that we are only seeing the initial page response. Thank you in advance for
your input.

Recommended answer

Hello tbStrat,

Are you saying that you need to capture the data that is generated by the JavaScript function inside the HTML?

If so, you are going about this the wrong way. You need to let the JavaScript run, which requires a browser. You are just making an HTTP request, and nothing you are using executes the JavaScript function.

May I suggest that you use the WebBrowser control (http://msdn.microsoft.com/en-us/library/system.windows.forms.webbrowser.aspx)? It is a WinForms control that encapsulates a browser. Drop a WebBrowser control on your form, navigate it to your page, then read the data after the page has finished loading. You can even call scripts inside the downloaded page through the Document property (an HtmlDocument: http://msdn.microsoft.com/en-us/library/system.windows.forms.htmldocument.aspx).
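As a rough sketch of that suggestion (not a tested solution for this particular site), the form below hosts a WebBrowser, navigates to a page, and prints the text of every <pre> element once loading completes. The class name ScrapeForm and the PageUrl placeholder are hypothetical. If the page's script fills in the data some time after loading, DocumentCompleted may fire too early and you would need to wait or poll before reading the element.

```vbnet
Imports System
Imports System.Windows.Forms

' Hypothetical form that loads a page in a real browser engine
' so that the page's JavaScript gets a chance to run.
Public Class ScrapeForm
    Inherits Form

    ' Placeholder address (assumption); substitute the real page URL.
    Private Const PageUrl As String = "http://example.com/page"

    Private WithEvents browser As New WebBrowser()

    Public Sub New()
        browser.Dock = DockStyle.Fill
        browser.ScriptErrorsSuppressed = True
        Controls.Add(browser)
        browser.Navigate(PageUrl)
    End Sub

    Private Sub browser_DocumentCompleted(sender As Object,
            e As WebBrowserDocumentCompletedEventArgs) Handles browser.DocumentCompleted
        ' DocumentCompleted also fires for frames; only handle the top-level document.
        If Not e.Url.Equals(browser.Url) Then Return

        ' Read the rendered DOM rather than the raw HTTP response.
        For Each el As HtmlElement In browser.Document.GetElementsByTagName("pre")
            Console.WriteLine(el.InnerText)
        Next
    End Sub

    <STAThread()>
    Public Shared Sub Main()
        Application.Run(New ScrapeForm())
    End Sub
End Class
```

The WebBrowser control must live on an STA thread with a message loop, which is why the sketch starts it through Application.Run rather than calling Navigate from a plain console routine.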

The HtmlDocument returned by the WebBrowser's Document property has an InvokeScript method.
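A minimal sketch of calling InvokeScript, assuming the page has already finished loading and defines a global JavaScript function whose name you know. The helper name CallPageFunction is hypothetical, and the thread does not say what the page's function is called, so the name is passed in as a parameter.

```vbnet
Imports System.Windows.Forms

Public Module ScriptHelper
    ' Hypothetical helper: calls a global JavaScript function defined by the
    ' loaded page and returns its result as text, or Nothing if unavailable.
    Public Function CallPageFunction(browser As WebBrowser, functionName As String) As String
        ' Document is Nothing until a page has been loaded.
        If browser.Document Is Nothing Then Return Nothing

        ' InvokeScript runs the named script function inside the page.
        Dim result As Object = browser.Document.InvokeScript(functionName)
        Return If(result Is Nothing, Nothing, result.ToString())
    End Function
End Module
```

A call would typically go inside the DocumentCompleted handler so that the page's scripts are already defined when the function is invoked.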

Let me know if I've misunderstood.

Cheers,
Matt

