Get a value off a web page

Question

Good day all.

The URL below has summary data for 2012-12-30:
http://howickweather.info/summaries/day.php?d=20121230

I am trying to get the Mean temperature and Windrun values using the code below.
The problem is that I only get

+ ''
+ 'Mean temperature'
+ '' + getVal("meanTemp", 1) + '°C '
+ 'Windrun'
+ '' + getVal("windrun", 1) + ' km '
+ ''
+ ''

not the actual values (yet they show on the web page).

Any help greatly appreciated - I've spent hours trying to resolve this. Thanks, Ian.

Imports System.Net
Imports System.IO
Imports System.Xml
Imports System.Text
Imports System.Diagnostics


Public Class Form1
    Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
        main()
    End Sub

    ' Downloads the raw page source and shows it in RichTextBox1.
    Sub main()
        Const URLString As String = "http://howickweather.info/summaries/day.php?d=20121230"

        ' Explicit casts so the code also compiles with Option Strict On.
        Dim SourceRequest As HttpWebRequest = CType(WebRequest.Create(URLString), HttpWebRequest)
        Dim SourceCode As String = String.Empty

        ' Using blocks dispose the response and the reader once the download is done.
        Using SourceResponse As HttpWebResponse = CType(SourceRequest.GetResponse(), HttpWebResponse)
            Using SourceStream As New StreamReader(SourceResponse.GetResponseStream())
                SourceCode = SourceStream.ReadToEnd()
            End Using
        End Using

        RichTextBox1.Text = SourceCode.Trim()
    End Sub
End Class

Recommended answer

Please see my comment to the question.

Using HttpWebRequest, you obtain a document. If it is an HTML document, you will need to parse it. If the document happens to be well-formed XML, you can parse it with one of the .NET XML parsers. Unfortunately, not all Web pages are like that, so you may need an HTML parser that does not require well-formed XML. Try this one: http://www.majestic12.co.uk/projects/html_parser.php
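
To illustrate the well-formed case only, here is a minimal sketch that uses XDocument from System.Xml.Linq to pull the text of an element by its id. The sample markup, the id names and the numbers in it are invented for illustration, and GetTextById is a hypothetical helper. None of this is taken from the real page. Note also that the source shown in the question looks like JavaScript string concatenation around getVal(...), which suggests the page fills in its values client-side. If so, the numbers will not be in the downloaded HTML at all, and this sketch does not cover that situation.

```vbnet
Imports System
Imports System.Linq
Imports System.Xml.Linq

Module ScrapeSketch
    ' Hypothetical helper: returns the trimmed text of the first element whose
    ' id attribute equals elementId, or Nothing when there is no such element.
    ' XDocument.Parse throws an XmlException if the markup is not well-formed.
    Function GetTextById(markup As String, elementId As String) As String
        Dim doc As XDocument = XDocument.Parse(markup)
        Dim hit As XElement = doc.Descendants().FirstOrDefault(
            Function(el) el.Attribute("id") IsNot Nothing AndAlso
                         el.Attribute("id").Value = elementId)
        Return If(hit Is Nothing, Nothing, hit.Value.Trim())
    End Function

    Sub Main()
        ' Invented, well-formed fragment with placeholder numbers;
        ' the real page's markup is an unknown here.
        Dim sample As String =
            "<table>" &
            "<tr><td>Mean temperature</td><td id=""meanTemp"">18.4</td></tr>" &
            "<tr><td>Windrun</td><td id=""windrun"">212.5</td></tr>" &
            "</table>"

        Console.WriteLine(GetTextById(sample, "meanTemp"))
        Console.WriteLine(GetTextById(sample, "windrun"))
    End Sub
End Module
```

For pages that are not well-formed, XDocument.Parse will throw, which is where a tolerant HTML parser such as the one linked above comes in.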

By the way, this is called Web scraping: http://en.wikipedia.org/wiki/Web_scraping

—SA

