How to read a specific number from an HTML page


Problem description



For example, how can I store the index value from this page in a variable?

http://ca.finance.yahoo.com/q;_ylt=Agfc5O8HHTlOLgX.q6V4HEtyzJpG;_ylu=X3oDMTFkdnZqMHBkBHBvcwMyBHNlYwN5ZmlNYXJrZXRTdW1tYXJ5RnJvbnRwYWdlBHNsawNzcHRzeA--?s=^GSPTSE

I am VERY NEW to programming, so I would really appreciate it if you explained every line. My point isn't just to get it done; I want to understand it.

Thank you very much in advance!

Solution

If you look at the source code of the web page, you'll find that the index number sits inside a span tag with a unique id: <span id="yfs_l10_^gsptse">13,702.33</span>.

This means that you can scrape the page and then single out that individual tag.

You need to start by connecting to the host and downloading the page. The way in which you do this depends on which language you're using. There are plenty of tutorials around - just search for "[language] web scraping".
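Since the question doesn't name a language, here is a minimal sketch of the download step in Python, using only the standard library. The helper name fetch_html, the User-Agent value and the 10-second timeout are my own assumptions for illustration, not anything the page requires.

```python
import urllib.request


def fetch_html(url, timeout=10.0):
    """Download a page and return its body as text (hypothetical helper)."""
    # Some sites reject requests without a browser-like User-Agent header,
    # so we send one. The exact value here is an arbitrary assumption.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})

    # urlopen connects to the host and sends the request; the "with" block
    # makes sure the connection is closed when we are done reading.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        raw_bytes = response.read()  # the page body, as raw bytes
        # Use the character set the server declares; assume UTF-8 if none.
        charset = response.headers.get_content_charset() or "utf-8"

    # Turn the bytes into a normal string; undecodable bytes are replaced
    # with a placeholder character instead of raising an error.
    return raw_bytes.decode(charset, errors="replace")
```

You would call it as html = fetch_html(url) with the page address from the question. Whether that address still serves the same page is something you'd have to check yourself.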

Then you need to create a Document Object Model (a tree of the page's tags that your program can search) from the HTML source code. Again, this depends on the language: it's easy in some and difficult in others. Once you've done that, simply search for the tag with an id of yfs_l10_^gsptse and grab its content.
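As a rough illustration of the "find the tag and grab the content" step in Python: the sketch below does not build a full Document Object Model. It swaps in the standard library's event-based HTMLParser, which calls a method for each tag and piece of text as it reads the page. (Third-party libraries such as Beautiful Soup give you a searchable tree instead.) The names SpanTextById and read_number are hypothetical, the sample HTML is just the span quoted above, and the code assumes the target span contains no nested span tags.

```python
from html.parser import HTMLParser


class SpanTextById(HTMLParser):
    """Collect the text inside the first <span> whose id equals target_id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.inside = False  # True while we are inside the span we want
        self.done = False    # True once that span has been closed
        self.chunks = []     # pieces of text found inside the span

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; dict() makes lookup easy.
        if not self.done and tag == "span" and dict(attrs).get("id") == self.target_id:
            self.inside = True

    def handle_endtag(self, tag):
        # Assumption: no nested <span> inside the target, so the first
        # closing </span> we see ends it.
        if self.inside and tag == "span":
            self.inside = False
            self.done = True

    def handle_data(self, data):
        if self.inside:
            self.chunks.append(data)


def read_number(html, target_id):
    """Return the number shown in the span with the given id, as a float."""
    parser = SpanTextById(target_id)
    parser.feed(html)   # walk through the HTML, triggering the handlers
    parser.close()
    text = "".join(parser.chunks).strip()
    if not text:
        raise ValueError("no span with id " + repr(target_id) + " was found")
    # "13,702.33" is not a valid float literal until the commas are removed.
    return float(text.replace(",", ""))


sample = '<div><span id="yfs_l10_^gsptse">13,702.33</span></div>'
index_value = read_number(sample, "yfs_l10_^gsptse")
print(index_value)  # prints 13702.33
```

With a real page you would pass the downloaded HTML string in place of sample. If the site changes the id or its markup, read_number raises a ValueError rather than returning a wrong number.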

Hope that helps - obviously there's a lot I haven't said, but it depends what language you want to use.
