Using Python's BeautifulSoup Library to Parse Info in a Span HTML Tag

Views: 128 | Published: 2020/9/20 7:51:36 | Tags: python, html, web-scraping, beautifulsoup


Problem Description

I am writing a Python web scraper that grabs the price of a certain stock. At the end of my program there are a few print statements that I use to check the parsed HTML, so that I can pull the stock's price out of a specific HTML span tag. My question is: how do I do this? I have gotten as far as finding the correct span tag. I thought I could simply slice the string, but the stock price changes constantly, so I don't think that approach would hold up. I only recently started using BeautifulSoup, so any advice would be much appreciated.

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

# web scraping reference: http://altitudelabs.com/blog/web-scraping-with-python-and-beautiful-soup/

my_url = 'https://quotes.wsj.com/GRPS/options'
# opens a web connection and "downloads" a copy of the desired webpage
uClient = uReq(my_url)

# dumps the information read from the webpage into a variable for later use/parsing
page_html = uClient.read()
uClient.close()

page_soup = soup(page_html, "lxml")

#find the html location for the price of the stock
#<span id="quote_val">0.0008</span>

all_stock_info = page_soup.find("section",{"class":"sector cr_section_1"})
find_spans = all_stock_info.find("span",{"id":"quote_val"})
price = page_soup.findAll("span",{"id":"quote_val"})

#sanity checks to make sure the scraper is finding the correct info
print(all_stock_info)
print(len(all_stock_info))
print(len(price))
print(price)  #this gives me the right span, I just need to be able to parse 
              #the price of the stock between here (in this case 0.0008) no 
              #matter what the price is
print(all_stock_info.span)
print(find_spans)
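
One possible way to pull the text out of the span, offered as a minimal sketch rather than a definitive answer: a BeautifulSoup `Tag` exposes its inner text through `get_text()`, so no string slicing is needed and the result does not depend on how long the price is. The `sample_html` string and the `extract_price` helper below are hypothetical stand-ins introduced for illustration; the markup on the live page may differ from the span shown in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page, modeled on the span
# quoted in the question; the real page markup may differ.
sample_html = """
<section class="sector cr_section_1">
  <span id="quote_val">0.0008</span>
</section>
"""


def extract_price(html):
    """Return the text inside <span id="quote_val">, or None if it is absent.

    extract_price is an illustrative helper, not part of BeautifulSoup.
    """
    page_soup = BeautifulSoup(html, "html.parser")
    # find() returns a single Tag, or None when nothing matches
    price_tag = page_soup.find("span", {"id": "quote_val"})
    if price_tag is None:
        return None
    # get_text() returns the text between the opening and closing tags;
    # strip=True trims surrounding whitespace
    return price_tag.get_text(strip=True)


price_text = extract_price(sample_html)
print(price_text)

# Assumes the span holds a plain decimal number with no currency symbol
# or thousands separator.
if price_text is not None:
    print(float(price_text))
```

Applied to the variables in the question, this would mean calling `find_spans.get_text()` on the single tag. The `price` variable holds a list-like result set, so it would need indexing first, for example `price[0].get_text()`.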
