How to get attribute from element using beautifulsoup?
Question
Here's a bit of HTML from a web page:
<bg-quote class="value negative" field="Last" format="0,0.00" channel="/zigman2/quotes/203558040/composite,/zigman2/quotes/203558040/lastsale" data-last-stamp="1624625999626" data-last-raw="671.68">671.68</bg-quote>
So I want to get the value of the attribute "data-last-raw", but the find() method seems to return None when searching for this element. Why is this, and how can I fix it?
My code and traceback are below:
import requests
from bs4 import BeautifulSoup as BS
import tkinter as tk

class Scraping:

    @classmethod
    def get_to_site(cls, stock_name):
        sitename = 'https://www.marketwatch.com/investing/stock/tsla' + stock_name
        site = requests.get(sitename, headers={
            "Accept":"text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
            "Accept-Encoding":"gzip, deflate",
            "Accept-Language":"en-GB,en;q=0.9,en-US;q=0.8,ml;q=0.7",
            "Connection":"keep-alive",
            "Host":"www.marketwatch.com",
            "Referer":"https://www.marketwatch.com",
            "Upgrade-Insecure-Requests":"1",
            "User-Agent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.119 Safari/537.36"
        })
        print(site.status_code)
        src = site.content
        Scraping.get_price(src)

    @classmethod
    def get_price(cls, src):
        soup = BS(src, "html.parser")
        price_holder = soup.find("bg-quote", {"channel":"/zigman2/quotes/203558040/composite,/zigman2/quotes/203558040/lastsale"})
        price = price_holder["data-last-raw"]
        print(price)

Scraping.get_to_site('tsla')
200
Traceback (most recent call last):
  File "c:\Users\Aatu\Documents\python\pythonleikit\stock_price_scraper.py", line 41, in <module>
    Scraping.get_to_site('tsla')
  File "c:\Users\Aatu\Documents\python\pythonleikit\stock_price_scraper.py", line 30, in get_to_site
    Scraping.get_price(src)
  File "c:\Users\Aatu\Documents\python\pythonleikit\stock_price_scraper.py", line 36, in get_price
    price = price_holder["data-last-raw"]
TypeError: 'NoneType' object is not subscriptable
So site.status_code returns 200, indicating that the site was opened correctly, but I think soup.find() returns None, indicating that the element I was looking for was not found.
Could someone please help!
Recommended answer
import requests
from bs4 import BeautifulSoup

def main(ticker):
    r = requests.get(f'https://www.marketwatch.com/investing/stock/{ticker}')
    soup = BeautifulSoup(r.text, 'lxml')
    print(soup.select_one('bg-quote.value:nth-child(2)').text)

if __name__ == "__main__":
    main('tsla')
Output:
670.99
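The answer prints the element's text rather than the attribute the question asks about. To read "data-last-raw" itself, something like the following might work. It is only a sketch: it runs against the HTML fragment quoted in the question instead of the live page (whose markup may differ from what a browser shows), and the names `HTML` and `get_last_raw` are made up for illustration.

```python
from bs4 import BeautifulSoup

# HTML fragment copied from the question (not fetched from the live site).
HTML = (
    '<bg-quote class="value negative" field="Last" format="0,0.00" '
    'channel="/zigman2/quotes/203558040/composite,/zigman2/quotes/203558040/lastsale" '
    'data-last-stamp="1624625999626" data-last-raw="671.68">671.68</bg-quote>'
)


def get_last_raw(html):
    """Return the data-last-raw value of the first bg-quote that has one, else None.

    Hypothetical helper, not part of BeautifulSoup.
    """
    soup = BeautifulSoup(html, "html.parser")
    # Select by tag name plus presence of the attribute,
    # instead of matching the long "channel" value exactly.
    tag = soup.select_one("bg-quote[data-last-raw]")
    if tag is None:
        # Nothing matched; return None rather than subscripting None,
        # which is what raised the TypeError in the question.
        return None
    return tag.get("data-last-raw")


print(get_last_raw(HTML))
print(get_last_raw("<p>no quote here</p>"))
```

The first call prints 671.68 and the second prints None. Checking for None before indexing avoids the TypeError from the traceback. Separately, the question's code builds the URL as '.../stock/tsla' + stock_name, which gives '.../stock/tslatsla' for 'tsla'; that may be part of why the element was not found.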