How to get attribute from element using beautifulsoup?

Problem description

Here's a bit of HTML from a web page:

<bg-quote class="value negative" field="Last" format="0,0.00" channel="/zigman2/quotes/203558040/composite,/zigman2/quotes/203558040/lastsale" data-last-stamp="1624625999626" data-last-raw="671.68">671.68</bg-quote>

I want to get the value of the attribute "data-last-raw", but the find() method seems to return None when searching for this element. Why is this, and how can I fix it?

My code and the traceback are below:

import requests
from bs4 import BeautifulSoup as BS
import tkinter as tk


class Scraping:

    @classmethod
    def get_to_site(cls, stock_name):
        sitename = 'https://www.marketwatch.com/investing/stock/tsla' + stock_name
        site = requests.get(sitename, headers={
            "Accept":"text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
            "Accept-Encoding":"gzip, deflate",
            "Accept-Language":"en-GB,en;q=0.9,en-US;q=0.8,ml;q=0.7",
            "Connection":"keep-alive",
            "Host":"www.marketwatch.com",
            "Referer":"https://www.marketwatch.com",
            "Upgrade-Insecure-Requests":"1",
            "User-Agent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.119 Safari/537.36"
        })
        print(site.status_code)
        src = site.content
        Scraping.get_price(src)
        
    @classmethod
    def get_price(cls, src):
        soup = BS(src, "html.parser")
        price_holder = soup.find("bg-quote", {"channel":"/zigman2/quotes/203558040/composite,/zigman2/quotes/203558040/lastsale"})
        price = price_holder["data-last-raw"]
        print(price)



Scraping.get_to_site('tsla')


200
Traceback (most recent call last):
  File "c:\Users\Aatu\Documents\python\pythonleikit\stock_price_scraper.py", line 41, in <module>
    Scraping.get_to_site('tsla')
  File "c:\Users\Aatu\Documents\python\pythonleikit\stock_price_scraper.py", line 30, in get_to_site
    Scraping.get_price(src)
  File "c:\Users\Aatu\Documents\python\pythonleikit\stock_price_scraper.py", line 36, in get_price
    price = price_holder["data-last-raw"]
TypeError: 'NoneType' object is not subscriptable

site.status_code returns 200, which indicates that the request succeeded, but I think soup.find() returns None because the element I was looking for was not found.

Can someone please help?
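
The symptom described above can be reproduced offline. find() returns None when no tag matches the filter, and subscripting None raises the TypeError shown in the traceback. The sketch below parses an inline copy of the element from the question; the non-matching channel value and the helper name read_attr are assumptions made for illustration. Whether the HTML served to requests actually contains the same channel value that the browser shows is not established here.

```python
from bs4 import BeautifulSoup

# Inline copy of the element from the question, so no network access is needed.
HTML = (
    '<bg-quote class="value negative" field="Last" format="0,0.00" '
    'channel="/zigman2/quotes/203558040/composite,/zigman2/quotes/203558040/lastsale" '
    'data-last-stamp="1624625999626" data-last-raw="671.68">671.68</bg-quote>'
)

soup = BeautifulSoup(HTML, "html.parser")

# Hypothetical channel value that does not occur in HTML: find() returns None.
missing = soup.find("bg-quote", {"channel": "/no/such/channel"})

# Matching on a different attribute of the same element succeeds.
found = soup.find("bg-quote", {"field": "Last"})


def read_attr(tag, name):
    """Return the attribute value, or None if the tag or the attribute is absent."""
    return tag.get(name) if tag is not None else None


missing_value = read_attr(missing, "data-last-raw")
found_value = read_attr(found, "data-last-raw")
print(missing_value, found_value)
```

Checking for None before subscripting turns the crash into a condition the caller can handle. Note also that get_to_site appends stock_name to a URL that already ends in tsla, so the requested path ends in tslatsla.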

Recommended answer

import requests
from bs4 import BeautifulSoup


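def main(ticker):
    # Fetch the quote page for the given ticker symbol.
    r = requests.get(f'https://www.marketwatch.com/investing/stock/{ticker}')
    soup = BeautifulSoup(r.text, 'lxml')
    # Pick the bg-quote element with class "value" that is the second child of its parent, then print its text.
    print(soup.select_one('bg-quote.value:nth-child(2)').text)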


if __name__ == "__main__":
    main('tsla')

Output:

670.99
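
The answer prints the element's text, while the question asks for the data-last-raw attribute. A hedged sketch of reading the attribute directly is shown below, assuming the element carries data-last-raw as in the snippet from the question. The function name last_raw and the SAMPLE markup are hypothetical and only serve the illustration; the selector may need adjusting against the live page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup modelled on the element shown in the question.
SAMPLE = (
    '<div><span>Last</span>'
    '<bg-quote class="value negative" field="Last" '
    'data-last-raw="671.68">671.68</bg-quote></div>'
)


def last_raw(html):
    """Return data-last-raw as a float, or None if no such element exists."""
    soup = BeautifulSoup(html, "html.parser")
    # The attribute selector matches any bg-quote tag that has data-last-raw.
    tag = soup.select_one("bg-quote[data-last-raw]")
    if tag is None:
        return None
    # Tag objects support dictionary-style access to their attributes.
    return float(tag["data-last-raw"])


price = last_raw(SAMPLE)
print(price)
```

Selecting on the presence of the attribute avoids depending on the long channel value, and returning None keeps the not-found case explicit for the caller.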
