Beautiful Soup returns 'None'


Problem description

I am using the following code to extract data with Beautiful Soup:

import requests
import bs4
res = requests.get('https://www.jmu.edu/cgi-bin/parking_sign_data.cgi?hash=53616c7465645f5f5c0bbd0eccccb6fe8dd7ed9a0445247e3c7dcb4f91927f7ccc933be780c6e558afb8ebf73620c3e5e3b2c68cd3c138519068eac99d9bf30e1e67ce894deb3a054f95f882da2ea2f0|869835tg89dhkdnbnsv5sg5wg0vmcf4mfcfc2qwm5968unmeh5')
soup = bs4.BeautifulSoup(res.text, 'xml')
soup.find_all("span", class_="text")

I've tried different variations of the last line to get the program to display anything at all, but each time it returns "None" or an empty list. The only thing I can get to display is the entire HTML of the site, using print(soup.contents). The data I am trying to extract is the "Display" tag value within each of the signID tags. The data is clearly there when the entire HTML of the site is printed.

Additional Information: The number I am trying to extract is the current number of spaces in a parking deck, so the website is updated every second.

Additional Information 2: This site is embedded as an iframe in https://www.jmu.edu/parking/. The data I am after is in the bottom right corner, under "commuter parking".

URL: https://www.jmu.edu/cgi-bin/parking_sign_data.cgi?hash=53616c7465645f5f5c0bbd0eccccb6fe8dd7ed9a0445247e3c7dcb4f91927f7ccc933be780c6e558afb8ebf73620c3e5e3b2c68cd3c138519068eac99d9bf30e1e67ce894deb3a054f95f882da2ea2f0|869835tg89dhkdnbnsv5sg5wg0vmcf4mfcfc2qwm5968unmeh5

Recommended answer

I can see that you're trying to extract the Display tag value under each Sign tag. Hope this helps.

Code:

import requests
from bs4 import BeautifulSoup
# Fetch the parking sign feed
res = requests.get('https://www.jmu.edu/cgi-bin/parking_sign_data.cgi?hash=53616c7465645f5f5c0bbd0eccccb6fe8dd7ed9a0445247e3c7dcb4f91927f7ccc933be780c6e558afb8ebf73620c3e5e3b2c68cd3c138519068eac99d9bf30e1e67ce894deb3a054f95f882da2ea2f0|869835tg89dhkdnbnsv5sg5wg0vmcf4mfcfc2qwm5968unmeh5')
# The 'lxml' HTML parser lowercases tag names, so search with lowercase names
soup = BeautifulSoup(res.text, 'lxml')
# Print the sign ID and display value of every sign element
for data in soup.find_all('sign'):
    print(data.signid.text, data.display.text)

Output:

1  442
2  442
3  442
4 Happy Holidays
5 Happy Holidays

I have shown the output for only 5 values; the full run gives 57 signId and Display values.

If you want only the Display values, you can use soup.find_all('display') directly. I included both signId and Display in the example just for reference.
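The approach can be tried offline with a small sketch like the one below. The SAMPLE markup is a hypothetical stand-in for the live feed (the real element names, nesting, and values may differ), and extract_displays and extract_pairs are illustrative helper names, not part of Beautiful Soup. It uses the built-in 'html.parser' so that lxml is not required; like the 'lxml' HTML parser, it lowercases tag names. The last query suggests why the original attempt probably came back empty: a feed shaped like this contains no span elements with class "text".

```python
from bs4 import BeautifulSoup

# Hypothetical sample standing in for the live feed; the real
# element names and values may differ.
SAMPLE = """
<Signs>
  <Sign><SignId>1</SignId><Display>442</Display></Sign>
  <Sign><SignId>4</SignId><Display>Happy Holidays</Display></Sign>
</Signs>
"""

def extract_displays(markup):
    """Return the text of every display element."""
    # html.parser lowercases tag names, so search for 'display'
    soup = BeautifulSoup(markup, "html.parser")
    return [tag.get_text(strip=True) for tag in soup.find_all("display")]

def extract_pairs(markup):
    """Return (sign id, display) pairs, one per sign element."""
    soup = BeautifulSoup(markup, "html.parser")
    return [
        (sign.signid.get_text(strip=True), sign.display.get_text(strip=True))
        for sign in soup.find_all("sign")
    ]

displays = extract_displays(SAMPLE)
pairs = extract_pairs(SAMPLE)
# The query from the question: no span elements exist in this markup
span_hits = BeautifulSoup(SAMPLE, "html.parser").find_all("span", class_="text")

print(displays)
print(pairs)
print(span_hits)
```

If the 'xml' parser from the question is kept instead, tag names are case-sensitive, so the search strings would have to match the feed's own capitalization.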
