Beautiful Soup returns 'None'
Problem description
I am using the following code to extract data with Beautiful Soup:
import requests
import bs4
res = requests.get('https://www.jmu.edu/cgi-bin/parking_sign_data.cgi?hash=53616c7465645f5f5c0bbd0eccccb6fe8dd7ed9a0445247e3c7dcb4f91927f7ccc933be780c6e558afb8ebf73620c3e5e3b2c68cd3c138519068eac99d9bf30e1e67ce894deb3a054f95f882da2ea2f0|869835tg89dhkdnbnsv5sg5wg0vmcf4mfcfc2qwm5968unmeh5')
soup = bs4.BeautifulSoup(res.text, 'xml')
soup.find_all("span", class_="text")
I have tried different variations of the last line to get the program to display anything at all, but each time it returns "None" or an empty list. The only thing I can get to display is the entire HTML of the site, using print(soup.contents). The data I am trying to extract is the "Display" tag value within each of the signID tags. The data is clearly there when the entire HTML of the site is printed.
Additional information: The number I am trying to extract is the current number of spaces in a parking deck, so the website is updated by the second.
Additional information 2: This site is an iframe of https://www.jmu.edu/parking/. The data I am after is in the bottom right corner, under "commuter parking".
Recommended answer
I can see that you're trying to extract the Display tag value under each Sign tag. Hope this helps.
Code:
import requests
from bs4 import BeautifulSoup
res = requests.get('https://www.jmu.edu/cgi-bin/parking_sign_data.cgi?hash=53616c7465645f5f5c0bbd0eccccb6fe8dd7ed9a0445247e3c7dcb4f91927f7ccc933be780c6e558afb8ebf73620c3e5e3b2c68cd3c138519068eac99d9bf30e1e67ce894deb3a054f95f882da2ea2f0|869835tg89dhkdnbnsv5sg5wg0vmcf4mfcfc2qwm5968unmeh5')
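# The 'lxml' HTML parser lowercases tag names, so search for them in lowercase
soup = BeautifulSoup(res.text, 'lxml')
# Print the sign ID and display value for each <Sign> element
for data in soup.find_all('sign'):
    print(data.signid.text, data.display.text)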
Output:
1 442
2 442
3 442
4 Happy Holidays
5 Happy Holidays
I have shown the output for only 5 values; the full run gives 57 signId and Display values.
If you want only the Display values, you can use soup.find_all('display') directly. I used signId and Display in the example just for reference.
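As a rough offline illustration of the two points above, the sketch below runs against a small hard-coded sample instead of the live URL. The sample markup (the Signs, Sign, SignId and Display element names and their nesting) is an assumption inferred from the answer's code and output, not a copy of the real feed. It uses Python's built-in 'html.parser', which also lowercases tag names, so lxml is not needed to try it. It shows that a search for span elements comes back empty when the document has none, and that find_all('display') returns the values, which can then be filtered down to the numeric counts.

```python
from bs4 import BeautifulSoup

# Hypothetical sample shaped like the feed; the real tag layout is assumed.
SAMPLE = """
<Signs>
  <Sign><SignId>1</SignId><Display>442</Display></Sign>
  <Sign><SignId>2</SignId><Display>442</Display></Sign>
  <Sign><SignId>4</SignId><Display>Happy Holidays</Display></Sign>
</Signs>
"""

# html.parser ships with Python and lowercases tag names while parsing.
soup = BeautifulSoup(SAMPLE, "html.parser")

# The sample contains no <span> elements, so these searches find nothing:
no_spans = soup.find_all("span", class_="text")  # an empty result list
no_span = soup.find("span", class_="text")       # None

# Searching by the lowercased tag name returns the Display elements.
displays = [tag.get_text(strip=True) for tag in soup.find_all("display")]

# Keep only numeric values, since some signs show a message instead of a count.
counts = [int(text) for text in displays if text.isdigit()]

print(len(no_spans), no_span)
print(displays)
print(counts)
```

The numeric filter is only one possible choice; if the message signs matter, keep the raw displays list instead.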