Need to create a dictionary from pairs of span tags wrapped in a container, using Beautiful Soup
Question
I am scraping listings from a website and have managed to get most of the fields working, except for the description box.
Here is the URL of one ad: https://eg.hatla2ee.com/en/car/honda/civic/3289785
Here is my code:
import time

import requests
from bs4 import BeautifulSoup

# df is defined earlier (not shown) and holds the listing links.
for link in df['New Carlist Unit 1_link']:
    url = requests.get(link)
    soup = BeautifulSoup(url.text, 'html.parser')

    ### Get title
    title = []
    try:
        title.append(soup.find('h1').text.strip())
    except Exception as e:
        pass

    ## Get price
    price = []
    try:
        price.append(soup.find('span', class_="usedUnitCarPrice").text.strip())
    except Exception as e:
        pass

    ## Get Description box
    label = []
    text = []
    try:
        for span in soup.find_all('span', class_="DescDataSubTit"):
            label.append(span.text.strip())
            text.append(span.find_next_sibling().text.strip())
    except Exception as e:
        pass

    print('*' * 100)
    print(title)
    print(price)
    print(label)
    print(text)
    time.sleep(1)
For some reason, I can't seem to collect all the span tags.
Here is my desired output:
{'Make': 'Honda'}
{'Model': 'Crosstour'}
{'Used since': '2012'}
{'Km': '0 Km'}
{'Transmission': 'automatic'}
{'City': 'Cairo'}
{'Color': 'Gold'}
{'Fuel': 'gas'}
Recommended answer
import requests
from bs4 import BeautifulSoup


def main(url):
    r = requests.get(url)
    soup = BeautifulSoup(r.content, 'html.parser')
    # Label spans inside the first description row.
    target = soup.select_one("div.DescDataRow").select("span.DescDataSubTit")
    for tar in target:
        # Pair each label with the text of the next span in the document.
        g = {tar.text: tar.find_next("span").get_text(strip=True)}
        print(g)


main("https://eg.hatla2ee.com/en/car/honda/civic/3289785")
Output:
{'Make': 'Honda'}
{'Model': 'Civic'}
{'Used since': '1990'}
{'Km': '1,500 Km'}
{'Transmission': 'automatic'}
{'City': 'Port Said'}
{'Color': 'Dark red'}
{'Fuel': 'gas'}
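One possible reason the loop in the question stops early (an assumption, not something the answer confirms) is that its try/except wraps the whole for loop, so the first label without a following sibling raises and silently ends the loop. The sketch below checks each label individually instead. The HTML is a hypothetical stand-in: only the class names DescDataRow and DescDataSubTit come from the code above, and the real page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration; the last label deliberately has no value span.
HTML = """
<div class="DescDataRow">
  <div><span class="DescDataSubTit">Make</span> <span>Honda</span></div>
  <div><span class="DescDataSubTit">Model</span> <span>Civic</span></div>
  <div><span class="DescDataSubTit">Km</span></div>
</div>
"""


def extract_pairs(html):
    """Map each label span to the text of its next sibling span, if any."""
    soup = BeautifulSoup(html, "html.parser")
    details = {}
    for label in soup.select("span.DescDataSubTit"):
        value = label.find_next_sibling("span")
        if value is None:
            # Skip this label only; the remaining labels are still processed.
            continue
        details[label.get_text(strip=True)] = value.get_text(strip=True)
    return details


details = extract_pairs(HTML)
print(details)
```

Handling the missing value per label keeps one incomplete row from discarding every row that follows it.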
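Alternatively, to collect everything into a single dictionary:
import requests
from bs4 import BeautifulSoup


def main(url):
    r = requests.get(url)
    soup = BeautifulSoup(r.content, 'html.parser')
    # Flatten the container's text into a list and keep the first 16 strings
    # (8 label/value pairs).
    target = [list(item.stripped_strings)
              for item in soup.select("div.DescDataContain")][0][:16]
    # Group consecutive strings into (label, value) pairs.
    print(dict(zip(*[iter(target)]*2)))


main("https://eg.hatla2ee.com/en/car/honda/civic/3289785")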
Output:
{'Make': 'Honda', 'Model': 'Civic', 'Used since': '1990', 'Km': '1,500 Km', 'Transmission': 'automatic', 'City': 'Port Said', 'Color': 'Dark red', 'Fuel': 'gas'}
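The expression zip(*[iter(target)]*2) in the second snippet can be hard to read. It passes the same iterator to zip twice, so each tuple consumes two consecutive items. A minimal sketch of the same idiom on a plain list follows; the helper name pair_up and the sample data are illustrative only.

```python
def pair_up(flat):
    """Turn [k1, v1, k2, v2, ...] into {k1: v1, k2: v2, ...}."""
    it = iter(flat)
    # Both zip arguments are the same iterator, so every tuple takes two
    # consecutive items. A trailing unpaired item is silently dropped.
    return dict(zip(it, it))


flat = ["Make", "Honda", "Model", "Civic", "Fuel", "gas"]
pairs = pair_up(flat)
print(pairs)
```

This only works when the flattened list strictly alternates labels and values, which is why the answer slices the list to its first 16 strings before pairing.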