BeautifulSoup does not see element, even though it is present on a page

Views: 32  Published: 2021/12/17 14:08:33  python web-scraping beautifulsoup

Problem description

I am trying to scrape listings from Airbnb. Every listing has its own ID. However, the output of the code below is None:

import requests, bs4

response = requests.get('https://www.airbnb.pl/s/Girona--Hiszpania/homes?refinement_paths%5B%5D=%2Fhomes&query=Girona%2C%20Hiszpania&checkin=2018-07-04&checkout=2018-07-25&allow_override%5B%5D=&ne_lat=42.40450221314142&ne_lng=3.3245690859736214&sw_lat=41.97668610374056&sw_lng=1.7960961855829964&zoom=10&search_by_map=true&s_tag=nrGiXgWC')  
soup = bs4.BeautifulSoup(response.text, "html.parser")

element = soup.find(id="listing-18354577")
print(element)

Why does the soup not see this element, even though it is already loaded on the page?

Is it in a container of some type I need to scrape differently?

Solution

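requests does not execute JavaScript, so it only returns the initial HTML; the listing elements are presumably added to the page by JavaScript afterwards. You can use Selenium to load the full page in a real browser and then pass the rendered source to bs4. For example, this works: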

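import bs4
from selenium import webdriver
from selenium.webdriver.chrome.service import Service

# Put the path to your chromedriver binary here (Selenium 4 takes it via Service)
driver = webdriver.Chrome(service=Service('path/to/chromedriver'))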
website = "https://www.airbnb.pl/s/Girona--Hiszpania/homes?refinement_paths%5B%5D=%2Fhomes&query=Girona%2C%20Hiszpania&checkin=2018-07-04&checkout=2018-07-25&allow_override%5B%5D=&ne_lat=42.40450221314142&ne_lng=3.3245690859736214&sw_lat=41.97668610374056&sw_lng=1.7960961855829964&zoom=10&search_by_map=true&s_tag=nrGiXgWC"
driver.get(website) 
html = driver.page_source
soup = bs4.BeautifulSoup(html, "html.parser")

element = soup.find(id="listing-18354577")
print(element)

driver.quit()  # close the browser once the page source has been parsed

Output

<div class="_1wq3lj" id="listing-18354577"> ...  #and many other data
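
Reading page_source immediately after get() can still race with the page's JavaScript, so an explicit wait may make this approach more reliable. The following is a minimal sketch, assuming Selenium 4; the helper names find_listing and fetch_rendered_html, the timeout value, and the two inline HTML strings are hypothetical and only illustrate the difference between the static and the rendered page.

```python
import bs4


def find_listing(html, listing_id):
    """Return the tag whose id is 'listing-<listing_id>', or None if absent."""
    soup = bs4.BeautifulSoup(html, "html.parser")
    return soup.find(id=f"listing-{listing_id}")


def fetch_rendered_html(url, listing_id, chromedriver_path, timeout=15):
    """Load url in Chrome and return the HTML once the listing element exists."""
    # Selenium is imported here so find_listing can be used without it installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome(service=Service(chromedriver_path))
    try:
        driver.get(url)
        # Block until the element is in the DOM; raises TimeoutException otherwise.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.ID, f"listing-{listing_id}"))
        )
        return driver.page_source
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()


# Illustrative static HTML, roughly what requests would see: no listing yet.
static_html = '<div id="root"></div><script src="app.js"></script>'
# Illustrative markup for the same page after JavaScript has run in a browser.
rendered_html = (
    '<div id="root"><div class="_1wq3lj" id="listing-18354577">Girona</div></div>'
)

print(find_listing(static_html, "18354577"))    # the element is missing here
print(find_listing(rendered_html, "18354577"))  # the element is found here
```

With a real page, the call would look like find_listing(fetch_rendered_html(website, "18354577", "path/to/chromedriver"), "18354577"), where website is the search URL from the answer above.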
