Find Specific Text Within HTML Tag in Python

Views: 558 | Published: 2020/9/20 8:09:23 | Tags: python, beautifulsoup, html-parsing

Question

I've tried a million different ways to parse out the Zestimate, but have yet to be successful.

Here's the HTML tag with the Zestimate info:

<span>
  <span tabindex="0" role="button">
    <span class="sc-bGbJRg iiEDXU ds-dashed-underline">
      Zestimate
    <sup>®</sup>
    </span>
  </span>
  :&nbsp;
  <span>$331,425</span>
</span>

Honestly, I thought this would get me close, but I get an empty list:

link = 'https://www.zillow.com/homedetails/1404-Clearwing-Cir-Georgetown-TX-78626/121721750_zpid/'
searched_word = '<span class="sc-bGbJRg iiEDXU ds-dashed-underline">Zestimate<sup>®</sup></span>'
test_page = requests.Session().get(link, headers=req_headers)
test_soup = BeautifulSoup(test_page.content, 'lxml')
results = test_soup('span',string='searched_word')
print(results)[0]

Recommended answer

To get the correct HTML from the site, add a User-Agent header to the request.

For example:

import requests
from bs4 import BeautifulSoup


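url = 'https://www.zillow.com/homedetails/1404-Clearwing-Cir-Georgetown-TX-78626/121721750_zpid/'
# Send a browser-like User-Agent so the site returns the full page HTML.
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0'}
soup = BeautifulSoup(requests.get(url, headers=headers).content, 'html.parser')

# Find the <h4> whose text contains "Home value", then read the next <p> after it.
# Note: soupsieve 2.1+ deprecates :contains in favor of :-soup-contains.
home_value = soup.select_one('h4:contains("Home value")').find_next('p').get_text(strip=True)
print(home_value)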

Prints:

$331,425
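
As a supplementary sketch, the snippet below runs against the HTML fragment from the question rather than the live page, so it needs no network access. It suggests why the original `string=` search comes back empty: when a tag name is given, BeautifulSoup compares `string=` against each tag's `.string`, which is `None` for a tag with more than one child (here the label span holds text plus a `<sup>`). The original code also passed the literal `'searched_word'` rather than the variable. One possible workaround is to search the text nodes themselves and then step to the next `<span>`. The names `SAMPLE_HTML` and `extract_zestimate` are hypothetical, and the markup of the live page may differ from this fragment.

```python
import re

from bs4 import BeautifulSoup

# Offline fixture: the HTML fragment quoted in the question (an assumption
# about the page structure; the live site may render something different).
SAMPLE_HTML = """
<span>
  <span tabindex="0" role="button">
    <span class="sc-bGbJRg iiEDXU ds-dashed-underline">
      Zestimate
    <sup>®</sup>
    </span>
  </span>
  :&nbsp;
  <span>$331,425</span>
</span>
"""


def extract_zestimate(html):
    """Return the text of the first <span> after the 'Zestimate' label, or None."""
    soup = BeautifulSoup(html, "html.parser")
    # With no tag name, string= matches text nodes, so nested markup
    # such as <sup> does not get in the way.
    label = soup.find(string=re.compile("Zestimate"))
    if label is None:
        return None
    value_tag = label.find_next("span")
    return value_tag.get_text(strip=True) if value_tag else None


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
# The label span has several children, so its .string is None and
# a tag-level string= filter finds nothing.
no_match = soup.find_all("span", string="Zestimate")
print(no_match)                        # []
print(extract_zestimate(SAMPLE_HTML))  # $331,425
```

Matching on the visible label instead of the generated class names (`sc-bGbJRg iiEDXU`) may hold up better if those class names change between page builds, though that is an assumption rather than something the source confirms.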
