Web scraping program cannot find element which I can see in the browser

Question

I am trying to get the titles of the streams on https://www.twitch.tv/directory/game/Dota%202, using Requests and BeautifulSoup. I know that my search criteria are correct, yet my program does not find the elements I need.

Here is a screenshot showing the relevant part of the source code in the browser:

HTML source text:

<div class="tw-media-card-meta__title">
  <div class="tw-c-text-alt">
    <a class="tw-full-width tw-interactive tw-link tw-link--button tw-link--hover-underline-none tw-link--inherit" data-a-target="preview-card-title-link" href="/weplayesport_en">
      <div class="tw-align-items-start tw-flex">
        <h3 class="tw-ellipsis tw-font-size-5" title="NAVI vs HellRaisers | BO5 | ODPixel &amp; S4 | WeSave! Charity Play">NAVI vs HellRaisers | BO5 | ODPixel &amp; S4 | WeSave! Charity Play</h3>
      </div>
    </a>
  </div>
</div>

Here is my code:

import requests
from bs4 import BeautifulSoup

req = requests.get("https://www.twitch.tv/directory/game/Dota%202")

soup = BeautifulSoup(req.content, "lxml")

title_elems = soup.find_all("h3", attrs={"title": True})

print(title_elems)

When I run it, title_elems is just the empty list ([]).

Why is my program not finding the elements?

Recommended answer

The element you're interested in is dynamically generated, after the initial page load, which means that your browser executed JavaScript, made other network requests, etc. in order to build the page. Requests is just an HTTP library, and as such will not do those things.
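
To make the difference concrete, here is a minimal sketch that uses only the standard library. `STATIC_HTML` and `RENDERED_HTML` are hypothetical, heavily simplified stand-ins (not Twitch's actual markup): the first imitates the kind of skeleton a server might send before any JavaScript runs, the second imitates the DOM after the browser has built the page. The class and function names are made up for illustration.

```python
from html.parser import HTMLParser


class TitleCollector(HTMLParser):
    """Collect the title attribute of every <h3> tag that has one."""

    def __init__(self):
        super().__init__()
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            title = dict(attrs).get("title")
            if title is not None:
                self.titles.append(title)


def h3_titles(html):
    """Return the title attributes of all <h3 title=...> tags in html."""
    parser = TitleCollector()
    parser.feed(html)
    parser.close()
    return parser.titles


# Assumption: a simplified skeleton like the one an HTTP client receives.
# The content is meant to be filled in later by the script.
STATIC_HTML = """
<html><body>
<div id="root"></div>
<script src="/app.js"></script>
</body></html>
"""

# Assumption: a simplified version of the DOM after JavaScript has run.
RENDERED_HTML = """
<div id="root">
<h3 class="tw-ellipsis tw-font-size-5" title="NAVI vs HellRaisers | BO5">NAVI vs HellRaisers | BO5</h3>
</div>
"""

print(h3_titles(STATIC_HTML))    # no <h3> elements in the skeleton
print(h3_titles(RENDERED_HTML))  # the title is only present after rendering
```

A quick way to check which case applies to a real page is to look at `req.text` (or the browser's "View page source", as opposed to the element inspector) and search it for the text you expect to find.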

You could use a tool like Selenium, or perhaps even analyze the network traffic for the data you need and make the requests directly.
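
The Selenium route could look roughly like the sketch below. It assumes Selenium 4 and a locally installed Chrome, and it assumes the page still renders stream titles as `h3` elements with a `title` attribute, as in the question's HTML; the site's markup may have changed since. The names `TITLE_SELECTOR`, `clean_titles`, and `fetch_stream_titles` are made up for this example.

```python
TITLE_SELECTOR = "h3[title]"  # CSS equivalent of find_all("h3", attrs={"title": True})


def clean_titles(raw_titles):
    """Strip whitespace and drop empty or missing titles."""
    return [t.strip() for t in raw_titles if t and t.strip()]


def fetch_stream_titles(url, timeout=15):
    """Load url in a real browser, wait for the titles, and return them."""
    # Imported here so clean_titles stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Block until at least one matching element exists in the DOM;
        # raises TimeoutException if none appears within `timeout` seconds.
        elements = WebDriverWait(driver, timeout).until(
            EC.presence_of_all_elements_located((By.CSS_SELECTOR, TITLE_SELECTOR))
        )
        return clean_titles(el.get_attribute("title") for el in elements)
    finally:
        driver.quit()  # always close the browser, even on errors
```

Calling `fetch_stream_titles("https://www.twitch.tv/directory/game/Dota%202")` would then return a list of title strings, provided the selector still matches. The explicit wait matters here: reading the page immediately after `driver.get` can run into the same empty result, because the elements may not have been created yet.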
