How to web scrape a speed value?
Problem description
I was wondering how to scrape the speed value shown on the Fast.com website with Python.
I made an attempt; here is what I have so far:
import requests
from bs4 import BeautifulSoup
response = requests.get('https://fast.com/', headers = {"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_4) AppleWebKit/600.7.12 (KHTML, like Gecko) Version/8.0.7 Safari/600.7.12"})
soup = BeautifulSoup(response.text, 'lxml')
speed = soup.find('span', {'id' : 'speed-value'}).text
print(speed)
The output is always "0", and sometimes it gives me an error.
My goal is to get the speed number in MB/s as shown on the website after the scan.
What am I missing?
Recommended answer
In my experience, BeautifulSoup is better suited to static pages. requests only downloads the initial HTML and never runs the page's JavaScript, which is likely why the parsed value stays at "0". For dynamic pages I would recommend Selenium: it drives a real browser, so you can read elements after the JavaScript has loaded and run, which makes scraping a page like this easier.
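import time
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

# Selenium 4 API: the driver path is passed through Service, and locators use By
driver_path = r"C:\chromedriver.exe"
driver = webdriver.Chrome(service=Service(driver_path))
MBPS_CLASS = "speed-results-container"
driver.get("https://fast.com/")
while True:
    # Prints the current reading once per second; stop with Ctrl+C
    print(driver.find_elements(By.CLASS_NAME, MBPS_CLASS)[0].text)
    # driver.find_element(By.ID, "speed-value").text  # This works with ID also
    time.sleep(1)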
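The loop above prints readings forever and never decides when the test is done. One possible way to get a single final number is to poll until the displayed text stops changing. The helper below is a minimal sketch of that idea, not part of the original answer: the name `wait_for_stable_reading`, its parameters, and the rule "the same value several times in a row means finished" are all assumptions made for illustration, and a brief plateau in the middle of a test could fool it.

```python
import time
from typing import Callable, Optional


def wait_for_stable_reading(
    read: Callable[[], str],
    stable_reads: int = 5,
    interval: float = 1.0,
    timeout: float = 60.0,
) -> Optional[str]:
    """Poll read() until it returns the same non-placeholder text
    stable_reads times in a row; return None if timeout elapses first.

    Hypothetical helper: the stability rule is a heuristic, not
    something the page itself guarantees.
    """
    deadline = time.monotonic() + timeout
    last, streak = None, 0
    while time.monotonic() < deadline:
        value = read().strip()
        if value in ("", "0"):
            # Placeholder text seen before the measurement has started
            last, streak = None, 0
        elif value == last:
            streak += 1
        else:
            # The reading changed, so restart the count at this value
            last, streak = value, 1
        if streak >= stable_reads:
            return value
        time.sleep(interval)
    return None
```

To use it with Selenium, pass a callable that returns the element's text, for example a lambda wrapping the `speed-value` lookup from the answer, and call `driver.quit()` once a value comes back. Keeping the helper free of Selenium imports means it can be tried out with any function that returns a string.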