Python: get all the contents of a website into an HTML file

Question

Could someone please help? I want to save all the contents of a URL to an HTML file. I also have to send a User-Agent header.

Recommended answer

Since I don't know which site you need to scrape, here are a few ways.

If the site has a JavaScript front end and you have to wait for it to load, I recommend the requests_html module, which has a method for rendering the content:

from requests_html import HTMLSession

url = "https://some-url.org"

with HTMLSession() as session:
    response = session.get(url)
    response.html.render() #  rendering JS code
    content = response.html.html #  full content

If the site doesn't use JavaScript for its front end, the requests module is a really good choice:

import requests

url = "https://some-url.org"

response = requests.get(url)
content = response.content #  html content in bytes()
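
The question also asks for a User-Agent header and for the result to end up in an HTML file, which the snippet above leaves out. A minimal sketch of both steps with requests follows. The helper names `fetch_html` and `save_html`, the User-Agent string, and the output file name are assumptions made for illustration, not part of the original answer.

```python
from pathlib import Path

import requests

# Hypothetical values for illustration; substitute your own.
URL = "https://some-url.org"
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64) my-scraper/0.1"


def fetch_html(url: str, user_agent: str, timeout: float = 10.0) -> bytes:
    """Download a page with a custom User-Agent header and return the raw bytes."""
    response = requests.get(
        url,
        headers={"User-Agent": user_agent},
        timeout=timeout,
    )
    # Raise on 4xx/5xx so an error response is not silently saved as the page.
    response.raise_for_status()
    return response.content


def save_html(content: bytes, path: str) -> Path:
    """Write the bytes to disk unchanged, so the page's own encoding is kept."""
    target = Path(path)
    target.write_bytes(content)
    return target
```

Calling `save_html(fetch_html(URL, USER_AGENT), "page.html")` would then write the downloaded page to `page.html` in the current directory. Writing `response.content` as bytes avoids guessing the page's text encoding.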

Otherwise you can use Selenium WebDriver, but it runs rather slowly.
