Python to Save Web Pages
Problem description
This is probably a very simple task, but I cannot find any help. I have a website whose URLs take the form www.xyz.com/somestuff/ID, and I have a list of the IDs I need information from. I was hoping for a simple script that would go to the site and download the (complete) web page for each ID, saving it under a simple name such as ID_whatever_the_default_save_name_is in a specific folder.
Can I run a simple Python script to do this for me? I could do it by hand, since there are only 75 different pages, but I was hoping to use this to learn how to do things like this in the future.
Recommended answer
Do you want just the HTML code for the website? If so, create a url variable holding the host site and append the page number as you go. Here is an example using http://www.notalwaysright.com:
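import os
import urllib.request

url = "http://www.notalwaysright.com/page/"
os.makedirs("Page", exist_ok=True)  # make sure the output folder exists
for x in range(1, 71):
    newurl = url + str(x)  # x is an int, so convert it before concatenating
    response = urllib.request.urlopen(newurl)
    # read() returns bytes, so open the file in binary write mode
    with open("Page/" + str(x), "wb") as p:
        p.write(response.read())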
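The same idea can be adapted to the list of IDs from the question. What follows is only a minimal sketch under a few assumptions: the names `fetch_page` and `save_pages`, the `.html` file suffix, and the `fetch` parameter are all hypothetical choices made for illustration, and the base URL is the placeholder from the question. Like the example above, it saves only the HTML source, not the images or stylesheets a browser's "save complete page" would include.

```python
import os
import urllib.request


def fetch_page(url):
    """Download a URL and return the raw response body as bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def save_pages(ids, base_url, out_dir, fetch=fetch_page):
    """Save one file per ID into out_dir and return the paths written.

    `fetch` is a hypothetical hook: any callable that maps a URL to bytes,
    which makes it possible to try the function without network access.
    """
    os.makedirs(out_dir, exist_ok=True)  # create the target folder if needed
    saved = []
    for page_id in ids:
        url = base_url + str(page_id)  # e.g. .../somestuff/ + ID
        path = os.path.join(out_dir, str(page_id) + ".html")
        with open(path, "wb") as f:  # the body is bytes, so write in binary
            f.write(fetch(url))
        saved.append(path)
    return saved
```

With a real list of IDs, a call such as `save_pages(ids, "http://www.xyz.com/somestuff/", "pages")` would write one file per ID into a `pages` folder. Passing the downloader in as `fetch` is just one possible design; it keeps the URL-building and file-writing logic separate from the network call.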