Creating URLs in a loop
Problem description
I am trying to create a list of URLs using a for loop. It prints all the correct URLs, but it does not save them in a list. Ultimately, I want to download multiple files using urlretrieve.
for i, j in zip(range(0, 17), range(1, 18)):
if i < 8 or j < 10:
url = "https://Here is a URL/P200{}".format(i) + "-0{}".format(j) + ".xls"
print(url)
if i == 9 and j == 10:
url = "https://Here is a URL/P200{}".format(i) + "-{}".format(j) + ".xls"
print(url)
if i > 9:
if i > 9 or j < 8:
url = "https://Here is a URL/P20{}".format(i) + "-{}".format(j) + ".xls"
print(url)
The output of the above code is:
https://Here is a URL/P2000-01.xls
https://Here is a URL/P2001-02.xls
https://Here is a URL/P2002-03.xls
https://Here is a URL/P2003-04.xls
https://Here is a URL/P2004-05.xls
https://Here is a URL/P2005-06.xls
https://Here is a URL/P2006-07.xls
https://Here is a URL/P2007-08.xls
https://Here is a URL/P2008-09.xls
https://Here is a URL/P2009-10.xls
https://Here is a URL/P2010-11.xls
https://Here is a URL/P2011-12.xls
https://Here is a URL/P2012-13.xls
https://Here is a URL/P2013-14.xls
https://Here is a URL/P2014-15.xls
https://Here is a URL/P2015-16.xls
https://Here is a URL/P2016-17.xls
But this:
url
gives only:
'https://Here is a URL/P2016-17.xls'
How do I get all the URLs, not just the final one?
There are several things that could significantly simplify your code. First of all, this:
"https://Here is a URL/P200{}".format(i) + "-0{}".format(j) + ".xls"
could be simplified to this:
"https://Here is a URL/P200{}-0{}.xls".format(i, j)
And if you have at least Python 3.6, you could use an f-string instead:
f"https://Here is a URL/P200{i}-0{j}.xls"
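As a quick check (this snippet is not from the original answer; it reuses the placeholder URL from this post), both spellings produce the same string:

```python
i, j = 5, 6

# Old-style str.format and the equivalent f-string
old_style = "https://Here is a URL/P200{}-0{}.xls".format(i, j)
f_style = f"https://Here is a URL/P200{i}-0{j}.xls"

print(old_style == f_style)  # True
```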
Second of all, Python strings have a built-in zfill
method that automatically pads the left side with zeroes to a specified length. Additionally, range
starts from zero by default.
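For instance, a quick illustration of zfill (not part of the original answer):

```python
# zfill pads the string with zeros on the left until it reaches the given width.
print(str(3).zfill(2))   # 03
print(str(12).zfill(2))  # 12  (already two characters wide, so unchanged)
```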
So your entire original code is equivalent to:
for num in range(17):
first = str(num).zfill(2)
second = str(num + 1).zfill(2)
print(f'https://Here is a URL/P20{first}-{second}.xls')
Now, you want to actually use these URLs, not just print them out. You mentioned building a list, which can be done like so:
urls = []
for num in range(17):
first = str(num).zfill(2)
second = str(num + 1).zfill(2)
urls.append(f'https://Here is a URL/P20{first}-{second}.xls')
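The same list can also be built with a list comprehension, a common idiom for this pattern (a sketch, again using the placeholder URL from this post):

```python
# Build all 17 URLs in one expression; zfill handles the zero-padding.
urls = [
    f'https://Here is a URL/P20{str(num).zfill(2)}-{str(num + 1).zfill(2)}.xls'
    for num in range(17)
]

print(len(urls))  # 17
print(urls[0])    # https://Here is a URL/P2000-01.xls
```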
Based on your comments here and on your other question, you seem to be confused about what form you need these URLs to be in. Strings like this are already what you need. urlretrieve
accepts the URL as a string, so you don't need to do any further processing. See the example in the docs:
local_filename, headers = urllib.request.urlretrieve('http://python.org/')
html = open(local_filename)
html.close()
However, I would recommend not using urlretrieve, for two reasons:

- As the documentation mentions, urlretrieve is a legacy method that may become deprecated. If you're going to use urllib, use the urlopen method instead.
- As Paul Becotte mentioned in an answer to your other question: if you're looking to fetch URLs, I would recommend installing and using Requests instead of urllib. It's more user-friendly.
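If you do stay with urllib, a minimal urlopen sketch might look like this (fetch is a hypothetical helper name, not something from the original answer):

```python
from urllib.request import urlopen

def fetch(url):
    """Return the raw bytes of the response body for the given URL."""
    # urlopen returns a file-like response object; using it as a context
    # manager ensures the connection is closed automatically.
    with urlopen(url) as response:
        return response.read()
```

Saving one of the spreadsheets is then just a matter of writing the returned bytes to a file opened in 'wb' mode.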
Regardless of which method you choose, again, strings are fine. Here's code that uses Requests to download each of the specified spreadsheets to your current directory:
import requests
base_url = 'https://Here is a URL/'
for num in range(17):
first = str(num).zfill(2)
second = str(num + 1).zfill(2)
filename = f'P20{first}-{second}.xls'
xls = requests.get(base_url + filename)
with open(filename, 'wb') as f:
f.write(xls.content)