Iterating through a date range in Python


Question

Okay, so I am relatively new to programming and this has me absolutely stumped. I'm scraping data from a website, and the data changes every week. I want to run my scraping process once for each weekly update, starting back on 09-09-2015 and running up to the present.

I know how to do this easily by running through every number, like 0909, then 0910, then 0911, but that is not what I need, as it would send far too many pointless requests to the server.

This is the format of the URL: http://www.myexamplesite.com/?date=09092015

I know it's as simple as:

for i in range(startDate, endDate):
    url = 'http://www.myexamplesite.com/?date={}'.format(i)
    driver.get(url)

But one thing I've never been able to figure out is how to manipulate Python's datetime to accurately reflect the format the website uses.

i.e.: 09092015 09162015 09232015 09302015 10072015 ... 09272017

If all else fails, I only need to do this once, so it wouldn't take too long to skip the loop altogether, manually enter each date I wish to scrape, and then append all of my dataframes together. I'm mainly curious about how to manipulate datetime in this way for future projects that may require more data.

Recommended answer

A good place to start is the documentation for the datetime, date and timedelta objects.

First, let's construct our starting date and ending date (today):

>>> from datetime import date, timedelta
>>> start = date(2015, 9, 9)
>>> end = date.today()
>>> start, end
(datetime.date(2015, 9, 9), datetime.date(2017, 9, 27))

Now let's define the unit of increment, one day:

>>> day = timedelta(days=1)
>>> day
datetime.timedelta(days=1)

A nice thing about dates (date/datetime) and time deltas (timedelta) is that they can be added together:

>>> start + day
datetime.date(2015, 9, 10)
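
Subtraction works as well: the difference between two dates is a timedelta, which gives a quick way to estimate how many weekly requests a scrape will make. The sketch below assumes a fixed end date (the one shown in the session above) rather than date.today(), purely so that the numbers are reproducible.

```python
from datetime import date, timedelta

start = date(2015, 9, 9)
end = date(2017, 9, 27)  # assumed fixed end date, for reproducibility

span = end - start                 # date - date gives a timedelta
weeks = span // timedelta(days=7)  # timedelta // timedelta gives an int

print(span.days)   # 749
print(weeks + 1)   # 108 weekly dates, counting both endpoints
```

Because 749 is an exact multiple of 7, the end date itself falls on one of the weekly steps.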

We can also use format() to render that date in the MMDDYYYY form the site's URL expects (month first, then day, as in 09162015):

>>> "{date.month:02}{date.day:02}{date.year}".format(date=start+day)
'09102015'
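
An alternative worth knowing is strftime(), which formats a date from a pattern string. The dates listed in the question (09092015, 09162015, ...) put the month before the day, so the pattern here is "%m%d%Y". This is only a sketch of the standard-library call, not something taken from the original answer.

```python
from datetime import date, datetime

d = date(2015, 9, 16)

# %m = zero-padded month, %d = zero-padded day, %Y = four-digit year
stamp = d.strftime("%m%d%Y")
print(stamp)  # 09162015

# strptime parses the same layout back into a date
parsed = datetime.strptime(stamp, "%m%d%Y").date()
print(parsed == d)  # True
```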

So, when we put all this together:

from datetime import date, timedelta

start = date(2015, 9, 9)
end = date.today()
week = timedelta(days=7)

mydate = start
while mydate <= end:  # <= so that a step landing exactly on today is included
    # Month first, then day, to match the site's MMDDYYYY format
    print("{date.month:02}{date.day:02}{date.year}".format(date=mydate))
    mydate += week

we get a simple iteration over dates starting with 2015-09-09 and ending with today, incremented by 7 days (a week):

09092015
09162015
09232015
09302015
10072015
...
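
For reuse in later projects, the same loop can be wrapped in a generator and combined with the URL pattern from the question. This is a rough sketch: the names BASE_URL, weekly_dates and weekly_urls are made up for illustration, and the end date is fixed to keep the example short.

```python
from datetime import date, timedelta

# URL pattern taken from the question; the constant name is hypothetical
BASE_URL = "http://www.myexamplesite.com/?date={}"


def weekly_dates(start, end, step=timedelta(days=7)):
    """Yield dates from start to end (inclusive) in fixed steps."""
    current = start
    while current <= end:
        yield current
        current += step


def weekly_urls(start, end):
    """Build one URL per weekly date, formatted as MMDDYYYY."""
    return [BASE_URL.format(d.strftime("%m%d%Y"))
            for d in weekly_dates(start, end)]


urls = weekly_urls(date(2015, 9, 9), date(2015, 10, 7))
for url in urls:
    print(url)  # in the scraper, this is where driver.get(url) would go
```

Passing date.today() as the end argument would cover every week up to the present.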
