Script for a changing URL


Question

I am having some trouble coding a process or script that would do the following:

I need to get data from this URL:

nomads.ncep.noaa.gov/dods/gfs_hd/gfs_hd20140430/gfs_hd_00z

But the file URLs change (the day and the model run change), so the script has to assume this base structure, with the following variables:

Y - Year 
M - Month
D - Day
C - Model Forecast/Initialization Hour
F - Model Frame Hour

Like this:

nomads.ncep.noaa.gov/dods/gfs_hd/gfs_hdYYYYMMDD/gfs_hd_CCz

The script would run and fill in the date (as YYYYMMDD, along with CC) using those variables.

So the task is to get a URL such as:

http://nomads.ncep.noaa.gov/dods/gfs_hd/gfs_hd20140430/gfs_hd_00z

These variables should correspond to the current date, giving a URL in the format:

http://nomads.ncep.noaa.gov/dods/gfs_hd/gfs_hdYYYYMMDD/gfs_hd_CCz

Can you please advise how to build these URLs and find the latest date available in this format? Whether it is a script or something with wget, I'm all ears. Thank you in advance.

Recommended answer

In Python, the requests library can be used to fetch the URLs.

You can generate the URL by combining the base URL string with a timestamp. The datetime module provides the datetime class for the time itself, the timedelta class for stepping it, and the strftime method for formatting the date as required.
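As a minimal sketch of the URL generation, the function below fills YYYYMMDD and CC from a datetime value using strftime. The names BASE and build_url are made up for this illustration, and the http:// scheme is taken from the full URL in the question.

```python
from datetime import datetime

# Base URL from the question; BASE and build_url are illustrative names.
BASE = "http://nomads.ncep.noaa.gov/dods/gfs_hd"


def build_url(run_time: datetime) -> str:
    """Fill YYYYMMDD and CC in the URL pattern from run_time."""
    day = run_time.strftime("%Y%m%d")   # YYYYMMDD
    cycle = run_time.strftime("%H")     # CC, two-digit hour
    return f"{BASE}/gfs_hd{day}/gfs_hd_{cycle}z"


if __name__ == "__main__":
    print(build_url(datetime(2014, 4, 30, 0)))
    # http://nomads.ncep.noaa.gov/dods/gfs_hd/gfs_hd20140430/gfs_hd_00z
```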

That is, start by getting the current time with datetime.datetime.now(), then in a loop subtract an hour (or whatever time step you think they are using) via timedelta and keep checking the URL with the requests library. The first URL that exists is the latest one, and you can then do whatever further processing you need with it.
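The loop described above could be sketched roughly as follows. Several things here are assumptions rather than part of the answer: the function names, the 72-step limit, using UTC for the current time (the "z" suffix conventionally denotes UTC), and treating an HTTP 200 from a plain GET as "the run exists", which may not hold for this server and is worth checking by hand first.

```python
from datetime import datetime, timedelta, timezone

BASE = "http://nomads.ncep.noaa.gov/dods/gfs_hd"  # from the question


def url_for(t: datetime) -> str:
    """Build the run URL for time t (YYYYMMDD and CC filled in)."""
    return f"{BASE}/gfs_hd{t:%Y%m%d}/gfs_hd_{t:%H}z"


def url_exists(url: str, timeout: float = 10.0) -> bool:
    """Assumed check: a GET that returns status 200 means the run exists."""
    import requests  # third-party; imported here so the rest runs without it

    try:
        return requests.get(url, timeout=timeout).status_code == 200
    except requests.RequestException:
        return False


def find_latest(now=None, exists=url_exists,
                step=timedelta(hours=1), max_steps=72):
    """Step backwards from now until a URL exists; None if nothing is found."""
    if now is None:
        now = datetime.now(timezone.utc)  # assumption: run hours are in UTC
    t = now.replace(minute=0, second=0, microsecond=0)
    for _ in range(max_steps):
        url = url_for(t)
        if exists(url):
            return url
        t -= step
    return None


if __name__ == "__main__":
    print(find_latest())
```

Passing the existence check in as a parameter keeps the search logic testable without network access. If the runs are only published at fixed hours, a larger step would cut down the number of requests, provided the starting time is first rounded down to one of those hours.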

If you need to scrape the contents of the page, scrapy works well for that.
