Update CSV with new values every minute: sched, time vs. apscheduler

Question

I have the following code, designed to pull JSON data from a website and record it to a CSV file:

import json
import urllib.request

import pandas as pd

def rec_price():
    with urllib.request.urlopen('some_url') as url:
        data = json.loads(url.read().decode())
    df = pd.DataFrame(data)

    df1 = df[['bpi','time']]

    x = df1.loc['USD', 'bpi']['rate']
    y = df1.loc['updated', 'time']

    df2 = pd.DataFrame({'data': [x], 'time' : [y]}) 

    df2['time'] = pd.to_datetime(df2['time'])

    with open('out.csv', 'a') as f:
        df2.to_csv(f, header=False)

I would like to run this code every 60 seconds, indefinitely. It seems the two options available are to install apscheduler or to use Python's standard sched and time modules. What are the differences between the two? Is one better suited to the task? How would I implement it?

Recommended answer

from threading import Timer

t = None # It is advisable to have a Timer() saved globally

def refresh ():
    global t
    # Get your CSV and save it here, then:
    t = Timer(60, refresh)
    t.daemon = True
    t.start()

refresh()

Or:

import socket
from threading import Thread
from time import sleep
from urllib.error import URLError

def refresh():
    while True:
        try:
            rec_price()  # Get and save your CSV here
        except (URLError, socket.timeout):
            pass  # Network error (HTTPError is a subclass of URLError): try again next round
        except Exception:
            break  # Any other error ends the loop
        sleep(60)

# Daemon thread: it stops when the main program exits
Thread(target=refresh, daemon=True).start()
# Or just refresh() if you want your script to do just this and nothing else
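
One likely reason for keeping the Timer in a global in the first snippet is so that it can be cancelled later. A minimal sketch of the same re-arming idea wrapped in a small class with a stop() method follows; the name RepeatingTimer and its methods are hypothetical, not part of the standard library:

```python
import threading

class RepeatingTimer:
    """Re-arms a threading.Timer after each call until stop() is called."""

    def __init__(self, interval, action):
        self.interval = interval      # seconds between the end of one call and the next
        self.action = action          # callable taking no arguments
        self._timer = None
        self._stopped = False
        self._lock = threading.Lock()

    def _run(self):
        self.action()
        self._schedule()              # re-arm only after the action has finished

    def _schedule(self):
        with self._lock:
            if self._stopped:
                return
            self._timer = threading.Timer(self.interval, self._run)
            self._timer.daemon = True  # do not keep the program alive on exit
            self._timer.start()

    def start(self):
        self._schedule()

    def stop(self):
        with self._lock:
            self._stopped = True
            if self._timer is not None:
                self._timer.cancel()   # no effect if the timer has already fired
```

With the question's function this would be used as RepeatingTimer(60, rec_price).start(). Because the timers are daemon threads, the main program still has to stay alive for the updates to keep running.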

To complete my answer: the sched module does much the same thing as the code above, but it lets you queue any number of functions to be called at given times, and you can also assign priorities, which decide the order in which events due at the same time are run. In short, it emulates part of cron. For what you need, though, it would be overkill. You would have to set up an event to fire after a fixed delay, then re-add it after each execution, and so on. sched is useful when you have more than one function to fire at different intervals or with different arguments. To be honest, I would personally never use the sched module; it is too rough. Instead, I would adapt the code above to emulate sched's capabilities.
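
For comparison, the "set up an event, then re-add it after each execution" pattern with sched might look like the sketch below. The helper name run_every and its repeats parameter are assumptions made for illustration; only sched.scheduler, enter and run come from the standard library:

```python
import sched
import time

def run_every(interval, action, repeats=None):
    """Run action() every `interval` seconds using sched.

    repeats=None runs forever; a number limits the runs (handy for testing).
    Returns the number of times action() was called.
    """
    scheduler = sched.scheduler(time.monotonic, time.sleep)
    state = {"runs": 0}

    def tick():
        action()
        state["runs"] += 1
        if repeats is None or state["runs"] < repeats:
            # sched events are one-shot, so re-add the event after each run
            scheduler.enter(interval, 1, tick)

    scheduler.enter(interval, 1, tick)  # arguments: delay, priority, callable
    scheduler.run()                     # blocks until the event queue is empty
    return state["runs"]
```

For the question's case the call would be run_every(60, rec_price). Note that the next event is scheduled only after action() returns, so the real period is 60 seconds plus however long the download takes.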
