Memoizing SQL queries
Question
Say I have a function that runs a SQL query and returns a dataframe:
import pandas.io.sql as psql
import sqlalchemy

query_string = "select a from table;"

def run_my_query(my_query):
    # username, host, port and database are hard-coded here
    engine = sqlalchemy.create_engine('postgresql://{username}@{host}:{port}/{database}'.format(username=username, host=host, port=port, database=database))
    df = psql.read_sql(my_query, engine)
    return df

# Run the query (this is what I want to memoize)
df = run_my_query(query_string)
I would like to:
- Be able to memoize my query above with one cache entry per value of query_string (i.e. per query).
- Be able to force a cache reset on demand (e.g. based on some flag), so that I can update my cache if I think that the database has changed.
Recommended answer
Yes, you can do this with joblib (this example basically pastes itself):
>>> from tempfile import mkdtemp
>>> cachedir = mkdtemp()
>>> from joblib import Memory
>>> # newer joblib releases take `location` in place of the older `cachedir` argument
>>> memory = Memory(location=cachedir, verbose=0)
>>> @memory.cache
... def run_my_query(my_query):
...     ...
...     return df
You can clear the cache with memory.clear().
Note that you could also use functools.lru_cache, or even do it "manually" with a simple dict:
def run_my_query(my_query, cache={}):
    # the default dict is created once, so it persists across calls
    if my_query in cache:
        return cache[my_query]
    ...
    cache[my_query] = df
    return df
You could clear the cache with run_my_query.__defaults__[0].clear() (run_my_query.func_defaults[0].clear() in Python 2). I'm not sure I'd recommend this, though; I just thought it was a fun example.