在Python中智能缓存昂贵的对象 [英] Smart caching of expensive objects in Python
问题描述
我有一个按顺序排列的图像目录.通常,我的代码将使用来自图像的顺序子集(例如图像5-10)的数据,访问这些图像的天真选项是:
I have a directory of images in order. Typically my code will be using data from a sequential subset of images (e.g. images 5-10), and the naive options for accessing these are:
-
创建一个包装器对象,该方法可以在需要时加载图像并读取我的数据(例如,像素值).这几乎没有内存开销,但是会很慢,因为每次都需要加载每个图像.
Create a wrapper object with a method that loads the image when needed and reads my data (e.g. a pixel value). This has little memory overhead but will be slow as it will need to load each image every time.
将所有图像存储在内存中.这将很快,但是显然我们可以存储多少图像是有限制的.
Store all the images in memory. This will be fast but obviously there's a limit to how many images we can store.
我想找到:
- 某些方法,通过该方法,我可以定义如何读取与索引或路径相对应的图像,然后使我可以进行访问,例如
magic_image_collection[index]
,而不必担心它将返回内存中的对象还是返回该对象.重新阅读.理想情况下,这会将适当的图像或n
最近访问的图像保留在内存中.
- Some method by which I can define how to read the image corresponding to an index or a path, and then allows me to access, say
magic_image_collection[index]
without me having to worry about whether it's going to return the object in memory or read it afresh. This would ideally keep the appropriate images or then
most recently accessed images in memory.
推荐答案
如果缺少键,则可以扩展默认字典并使用__missing__
方法来调用加载函数:
You can extend the default dict and use __missing__
method to call a loading function if the key is missing:
class ImageDict(dict):
def __missing__(self, key):
self[key] = img = self.load(key)
return img
def load(self, key):
# create a queue if not exist (could be moved to __init__)
if not hasattr(self, '_queue'):
self._queue = []
# pop the oldest entry in the list and the dict
if len(self._queue) >= 100:
self.pop(self._queue.pop(0))
# append this key as a newest entry in the queue
self._queue.append(key)
# implement image loading here and return the image instance
print 'loading', key
return 'Image for %s' % key
和输出(仅当键不存在时才进行加载.)
And the output (the loading happen only when the key doesn't exist yet.)
>>> d = ImageDict()
>>> d[3]
loading 3
'Image for 3'
>>> d[3]
'Image for 3'
>>> d['bleh']
loading bleh
'Image for bleh'
>>> d['bleh']
'Image for bleh'
一种演变是将仅N个last元素存储在dict中,并清除最旧的条目.您可以通过保留要订购的钥匙清单来实现它.
这篇关于在Python中智能缓存昂贵的对象的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!