How to cache a paginated Django queryset

Question

How do you cache a paginated Django queryset, specifically in a ListView?

I noticed one query was taking a long time to run, so I'm attempting to cache it. The queryset is huge (over 100k records), so I'm attempting to only cache paginated subsections of it. I can't cache the entire view or template because there are sections that are user/session specific and need to change constantly.

ListView has a couple of standard methods for retrieving the queryset: get_queryset(), which returns the non-paginated data, and paginate_queryset(), which filters it by the current page.

I first tried caching the query in get_queryset(), but quickly realized calling cache.set(my_query_key, super(MyView, self).get_queryset()) was causing the entire query to be serialized.

So then I tried overriding paginate_queryset() like:

import time
from functools import partial
from django.core.cache import cache
from django.views.generic import ListView

class MyView(ListView):

    ...

    def paginate_queryset(self, queryset, page_size):
        cache_key = 'myview-queryset-%s-%s' % (self.page, page_size)
        print 'paginate_queryset.cache_key:',cache_key
        t0 = time.time()
        ret = cache.get(cache_key)
        if ret is None:
            print 're-caching'
            ret = super(MyView, self).paginate_queryset(queryset, page_size)
            cache.set(cache_key, ret, 60*60)
        td = time.time() - t0
        print 'paginate_queryset.time.seconds:',td
        (paginator, page, object_list, other_pages) = ret
        print 'total objects:',len(object_list)
        return ret

However, this takes almost a minute to run, even though only 10 objects are retrieved, and every request shows "re-caching", implying nothing is being saved to the cache.

My settings.CACHES looks like:

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': '127.0.0.1:11211',
    }
}

and service memcached status shows memcached is running and tail -f /var/log/memcached.log shows absolutely nothing.

What am I doing wrong? What is the proper way to cache a paginated query so that the entire queryset isn't retrieved?

I think there may be a bug in either memcached or the Python wrapper. Django appears to support two different memcached backends, one using python-memcached and one using pylibmc. The python-memcached backend seems to silently hide the error raised when caching the paginate_queryset() value. When I switched to the pylibmc backend, I got an explicit error message, "error 10 from memcached_set: SERVER ERROR", tracing back to django/core/cache/backends/memcached.py in set, line 78.
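
One plausible cause of the SERVER ERROR (an assumption, not something confirmed in the question) is memcached's default 1 MB per-item limit. The tuple returned by paginate_queryset() includes the paginator, which holds the full queryset, so pickling it would serialize every record rather than just the current page. The sketch below uses plain dicts as stand-ins for model instances, and the helpers pickled_size and fits_item_limit are hypothetical names used only to illustrate a rough size check before caching:

```python
import pickle

# memcached's default per-item size limit (the -I option defaults to 1 MB).
MEMCACHED_DEFAULT_ITEM_LIMIT = 1024 * 1024


def pickled_size(value):
    """Hypothetical helper: number of bytes the pickled value occupies."""
    return len(pickle.dumps(value, protocol=pickle.HIGHEST_PROTOCOL))


def fits_item_limit(value, limit=MEMCACHED_DEFAULT_ITEM_LIMIT):
    """Rough check: would this value fit in a single memcached item?"""
    return pickled_size(value) < limit


# Stand-in data: plain dicts instead of model instances (an assumption).
all_rows = [{"id": i, "title": "record %d" % i} for i in range(100000)]
one_page = all_rows[:10]

full_fits = fits_item_limit(all_rows)   # the whole result set
page_fits = fits_item_limit(one_page)   # a single page of 10 rows
```

The size stored by a real cache backend can differ (pickle protocol, key overhead, optional compression), so this is only an estimate. It still suggests why caching a materialized page is far cheaper than caching anything that references the whole queryset.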

Recommended answer

You can extend the Paginator to support caching by a provided cache_key.

A blog post about the usage and implementation of such a CachedPaginator can be found here. The source code is posted at djangosnippets.org (here is a web archive link, because the original is not working).

However, I will post a slightly modified example of the original version, which caches not only the objects per page but also the total count (sometimes even the count can be an expensive operation).

from django.core.cache import cache
from django.utils.functional import cached_property
from django.core.paginator import Paginator, Page, PageNotAnInteger


class CachedPaginator(Paginator):
    """A paginator that caches the results on a page by page basis."""
    def __init__(self, object_list, per_page, orphans=0, allow_empty_first_page=True, cache_key=None, cache_timeout=300):
        super(CachedPaginator, self).__init__(object_list, per_page, orphans, allow_empty_first_page)
        self.cache_key = cache_key
        self.cache_timeout = cache_timeout

    @cached_property
    def count(self):
        """
            The original Paginator.count attribute in Django 1.8
            is not writable and can't be set manually, but we would like
            to override it when loading data from cache (instead of recalculating it).
            So we make it writable via @cached_property.
        """
        return super(CachedPaginator, self).count

    def set_count(self, count):
        """
            Override the paginator.count value (to prevent recalculation)
            and clear num_pages and page_range, whose values depend on it.
        """
        self.count = count
        # if somehow we have stored .num_pages or .page_range (which are cached properties)
        # this can lead to wrong page calculations (because they depend on paginator.count value)
        # so we clear their values to force recalculations on next calls
        try:
            del self.num_pages
        except AttributeError:
            pass
        try:
            del self.page_range
        except AttributeError:
            pass

    @cached_property
    def num_pages(self):
        """This is not writable in Django1.8. We want to make it writable"""
        return super(CachedPaginator, self).num_pages

    @cached_property
    def page_range(self):
        """This is not writable in Django1.8. We want to make it writable"""
        return super(CachedPaginator, self).page_range

    def page(self, number):
        """
        Returns a Page object for the given 1-based page number.

        This will attempt to pull the results out of the cache first, based on
        the requested page number. If not found in the cache,
        it will pull a fresh list and then cache that result + the total result count.
        """
        if self.cache_key is None:
            return super(CachedPaginator, self).page(number)

        # To avoid counting the queryset, we only validate that the
        # provided number is an integer. The rest of the validation
        # happens when we fetch fresh data, so if the number is invalid,
        # nothing is cached.
        # number = self.validate_number(number)
        try:
            number = int(number)
        except (TypeError, ValueError):
            raise PageNotAnInteger('That page number is not an integer')

        page_cache_key = "%s:%s:%s" % (self.cache_key, self.per_page, number)
        page_data = cache.get(page_cache_key)

        if page_data is None:
            page = super(CachedPaginator, self).page(number)
            # Cache not only the objects, but the total count too.
            page_data = (page.object_list, self.count)
            cache.set(page_cache_key, page_data, self.cache_timeout)
        else:
            cached_object_list, cached_total_count = page_data
            self.set_count(cached_total_count)
            page = Page(cached_object_list, number, self)

        return page
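
The cache_key passed to CachedPaginator has to differ for every variation of the underlying queryset (filters, ordering, and so on); otherwise one page of results could be served for a different query. A minimal sketch of one way to derive such a key follows. The helper build_cache_key and the example parameters are hypothetical and not part of the original snippet:

```python
import hashlib
from urllib.parse import urlencode


def build_cache_key(prefix, params):
    """Hypothetical helper: derive a short, stable key from filter parameters."""
    # Sort the items so the same filters in a different order give the same key.
    canonical = urlencode(sorted(params.items()))
    # Hashing keeps the key short and free of spaces, which memcached rejects.
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return "%s:%s" % (prefix, digest)


key_a = build_cache_key("myview-queryset", {"category": "books", "order": "-created"})
key_b = build_cache_key("myview-queryset", {"order": "-created", "category": "books"})
key_c = build_cache_key("myview-queryset", {"category": "music", "order": "-created"})
```

In a ListView, one way to wire this in (again a sketch, not the original author's code) is to set paginator_class = CachedPaginator and override get_paginator() so that it forwards cache_key and cache_timeout as extra keyword arguments to the paginator.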
