django querysets + memcached: best practices


Question

I'm trying to understand what happens during a django low-level cache.set(), particularly which parts of the queryset get stored in memcached.

First, am I interpreting the django docs correctly?

  • a queryset (python object) has/maintains its own cache
  • access to the database is lazy; even if the queryset.count is 1000, if I do an object.get for 1 record, then the dbase will only be accessed once, for that 1 record.
  • when accessing a django view via apache prefork MPM, every time a particular daemon instance X ends up invoking a particular view that includes something like "tournres_qset = TournamentResult.objects.all()", this results, each time, in a new tournres_qset object being created. That is, anything that may have been cached internally by a tournres_qset python object from a previous (tcp/ip) visit is not used at all by a new request's tournres_qset.

Now the questions about saving things to memcached within the view. Let's say I add something like this at the top of the view:

tournres_qset = cache.get('tournres', None)
if tournres_qset is None:
    tournres_qset = TournamentResult.objects.all()
    cache.set('tournres', tournres_qset, timeout)
# now start accessing tournres_qset
# ...

What gets stored during the cache.set()?

  • Does the whole queryset (python object) get serialized and saved?

  • Since the queryset hasn't been used yet to get any records, is this just a waste of time, since no particular records' contents are actually being saved in memcache? (Any future requests will get the queryset object from memcache, which will always start fresh, with an empty local queryset cache; access to the dbase will always occur.)

  • If the above is true, then should I just always re-save the queryset at the end of the view, after it's been used throughout the view to access some records, which will result in the queryset's local cache getting updated, and which should always get re-saved to memcached? But then, this would always result in once again serializing the queryset object. So much for speeding things up.

  • Or, does the cache.set() force the queryset object to iterate and access from the dbase all the records, which will also get saved in memcache? Everything would get saved, even if the view only accesses a subset of the query set?

I see pitfalls in all directions, which makes me think that I'm misunderstanding a whole bunch of things.

Hope this makes sense and appreciate clarifications or pointers to some "standard" guidelines. Thanks.

Solution

Querysets are lazy, which means they don't call the database until they're evaluated. One way they get evaluated is by being serialized (pickled), which is what cache.set does behind the scenes. So no, this isn't a waste of time: the entire contents of your TournamentResult table will be cached, if that's what you want. It probably isn't, though: if you filter the queryset further, Django will just go back to the database, which would make the whole thing a bit pointless. You should just cache the model instances you actually need.
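
To make the "serializing evaluates it" point concrete, here is a minimal stdlib-only sketch. It is a toy model, not Django's implementation: LazyQuery, ROWS, DB_HITS and fake_cache are hypothetical names invented for the illustration. The only behaviour it borrows from Django is the documented one that pickling a queryset loads all of its results first.

```python
import pickle

# Hypothetical stand-ins: ROWS plays the database table, DB_HITS records
# every simulated query, fake_cache plays memcached.
ROWS = [
    {"id": 1, "player": "ann", "score": 10},
    {"id": 2, "player": "bob", "score": 7},
    {"id": 3, "player": "ann", "score": 4},
]
DB_HITS = []
fake_cache = {}


class LazyQuery:
    """Toy lazy query with an internal result cache (not Django code)."""

    def __init__(self, conditions=None):
        self.conditions = dict(conditions or {})
        self._result_cache = None  # filled on first evaluation

    def _fetch_all(self):
        # Run the simulated query only if nothing has been fetched yet.
        if self._result_cache is None:
            DB_HITS.append(dict(self.conditions))
            self._result_cache = [
                row for row in ROWS
                if all(row[k] == v for k, v in self.conditions.items())
            ]

    def __iter__(self):
        self._fetch_all()
        return iter(self._result_cache)

    def __getstate__(self):
        # Assumption modelled on Django's documented behaviour:
        # pickling forces all results to be loaded first.
        self._fetch_all()
        return self.__dict__.copy()


results = LazyQuery()                            # lazy: no query has run yet
hits_before_set = len(DB_HITS)
fake_cache["tournres"] = pickle.dumps(results)   # serializing evaluates the query
hits_after_set = len(DB_HITS)

restored = pickle.loads(fake_cache["tournres"])
rows = list(restored)                            # served from the pickled result cache
hits_after_read = len(DB_HITS)

print(hits_before_set, hits_after_set, hits_after_read)
```

In this sketch the cached value carries every row, which is why caching only the instances a view actually needs tends to be the better trade.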

Note that the third point in your initial set isn't quite right, in that this has nothing to do with Apache or preforking. It's simply that a view is a function like any other, and anything defined in a local variable inside a function goes out of scope when that function returns. So a queryset defined and evaluated inside a view goes out of scope when the view returns the response, and a new one will be created the next time the view is called, i.e. on the next request. This is the case whichever way you are serving Django.

However, and this is important, if you do something like set your queryset to a global (module-level) variable, it will persist between requests. Most of the ways that Django is served, and this definitely includes mod_wsgi, keep a process alive for many requests before recycling it, so the value of the queryset will be the same for all of those requests. This can be useful as a sort of bargain-basement cache, but is difficult to get right because you have no idea how long the process will last, plus other processes are likely to be running in parallel which have their own versions of that global variable.
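
The difference between a local variable and a module-level one can be sketched in plain Python, with no Django involved. All names here (load_results, view_with_local, view_with_global, db_hits) are hypothetical, and each function call stands in for one request handled by the same process.

```python
db_hits = 0             # counts simulated database queries
_cached_results = None  # module-level: survives between calls in this process


def load_results():
    """Pretend to query the database."""
    global db_hits
    db_hits += 1
    return ["result-1", "result-2", "result-3"]


def view_with_local():
    # The local variable is discarded when the function returns,
    # so every call (request) queries again.
    results = load_results()
    return len(results)


def view_with_global():
    # The module-level variable persists for the life of the process,
    # so only the first call queries.
    global _cached_results
    if _cached_results is None:
        _cached_results = load_results()
    return len(_cached_results)


view_with_local()
view_with_local()
hits_after_local_calls = db_hits    # 2: one query per call

view_with_global()
view_with_global()
hits_after_global_calls = db_hits   # 3: only one extra query for both calls

print(hits_after_local_calls, hits_after_global_calls)
```

Nothing in this sketch ever refreshes _cached_results, and each worker process would hold its own copy, which is the difficulty described above.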

Updated to answer questions in the comment

Your questions show that you still haven't quite grokked how querysets work. It's all about when they are evaluated: if you call list() on a queryset, or iterate over it, that evaluates it, and it's at that point that the database call is made (I count serialization under iterating, here) and the results are stored in the queryset's internal cache. Slicing is a partial exception: slicing an unevaluated queryset usually just returns another lazy queryset, and only hits the database immediately if you use a step. So, if you've already evaluated your queryset and then set it to the (external) cache, that won't cause another database hit.

But every filter() operation on a queryset, even one that's already evaluated, means another database hit. That's because it modifies the underlying SQL query, so Django returns a new queryset with its own, empty internal cache, and goes back to the database when that new queryset is evaluated.
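
A toy sketch of that behaviour, again stdlib-only and not Django's real code (LazyQuery, ROWS and DB_HITS are hypothetical names): filter() returns a fresh object with an empty cache, so evaluating it queries again, whereas narrowing the already-fetched rows in Python does not.

```python
# Hypothetical stand-ins: ROWS plays the table, DB_HITS logs simulated queries.
ROWS = [
    {"id": 1, "player": "ann", "score": 10},
    {"id": 2, "player": "bob", "score": 7},
    {"id": 3, "player": "ann", "score": 4},
]
DB_HITS = []


class LazyQuery:
    """Toy lazy query with an internal result cache (not Django code)."""

    def __init__(self, conditions=None):
        self.conditions = dict(conditions or {})
        self._result_cache = None

    def filter(self, **extra):
        # A new query object: merged conditions, empty result cache.
        return LazyQuery({**self.conditions, **extra})

    def __iter__(self):
        if self._result_cache is None:
            DB_HITS.append(dict(self.conditions))  # simulated database hit
            self._result_cache = [
                row for row in ROWS
                if all(row[k] == v for k, v in self.conditions.items())
            ]
        return iter(self._result_cache)


everything = LazyQuery()
all_rows = list(everything)          # first evaluation: one hit
all_rows_again = list(everything)    # internal cache: no new hit
hits_before_filter = len(DB_HITS)

anns = everything.filter(player="ann")
ann_rows = list(anns)                # new query object: another hit
hits_after_filter = len(DB_HITS)

# Narrowing the rows already in memory costs no further query.
ann_rows_in_python = [row for row in all_rows if row["player"] == "ann"]
hits_after_python_filter = len(DB_HITS)

print(hits_before_filter, hits_after_filter, hits_after_python_filter)
```

Whether filtering in Python is worthwhile depends on how many rows are already in memory; for a large table, caching the narrower query's results is usually the more sensible choice.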
