Read delay in App Engine Datastore after put()


Question



I'm writing the code for a blog/news site. The main page has the 10 most recent articles, and there is also an archive section with all articles sorted by modification time, descending. In the archive section I use cursor-based pagination, and I cache results starting from the second page, since pages change only when a new article is published or an existing one goes back to drafts for some reason. Every page has 10 articles. So when a user hits an archive page with some number (not the first one), memcache is checked for that page number's results first. If the page is not there, memcache is checked for the cursor for that page, and then the results are fetched from the datastore using that cursor:

import web
from google.appengine.api import memcache
from google.appengine.datastore.datastore_query import Cursor

class archivePage:
    def GET(self, page):
        if not page:
            # First archive page: cached under a fixed key.
            articles = memcache.get('archivePage')
            if not articles:
                articles = fetchArticles()
                memcache.set('archivePage', articles)
        else:
            if int(page) == 0 or int(page) == 1:
                raise web.seeother('/archive')
            # Try the cached results for this page number first.
            articles = memcache.get('archivePage'+page)
            if not articles:
                # Fall back to the cached cursor, then to the
                # ArchivePageMapping entity in the datastore.
                pageCursor = memcache.get('ArchivePageMapping'+page)
                if not pageCursor:
                    pageMapping = ArchivePageMapping.query(
                        ArchivePageMapping.page == int(page)).get()
                    pageCursor = pageMapping.cursor
                    memcache.set('ArchivePageMapping'+page, pageCursor)
                articles = fetchArticles(cursor=Cursor(urlsafe=pageCursor))
                memcache.set('archivePage'+page, articles)
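
The question doesn't show fetchArticles() or the Article model it reads from. A minimal sketch of what they might look like, assuming an ndb model with a modified timestamp and an integer status flag (1 = published); the field names are guesses for illustration, not the asker's actual code:

from google.appengine.ext import ndb

class Article(ndb.Model):
    title = ndb.StringProperty()
    content = ndb.TextProperty()
    status = ndb.IntegerProperty()                  # assumed: 1 = published, 0 = draft
    modified = ndb.DateTimeProperty(auto_now=True)  # sort key for the archive

def fetchArticles(cursor=None, limit=10):
    # fetch_page() returns (results, next_cursor, more_flag);
    # only the articles themselves are needed here.
    articles, _, _ = (Article.query(Article.status == 1)
                      .order(-Article.modified)
                      .fetch_page(limit, start_cursor=cursor))
    return articles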

Every time a new article is created or the status of an existing article is changed (draft/published), I refresh the cached archive page results and cursors. I do this after saving the article to the datastore:

class addArticlePage:
    def POST(self):
        formData = web.input()
        if formData.title and formData.content:
            article = Article(title=formData.title,
                              content=formData.content,
                              status=int(formData.status))
            key = article.put()
            if int(formData.status) == 1:
                # The article was published: rebuild the archive caches.
                cacheArchivePages()
            raise web.seeother('/article/%s' % key.id())

def cacheArchivePages():
    # Rebuild the cached first page, then walk the remaining results
    # page by page, caching each page's articles and its cursor.
    articles, cursor, moreArticles = fetchArticlesPage()
    memcache.set('archivePage', articles)
    pageNumber = 2
    while moreArticles:
        # Persist the cursor for this page number so it survives
        # memcache eviction.
        pageMapping = ArchivePageMapping.query(
            ArchivePageMapping.page == pageNumber).get()
        if pageMapping:
            pageMapping.cursor = cursor.urlsafe()
        else:
            pageMapping = ArchivePageMapping(page=pageNumber,
                                             cursor=cursor.urlsafe())
        pageMapping.put()
        memcache.set('ArchivePageMapping'+str(pageNumber), cursor.urlsafe())
        articles, cursor, moreArticles = fetchArticlesPage(cursor=cursor)
        memcache.set('archivePage'+str(pageNumber), articles)
        pageNumber += 1
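
fetchArticlesPage() and the ArchivePageMapping model are not shown either. A hedged sketch, reusing the assumed Article model above; fetchArticlesPage() must return the (articles, cursor, more) triple that cacheArchivePages() unpacks:

class ArchivePageMapping(ndb.Model):
    # Maps an archive page number to the url-safe cursor that starts
    # that page, so cursors survive memcache eviction.
    page = ndb.IntegerProperty()
    cursor = ndb.StringProperty()

def fetchArticlesPage(cursor=None, limit=10):
    # ndb's fetch_page() already returns (results, next_cursor, more).
    return (Article.query(Article.status == 1)
            .order(-Article.modified)
            .fetch_page(limit, start_cursor=cursor))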

And here comes the problem. Sometimes (there is no pattern; it happens randomly) after refreshing the cache I get the same results and cursors for the archive pages as before the refresh. For example, I add a new article. It is saved in the datastore and appears on the front page and on the first page of the archive (the first page of the archive is not cached). But the other archive pages are not updated. I've tested my cacheArchivePages() function and it works as expected. Could it be that too little time passes between the put() that updates the datastore and the fetchArticlesPage() call in cacheArchivePages()? Maybe the write transaction hasn't finished yet, so I get old results? I tried using time.sleep() to wait a few seconds before calling cacheArchivePages(), and in that case I was not able to reproduce the behaviour, but time.sleep() doesn't seem like a good idea. In any case, I need to know the exact cause of this behaviour and how to deal with it.

Solution

You're most likely being hit by "eventually consistent queries". When using the HR (High Replication) datastore, queries may use slightly old data, and it takes a while for data written by put() to be visible to queries (there is no such delay for get() by key or id). The delay is typically measured in seconds but I don't think we guarantee an upper bound -- if you're hit by an unfortunate network partition it might be hours, I imagine.
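
To make that distinction concrete: a get() by key reads the entity directly and always sees the latest write, while a global query runs against indexes that may not have caught up yet. A small sketch, assuming the Article model from the question section above:

article = Article(title='Fresh', content='...', status=1)
key = article.put()

# Strongly consistent: a get() by key sees the entity just written.
assert key.get() is not None

# Eventually consistent: a global (non-ancestor) query may not return
# the new entity until the indexes catch up -- exactly the behaviour
# described in the question.
recent = Article.query(Article.status == 1).order(-Article.modified).fetch(10)
# 'article' may or may not appear in 'recent' right after the put().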

There are all sorts of partial solutions, from cheating when the author of the most recent write is viewing the query results, to using ancestor queries (which have their own share of limitations). You might simply give your cache a limited lifetime and update it on read instead of on write.
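
For completeness, a sketch of the ancestor-query option mentioned above: placing all articles in a single entity group makes queries over them strongly consistent, at the cost of the roughly one-write-per-second limit on an entity group. BLOG_ROOT and the function names here are made up for illustration:

# A synthetic parent key shared by every article.
BLOG_ROOT = ndb.Key('Blog', 'root')

def add_article(title, content, status):
    article = Article(parent=BLOG_ROOT, title=title,
                      content=content, status=int(status))
    return article.put()

def fetch_articles_consistent(limit=10):
    # Ancestor queries are strongly consistent: they always reflect
    # a preceding put() in the same entity group.
    return (Article.query(ancestor=BLOG_ROOT)
            .filter(Article.status == 1)
            .order(-Article.modified)
            .fetch(limit))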

Good luck!

