Parent->child relationships in App Engine Python (Bigtable)
Question
I'm still learning my lessons about data modeling in bigtable/nosql and would appreciate some feedback. Would it be fair to say that I should avoid parent->child relationships in my data modeling if I frequently need to deal with the children in aggregate across parents?
As an example, let's say I'm building a blog that will be contributed to by a number of authors, and each author has posts, and each post has tags. So I could potentially set up something like this:
class Author(db.Model):
    owner = db.UserProperty()

class Post(db.Model):
    owner = db.ReferenceProperty(Author,
                                 collection_name='posts')
    tags = db.StringListProperty()
As I understand it, this will create an entity group based on the Author parent. Does this cause inefficiency if I mostly need to query for Posts by tags, which I expect to cut across multiple Authors?
I understand doing a query on list properties can be inefficient. Let's say each post has about 3 tags on average, but could go all the way up to 7. And I expect my collection of possible tags to be in the low hundreds. Is there any benefit to altering that model to something like this?
class Author(db.Model):
    owner = db.UserProperty()

class Post(db.Model):
    owner = db.ReferenceProperty(Author,
                                 collection_name='posts')
    tags = db.ListProperty(db.Key)

class Tag(db.Model):
    name = db.StringProperty()
Or would I be better off doing something like this?
class Author(db.Model):
    owner = db.UserProperty()

class Post(db.Model):
    owner = db.ReferenceProperty(Author,
                                 collection_name='posts')

class Tag(db.Model):
    name = db.StringProperty()

class PostTag(db.Model):
    post = db.ReferenceProperty(Post,
                                collection_name='posts')
    tag = db.ReferenceProperty(Tag,
                               collection_name='tags')
And last question... what if my most common use case will be querying for posts by multiple tags? E.g., "find all posts with tags in {'apples', 'oranges', 'cucumbers', 'bicycles'}". Is one of these approaches more appropriate for a query that looks for posts that have any of a collection of tags?
Thanks, I know that was a mouthful. :-)
Something like the first or second approach is well suited to App Engine. Consider the following setup:
from google.appengine.ext import db

class Author(db.Model):
    owner = db.UserProperty()

class Post(db.Model):
    author = db.ReferenceProperty(Author,
                                  collection_name='posts')
    tags = db.StringListProperty()

class Tag(db.Model):
    # The case-normalized tag string is used as the entity's key_name.
    post_count = db.IntegerProperty()
If you use the string tag (case-normalized) as the Tag entity key_name, you can efficiently query for posts with a specific tag, or list the tags of a post, or fetch tag statistics:
post = Post(author=some_author, tags=['app-engine', 'google', 'python'])
post_key = post.put()

# call some method to increment post counts...
increment_tag_post_counts(post_key)

# get posts with a given tag:
matching_posts = Post.all().filter('tags =', 'google').fetch(100)

# or, posts that have both of two tags:
matching_posts = (Post.all()
                  .filter('tags =', 'google')
                  .filter('tags =', 'python')
                  .fetch(100))

# get the Tag entities (and their counts) for a post's tags:
tag_stats = Tag.get_by_key_name(post.tags)
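The counter-updating helper in the snippet above is left undefined. Here is a minimal sketch of the idea, written in plain Python so it runs without the App Engine SDK. A dict stands in for the Tag entities (key name -> post_count), and `normalize_tag` and `bump_tag_counts` are hypothetical names of my own, not part of the answer or of the `db` API. A real version would presumably load the post by its key and update each Tag entity inside a datastore transaction.

```python
def normalize_tag(tag):
    """Case-normalize a tag so 'Google' and ' google' map to one key name."""
    return tag.strip().lower()


def bump_tag_counts(tag_store, post_tags):
    """Add one to the post count of every distinct tag on a post.

    tag_store is a dict standing in for Tag entities: key name -> post_count.
    (Assumption: in the datastore this would be a transactional
    read-modify-write of each Tag entity.)
    """
    for key_name in set(normalize_tag(t) for t in post_tags):
        tag_store[key_name] = tag_store.get(key_name, 0) + 1
    return tag_store


tag_store = {}
bump_tag_counts(tag_store, ['app-engine', 'Google', 'python'])
bump_tag_counts(tag_store, ['google', 'GOOGLE'])  # duplicates count once per post
```

Normalizing before the lookup is what makes the tag string usable as a key name: every spelling of a tag lands on the same counter.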
The third approach requires additional queries or fetches for most basic operations, and it makes querying for multiple tags more difficult.
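On the "any of a collection of tags" part of the question: chained equality filters on a list property give you posts that have all of the tags, not any of them. For any-of, the `db` query API also accepts an `IN` operator (e.g. `filter('tags IN', [...])`), which, as far as I know, runs one equality query per value and merges the results. The sketch below only models those semantics in plain Python, with a list of dicts standing in for Post entities. The function names and sample data are made up for illustration and nothing here calls the datastore.

```python
def posts_with_tag(posts, tag):
    """One equality filter on a list property: match if any element equals tag."""
    return [p for p in posts if tag in p['tags']]


def posts_with_all_tags(posts, tags):
    """Chained equality filters: every listed tag must be present."""
    return [p for p in posts if all(t in p['tags'] for t in tags)]


def posts_with_any_tag(posts, tags):
    """Any-of: one single-tag query per tag, merged and de-duplicated by id."""
    seen = set()
    merged = []
    for tag in tags:
        for post in posts_with_tag(posts, tag):
            if post['id'] not in seen:
                seen.add(post['id'])
                merged.append(post)
    return merged


posts = [
    {'id': 1, 'tags': ['apples', 'python']},
    {'id': 2, 'tags': ['oranges', 'apples']},
    {'id': 3, 'tags': ['google']},
]

any_ids = [p['id'] for p in posts_with_any_tag(posts, ['apples', 'oranges', 'bicycles'])]
all_ids = [p['id'] for p in posts_with_all_tags(posts, ['apples', 'oranges'])]
```

With the string-list model the any-of case costs one query per tag. With a PostTag join model you would additionally have to fetch the posts behind each matching PostTag row.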