Datastore solution for tag search
Question
I have millions of items ordered by a precomputed score. Each item has many boolean attributes. Let's say there are about ten thousand possible attributes in total, with each item having about a dozen of them.
I'd like to be able to request, in real time (a few milliseconds), the top n items for almost any combination of attributes.
What solution would you recommend? I am looking for something extremely scalable.
- We are currently looking at MongoDB and array indexes; do you see any limitations?
- Solr is a possible solution, but we do not need text search capabilities.
Recommended answer
MongoDB can handle what you want if you store your objects like this:
{ score:2131, attributes: ["attr1", "attr2", "attr3"], ... }
Then the following query will match all the items that have both attr1 and attr2:
c = db.mycol.find({ attributes: { $all: [ "attr1", "attr2" ] } })
But this query will not match the item above, because it has no attr4:
c = db.mycol.find({ attributes: { $all: [ "attr1", "attr4" ] } })
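To make the `$all` rule concrete, here is a minimal in-memory sketch in plain JavaScript. It does not talk to MongoDB; `matchesAll` and `sampleItem` are hypothetical names introduced only for illustration, and the function mimics just the "array contains every listed value" rule for simple string values.

```javascript
// Hypothetical sample item, shaped like the stored object shown earlier.
const sampleItem = { score: 2131, attributes: ["attr1", "attr2", "attr3"] };

// Returns true only when every required attribute is present on the item,
// which is the rule { attributes: { $all: [...] } } applies to string arrays.
function matchesAll(item, required) {
  return required.every((attr) => item.attributes.includes(attr));
}

const hasBoth = matchesAll(sampleItem, ["attr1", "attr2"]);  // both present
const hasAttr4 = matchesAll(sampleItem, ["attr1", "attr4"]); // attr4 missing

console.log(hasBoth, hasAttr4);
```

The second check fails for the same reason the second query returns nothing for this item: one of the requested attributes is absent from its array.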
The query returns a cursor. If you want this cursor to be sorted, just add the sort parameters to the query like so:
c = db.mycol.find({ attributes: { $all: [ "attr1", "attr2" ] }}).sort({score:1})
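Since the question asks for the top n items, the sorted query would normally be combined with a limit. The sketch below models that idea in plain JavaScript with no database involved. `topN` and `catalog` are hypothetical names, and it assumes that a higher score means a better item (so it sorts descending), which the question does not actually state.

```javascript
// Hypothetical in-memory catalog; real data would live in the collection.
const catalog = [
  { score: 2131, attributes: ["attr1", "attr2", "attr3"] },
  { score: 900,  attributes: ["attr1", "attr2"] },
  { score: 3100, attributes: ["attr1", "attr2", "attr5"] },
  { score: 5000, attributes: ["attr2", "attr4"] },
];

// Keep items that carry every required attribute, order them by score
// (highest first, an assumption), and return at most n of them.
function topN(items, required, n) {
  return items
    .filter((item) => required.every((attr) => item.attributes.includes(attr)))
    .sort((a, b) => b.score - a.score) // filter() returned a copy, so the input is not mutated
    .slice(0, n);
}

const best = topN(catalog, ["attr1", "attr2"], 2);
console.log(best.map((item) => item.score));
```

In the mongo shell, the analogous step would be to chain `.sort({ score: -1 }).limit(n)` onto the `find` call; in a sort specification `1` means ascending and `-1` means descending.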
Have a look at Advanced Queries in the MongoDB documentation to see what's possible.
Appropriate indexes can be set up as follows:
db.mycol.createIndex({attributes:1, score:1})  // called ensureIndex in older shells
You can get performance information using explain():
db.mycol.find({ attributes: { $all: [ "attr1" ] }}).explain()
The explain() output reports how many objects were scanned, how long the operation took, and various other statistics.