Datastore solution for tag search


Question

I have millions of items ordered by a precomputed score. Each item has many boolean attributes. Let's say there are about ten thousand possible attributes in total, with each item having a dozen or so of them.

I'd like to be able to request, in real time (within a few milliseconds), the top n items for almost any combination of attributes.

What solution would you recommend? I am looking for something extremely scalable.

--
- We are currently looking at MongoDB with an index on an array field. Do you see any limitations?
- Solr is a possible solution, but we do not need text search capabilities.

Recommended answer

MongoDB can handle what you want if you store your objects like this:

{ score:2131, attributes: ["attr1", "attr2", "attr3"], ... }

Then the following query will match all the items that have both attr1 and attr2:

c = db.mycol.find({ attributes: { $all: [ "attr1", "attr2" ] } })

This query, however, will not match that item, because it does not have attr4:

c = db.mycol.find({ attributes: { $all: [ "attr1", "attr4" ] } })
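To make the $all semantics concrete, here is a minimal in-memory sketch in plain JavaScript. It does not talk to MongoDB; `matchesAll` and the sample `items` are hypothetical names introduced only for illustration. An item matches only when every requested attribute is present in its attributes array.

```javascript
// Hypothetical sample data shaped like the stored documents above.
const items = [
  { score: 2131, attributes: ["attr1", "attr2", "attr3"] },
  { score: 1500, attributes: ["attr1", "attr4"] },
];

// Mimics the meaning of { attributes: { $all: wanted } }:
// true only if every wanted attribute appears in the item's array.
function matchesAll(item, wanted) {
  return wanted.every((attr) => item.attributes.includes(attr));
}

// Items that carry both attr1 and attr2.
const both = items.filter((item) => matchesAll(item, ["attr1", "attr2"]));

// Items that carry both attr1 and attr4; the first item is excluded.
const mixed = items.filter((item) => matchesAll(item, ["attr1", "attr4"]));
```

Note that the order of the requested attributes does not matter, only their presence.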

The query returns a cursor. If you want the results sorted, add the sort parameters to the query like so:

c = db.mycol.find({ attributes: { $all: [ "attr1", "attr2" ] }}).sort({score:1})
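Since the question asks for the top n items, the sorted cursor would normally be combined with a limit. The sketch below models that in plain JavaScript over an in-memory array, assuming that a higher score is better (the question does not say which direction counts as "top"). `topN` and `sample` are hypothetical names. In the mongo shell, the corresponding step would be to chain .sort({score: -1}).limit(n) on the cursor.

```javascript
// Hypothetical helper: keep items having all wanted attributes, sort by
// score descending (assumption: higher score is better), return the first n.
function topN(items, wanted, n) {
  return items
    .filter((item) => wanted.every((a) => item.attributes.includes(a)))
    .sort((x, y) => y.score - x.score)
    .slice(0, n);
}

// Hypothetical sample data.
const sample = [
  { score: 10, attributes: ["attr1", "attr2"] },
  { score: 30, attributes: ["attr1", "attr2", "attr3"] },
  { score: 20, attributes: ["attr1"] },
  { score: 5, attributes: ["attr1", "attr2"] },
];

// The two best-scoring items that have both attr1 and attr2.
const best = topN(sample, ["attr1", "attr2"], 2);
```

If a lower score is better in your data, flip the comparison (and use {score: 1} in the shell).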

Have a look at Advanced Queries to see what else is possible.

Appropriate indexes can be set up as follows:

db.mycol.createIndex({attributes:1, score:1})  // createIndex replaces the deprecated ensureIndex
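As a rough mental model (not a description of MongoDB's internals), an index on {attributes: 1, score: 1} over an array field behaves like an inverted index: one entry per attribute value per item, kept in score order. The sketch below builds such a structure in plain JavaScript; `buildIndex`, `lookup`, and `docs` are hypothetical names. It finds candidates through one attribute and checks the remaining attributes against each candidate, so in this sketch the cost depends on how common the first attribute is. Whether MongoDB plans a given query the same way is something explain() can confirm.

```javascript
// Build attribute -> list of items, each list sorted by ascending score
// (mirrors the {attributes: 1, score: 1} key order; purely illustrative).
function buildIndex(items) {
  const index = new Map();
  for (const item of items) {
    for (const attr of item.attributes) {
      if (!index.has(attr)) index.set(attr, []);
      index.get(attr).push(item);
    }
  }
  for (const list of index.values()) list.sort((a, b) => a.score - b.score);
  return index;
}

// Walk the list for the first wanted attribute and filter by the rest.
function lookup(index, wanted) {
  const [first, ...rest] = wanted;
  const candidates = index.get(first) || [];
  return candidates.filter((item) =>
    rest.every((attr) => item.attributes.includes(attr))
  );
}

// Hypothetical sample data.
const docs = [
  { score: 3, attributes: ["attr1", "attr2"] },
  { score: 1, attributes: ["attr1", "attr2", "attr3"] },
  { score: 2, attributes: ["attr2"] },
];
const idx = buildIndex(docs);
const hits = lookup(idx, ["attr1", "attr2"]);
```

Because each list is already in score order, the filtered result comes back sorted without a separate sort step.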

You can inspect how a query is executed using explain():

db.mycol.find({ attributes: { $all: [ "attr1" ] }}).explain()

explain() reports how many objects were scanned, how long the operation took, and various other statistics.
