Elasticsearch terms query: excluding a large number of users

Problem description


I'm working on a Tinder-like app. To exclude profiles that the user has already swiped, I use a "must_not" query like this:

"must_not": [{"terms": {"swipedusers": ["userid1", "userid2", "userid3", …]}}]

What are the limits of this approach? Is it scalable, and would it still work when the swipedusers array contains 2000 user IDs? If there is a more scalable approach, I would be happy to know.

Solution

There is a better approach. It is called "terms lookup", and it works somewhat like the joins you would do in a relational database.

I could try to explain it here, but all the information you need is well documented on the official Elasticsearch page:

https://www.elastic.co/guide/en/elasticsearch/reference/5.0/query-dsl-terms-query.html#query-dsl-terms-lookup

The final solution uses two indices: one for the registered users and another to track each user's swipes. For each swipe, you update the document that holds the current user's swipes. This means appending elements to an array, which is another problem in Elasticsearch (a big one if you are using AWS managed Elasticsearch) that can only be solved with scripting. More info at https://www.elastic.co/guide/en/elasticsearch/guide/current/partial-updates.html#_using_scripts_to_make_partial_updates
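
As a rough sketch of the scripted partial update mentioned above, the Python below only builds a request body for the "_update" endpoint; it does not talk to a cluster. The field name swipedUserId is taken from the query in this answer. The helper name build_swipe_update, the exact Painless script text, and the use of "upsert" are my own assumptions for illustration. The key that holds the script text is "inline" in the 5.0 API and was renamed "source" in later versions, so it is left as a parameter.

```python
import json


def build_swipe_update(swiped_id, field="swipedUserId", script_key="inline"):
    """Build an _update request body that appends one ID to an array field.

    Hypothetical helper: names and defaults are assumptions, not part of
    the original answer.
    """
    # Painless script: append the ID only if it is not already in the array.
    script_text = (
        f"if (!ctx._source.{field}.contains(params.id)) "
        f"{{ ctx._source.{field}.add(params.id) }}"
    )
    return {
        "script": {
            "lang": "painless",
            script_key: script_text,
            "params": {"id": swiped_id},
        },
        # If the user's swipe document does not exist yet, create it
        # with the first swiped ID.
        "upsert": {field: [swiped_id]},
    }


body = build_swipe_update("userid4")
print(json.dumps(body, indent=2))
```

The resulting body would be sent to the "_update" endpoint of the current user's document in the swipe-tracking index. Passing the ID through "params" rather than concatenating it into the script keeps the script text identical across calls.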

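For your case, the terms lookup will look something like this (to exclude the swiped users rather than match them, place the terms clause inside the must_not of a bool query):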

GET /possible_matches/_search
{
    "query" : {
        "terms" : {
            "user" : {
                "index" : "swipes",
                "type" : "users",
                "id" : "current-user-id",
                "path" : "swipedUserId"
            }
        }
    }
}
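
On its own, a terms lookup clause matches the users whose IDs appear in the looked-up list. To exclude them, as the question asks, the clause presumably needs to sit inside the must_not of a bool query. Here is a minimal sketch that builds such a request body in Python. The function name build_exclusion_query and its parameter names are hypothetical; the type, id, path and field values mirror the query in this answer, and lookup_index must be whichever index actually stores each user's swipes.

```python
import json


def build_exclusion_query(current_user_id, lookup_index, lookup_type="users",
                          path="swipedUserId", field="user"):
    """Build a search body that excludes users found via a terms lookup.

    Hypothetical helper: the defaults mirror the example query in this
    answer and are assumptions about the mapping.
    """
    lookup = {
        "index": lookup_index,   # index holding one swipe document per user
        "type": lookup_type,     # part of the lookup in the 5.0 docs linked above
        "id": current_user_id,   # document ID of the current user's swipe list
        "path": path,            # array field containing the swiped user IDs
    }
    # must_not turns "match these IDs" into "exclude these IDs".
    return {"query": {"bool": {"must_not": [{"terms": {field: lookup}}]}}}


# "swipes" is an assumed name for the swipe-tracking index.
query = build_exclusion_query("current-user-id", lookup_index="swipes")
print(json.dumps(query, indent=2))
```

With this shape the client never sends the 2000 IDs itself; Elasticsearch fetches them from the swipe document at query time.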

Another thing to take into account is the replication configuration of the swipes index. Since each node will perform "joins" against that index, it is highly recommended to keep a full copy of it on every node. You can achieve this by creating the index with the "auto_expand_replicas" setting set to "0-all".

PUT /swipes
{
    "settings": {
        "auto_expand_replicas": "0-all"
    }
}
