Elasticsearch query-time boosting produces results in an inadequate order

Problem description

The ES search results for the search keywords one two three seem to be wrong after applying a per-keyword boost. Please help me modify my "faulty" query so that it produces the "expected result" described below. I'm on ES 1.7.4 with Lucene 4.10.4.

Boost criteria - three is considered the most important keyword:

word - boost
----   -----
one    1
two    2
three  3

ES index content - showing only the MySQL dump to keep the post short

mysql> SELECT id, title FROM post;
+----+-------------------+
| id | title             |
+----+-------------------+
|  1 | one               |
|  2 | two               |
|  3 | three             |
|  4 | one two           |
|  5 | one three         |
|  6 | one two three     |
|  7 | two three         |
|  8 | none              |
|  9 | one abc           |
| 10 | two abc           |
| 11 | three abc         |
| 12 | one two abc       |
| 13 | one two three abc |
| 14 | two three abc     |
+----+-------------------+
14 rows in set (0.00 sec)

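Expected ES query result - the user is searching for one two three. I don't care about the order of records that share the same score; for example, I don't mind if records 6 and 13 swap places.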

+----+-------------------+
| id | title             | my scores for demonstration purposes
+----+-------------------+
|  6 | one two three     | (1+2+3 = 6)
| 13 | one two three abc | (1+2+3 = 6)
|  7 | two three         | (2+3 = 5)
| 14 | two three abc     | (2+3 = 5)
|  5 | one three         | (1+3 = 4)
|  4 | one two           | (1+2 = 3)
| 12 | one two abc       | (1+2 = 3)
|  3 | three             | (3 = 3)
| 11 | three abc         | (3 = 3)
|  2 | two               | (2 = 2)
| 10 | two abc           | (2 = 2)
|  1 | one               | (1 = 1)
|  9 | one abc           | (1 = 1)
|  8 | none              | <- This shouldn't appear
+----+-------------------+
14 rows in set (0.00 sec)
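
To make the arithmetic behind the expected table explicit, here is a minimal Python sketch that sums the boost of every keyword present in a title and ranks the posts by that sum. The names `BOOSTS`, `TITLES`, `expected_score` and `expected_ranking` are made up for illustration, and splitting on whitespace is an assumption standing in for whatever analyzer the index actually uses. Ties are broken by id here, which is arbitrary since the order of equal scores does not matter.

```python
# Boost table from the question: keyword -> weight.
BOOSTS = {"one": 1, "two": 2, "three": 3}

# Titles from the MySQL dump, keyed by post id.
TITLES = {
    1: "one", 2: "two", 3: "three", 4: "one two", 5: "one three",
    6: "one two three", 7: "two three", 8: "none", 9: "one abc",
    10: "two abc", 11: "three abc", 12: "one two abc",
    13: "one two three abc", 14: "two three abc",
}


def expected_score(title, boosts=BOOSTS):
    """Sum the weights of the boosted keywords that occur as whole words."""
    words = set(title.split())
    return sum(weight for keyword, weight in boosts.items() if keyword in words)


def expected_ranking(titles=TITLES):
    """Return (id, title, score) rows, highest score first, dropping score 0."""
    rows = [(post_id, title, expected_score(title)) for post_id, title in titles.items()]
    rows = [row for row in rows if row[2] > 0]
    # Ties are ordered by id only to make the output deterministic.
    return sorted(rows, key=lambda row: (-row[2], row[0]))


if __name__ == "__main__":
    for post_id, title, score in expected_ranking():
        print(f"{post_id:>2} | {title:<17} | {score}")
```

Because "none" is compared as a whole word, it scores 0 and is dropped, which matches the note that record 8 should not appear.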

Unexpected ES query result - unfortunately, this is what I get.

+----+-------------------+
| id | title             | _score
+----+-------------------+
|  6 | one two three     | 1.0013864
| 13 | one two three abc | 1.0013864
|  4 | one two           | 0.57794875
|  3 | three             | 0.5310148
|  7 | two three         | 0.50929534
|  5 | one three         | 0.503356
| 14 | two three abc     | 0.4074363
| 11 | three abc         | 0.36586377
| 12 | one two abc       | 0.30806428
| 10 | two abc           | 0.23231897
|  2 | two               | 0.12812772
|  1 | one               | 0.084527075
|  9 | one abc           | 0.07408653
+----+-------------------+

ES query

curl -XPOST "http://127.0.0.1:9200/_search?post_dev" -d'
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "title": {
            "query": "one two three"
          }
        }
      },
      "should": [
        {
          "match": {
            "title": {
              "query": "one",
              "boost": 1
            }
          }
        },
        {
          "match": {
            "title": {
              "query": "two",
              "boost": 2
            }
          }
        },
        {
          "match": {
            "title": {
              "query": "three",
              "boost": 3
            }
          }
        }
      ]
    }
  },
  "sort": [
    {
      "_score": {
        "order": "desc"
      }
    }
  ],
  "from": "0",
  "size": "100"
}'

More test queries:

  • This query doesn't produce any result.
  • This query doesn't order correctly, as seen here.

Recommended answer

# Index some test data
curl -XPUT "localhost:9200/test/doc/1" -d '{"title": "one"}'
curl -XPUT "localhost:9200/test/doc/2" -d '{"title": "two"}'
curl -XPUT "localhost:9200/test/doc/3" -d '{"title": "three"}'
curl -XPUT "localhost:9200/test/doc/4" -d '{"title": "one two"}'
curl -XPUT "localhost:9200/test/doc/5" -d '{"title": "one three"}'
curl -XPUT "localhost:9200/test/doc/6" -d '{"title": "one two three"}'
curl -XPUT "localhost:9200/test/doc/7" -d '{"title": "two three"}'
curl -XPUT "localhost:9200/test/doc/8" -d '{"title": "none"}'
curl -XPUT "localhost:9200/test/doc/9" -d '{"title": "one abc"}'
curl -XPUT "localhost:9200/test/doc/10" -d '{"title": "two abc"}'
curl -XPUT "localhost:9200/test/doc/11" -d '{"title": "three abc"}'
curl -XPUT "localhost:9200/test/doc/12" -d '{"title": "one two abc"}'
curl -XPUT "localhost:9200/test/doc/13" -d '{"title": "one two three abc"}'
curl -XPUT "localhost:9200/test/doc/14" -d '{"title": "two three abc"}'
# Make test data available for search
curl -XPOST "localhost:9200/test/_refresh?pretty"
# Search using function score
curl -XPOST "localhost:9200/test/doc/_search?pretty" -d'{
    "query": {
        "function_score": {
            "query": {
                "match": {
                    "title": "one two three"
                }
            },
            "functions": [
                {
                    "filter": {
                        "query": {
                            "match": {
                                "title": "one"
                            }
                        }
                    },
                    "weight": 1
                },
                {
                    "filter": {
                        "query": {
                            "match": {
                                "title": "two"
                            }
                        }
                    },
                    "weight": 2
                },
                {
                    "filter": {
                        "query": {
                            "match": {
                                "title": "three"
                            }
                        }
                    },
                    "weight": 3
                }
            ],
            "score_mode": "sum",
            "boost_mode": "replace"
        }
    },
    "sort": [
        {
            "_score": {
                "order": "desc"
            }
        }
    ],
    "from": "0",
    "size": "100"
}'
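
The query above works because `"boost_mode": "replace"` discards the relevance score of the inner `match` query, and `"score_mode": "sum"` adds up the `weight` of every function whose filter matches, so a document's `_score` becomes the sum of the weights of the keywords it contains. As a rough sketch, the same request body can be generated from a keyword-to-weight mapping in Python. The helper name `build_function_score_query` is hypothetical, and the `{"filter": {"query": ...}}` wrapper simply mirrors the ES 1.x syntax used in the answer; other Elasticsearch versions may expect a different filter shape.

```python
import json


def build_function_score_query(field, boosts):
    """Build a function_score body whose score is the sum of matched weights."""
    functions = [
        {
            # ES 1.x "query" filter wrapper, copied from the answer's structure.
            "filter": {"query": {"match": {field: keyword}}},
            "weight": weight,
        }
        for keyword, weight in boosts.items()
    ]
    return {
        "query": {
            "function_score": {
                # The inner query only selects candidates; its score is replaced.
                "query": {"match": {field: " ".join(boosts)}},
                "functions": functions,
                "score_mode": "sum",
                "boost_mode": "replace",
            }
        },
        "sort": [{"_score": {"order": "desc"}}],
        "from": 0,
        "size": 100,
    }


if __name__ == "__main__":
    body = build_function_score_query("title", {"one": 1, "two": 2, "three": 3})
    print(json.dumps(body, indent=4))
```

The printed JSON can be passed to curl with `-d` in place of the hand-written body, which keeps the keyword list and the weights in one place.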
