Elasticsearch sort preprocessing

Question

I have an index in ES that has, in addition to other fields, revenue_amount and revenue_currency fields. The revenue is stored in different currencies. At run time, all currencies are converted to USD and rendered.

Now, I would like to support sorting on the revenue_amount field. The problem is ES sorts results in terms of revenue prior to converting to USD, and so a revenue returned at the top might not be the highest revenue after converting to USD.
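To make the problem concrete, here is a small sketch in plain Python (no Elasticsearch involved) that compares sorting by the raw amount with sorting by the converted amount. The documents and exchange rates are made up for illustration and are not taken from the question.

```python
# Hypothetical rates and documents, for illustration only.
RATES_TO_USD = {"USD": 1.0, "EUR": 1.1, "JPY": 0.007}

docs = [
    {"id": "a", "revenue_amount": 5000.0, "revenue_currency": "JPY"},
    {"id": "b", "revenue_amount": 100.0, "revenue_currency": "EUR"},
    {"id": "c", "revenue_amount": 90.0, "revenue_currency": "USD"},
]


def to_usd(doc):
    """Convert a document's revenue to USD using the lookup table."""
    return doc["revenue_amount"] * RATES_TO_USD[doc["revenue_currency"]]


# Sorting on the stored amount ignores the currency entirely.
by_raw = [d["id"] for d in sorted(docs, key=lambda d: d["revenue_amount"], reverse=True)]

# Sorting on the converted amount gives the order the user expects.
by_usd = [d["id"] for d in sorted(docs, key=to_usd, reverse=True)]

print(by_raw)
print(by_usd)
```

Document "a" ranks first on the raw amount but last once every amount is expressed in USD, which is the mismatch described above.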

I was wondering whether it is possible for ES to call a user-defined function that changes a field value before sorting, and then apply the sort afterwards. Something like this:

revenue_converted = convertToUSD(revenue)

And so the sorting will be applied to revenue_converted, rather than revenue.

I know I can convert the currencies at index time, but that will require refreshing the index every time the rates are updated, and so I would like to avoid it, if possible.

Recommended answer

You have two ways of achieving this: one is by using script-based sorting as keety mentioned:

{
    "query" : {
        ....                                    <--- your query goes here
    },
    "sort" : {
        "_script" : {
            "script" : "doc.revenue_amount.value * usd_conversion_rate",
            "type" : "number",
            "params" : {
                "usd_conversion_rate" : 0.4273  <--- the conversion rate to USD
            },
            "order" : "desc"
        }
    }
}

The usd_conversion_rate factor is the conversion rate to USD. For instance, if 1 USD is worth 2.34 units of another currency, the factor would be 1 / 2.34 (approximately 0.4273). Multiplying revenue_amount by this factor gives the amount in USD, the reference currency.
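As a rough sketch of how a client might assemble that request, the Python below computes the factor from the "units per USD" figure and builds the same sort body as a dict. The helper names `usd_factor` and `script_sort_body` are hypothetical, and the `match_all` query is only a placeholder for your own query.

```python
def usd_factor(units_per_usd):
    """Factor that converts an amount in a foreign currency to USD."""
    return 1.0 / units_per_usd


def script_sort_body(query, units_per_usd):
    """Build a search body that sorts by revenue_amount converted to USD."""
    return {
        "query": query,
        "sort": {
            "_script": {
                # Same script as in the request above.
                "script": "doc.revenue_amount.value * usd_conversion_rate",
                "type": "number",
                "params": {"usd_conversion_rate": usd_factor(units_per_usd)},
                "order": "desc",
            }
        },
    }


# Placeholder query; 2.34 units of the other currency per 1 USD.
body = script_sort_body({"match_all": {}}, 2.34)
print(body["sort"]["_script"]["params"])
```

Passing the rate through params rather than hard-coding it in the script text keeps the script string identical between requests when the rate changes.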

Script-based sorting is not very performant, though, and the recommendation is to use function_score so that results can be sorted by score instead. That leads to the second way of achieving what you need. One option is a script_score function, but that involves scripting again:

{
  "query": {
    "function_score": {
      "query": {},
      "functions": [
        {
          "script_score": {
            "script": "doc.revenue_amount.value * usd_conversion_rate",
            "params": {
              "usd_conversion_rate": 0.4273
            }
          }
        }
      ],
      "boost_mode": "replace"
    }
  }
}

Since our above script was very simple (i.e. multiply a field by some factor), the simplest way would involve using field_value_factor and it goes like this:

{
  "query": {
    "function_score": {
      "query": {
        ...                              <--- your query goes here
      },
      "functions": [
        {
          "field_value_factor": {
            "field": "revenue_amount",
            "factor": 0.4273             <--- insert the conversion rate here
          }
        }
      ],
      "boost_mode": "replace"
    }
  }
}

Update

According to your latest comment, it seems that the right option for you is to use script_score after all. The idea here would be to input all your currency rates available in your lookup table as parameters of your script_score script and then use the proper one according to the value of the revenue_currency field.

{
  "query": {
    "function_score": {
      "query": {},
      "functions": [
        {
          "script_score": {
            "script": "doc.revenue_amount.value * (doc.revenue_currency.value == 'EUR' ? EUR : (doc.revenue_currency.value == 'AUD' ? AUD : 1))",
            "params": {
              "EUR": 0.4945,
              "AUD": 0.5623
            }
          }
        }
      ],
      "boost_mode": "replace"
    }
  }
}
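With more than a couple of currencies, writing the nested ternary by hand gets tedious. The sketch below shows one possible way to generate the script string and its params from a rates lookup table on the client side. The function name `build_rate_script` is hypothetical, and the code assumes the currency codes are trusted alphabetic identifiers, since they are spliced into the script text.

```python
def build_rate_script(rates):
    """Return (script, params) that pick a rate by revenue_currency.

    Currencies missing from `rates` fall back to a factor of 1,
    mirroring the query above.
    """
    expr = "1"
    # Build the nested ternary from the innermost branch outwards, so the
    # first currency in `rates` ends up as the outermost test.
    for code in reversed(list(rates)):
        if not code.isalpha():
            raise ValueError("unexpected currency code: %r" % (code,))
        expr = "(doc.revenue_currency.value == '%s' ? %s : %s)" % (code, code, expr)
    return "doc.revenue_amount.value * " + expr, dict(rates)


script, params = build_rate_script({"EUR": 0.4945, "AUD": 0.5623})
print(script)
print(params)
```

The returned script and params can be dropped into the script_score block, so refreshing the rates only changes the request and does not require reindexing.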
