Search for array values except for a list of positions

Question

I have tens of millions of documents like the following.

{
    id: "<some unit test id>",
    groupName: "<some group name>",
    result: [
        1, 0, 1, 1, ... 1
    ]
}

The result field is an array of 200 numbers, each either 0 or 1.

My job is: given a groupName, say "group17", and a few positions, say 3, 8 and 27, find all the documents in that group whose result array elements are all equal to 1, disregarding the values at positions 3, 8 and 27.

I would appreciate it if someone could point out whether there is a fast way to search for this.

Recommended answer

One way to achieve what you want is to add another field that contains the equivalent integer value of the bitset contained in the result array and then use a bitwise AND operation.

For example, suppose the result array is

result: [1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0]

The integer value represented by those bits is 1470, so I store the following document:

PUT test/doc/1
{
    "groupName": "group12",
    "result": [
        1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0
    ],
    "resultLong": "1470"
}
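
The resultLong value has to be computed before the document is indexed. A minimal sketch of that encoding in plain Java, assuming the first array element is the most significant bit (which is what makes the array above come out as 1470). The class and method names are hypothetical and not part of any Elasticsearch API:

```java
import java.math.BigInteger;

// Hypothetical helper (not an Elasticsearch API): turns a 0/1 array into the
// decimal string that is stored in the resultLong field.
class ResultEncoder {
    // Assumption: result[0] is the most significant bit.
    static String encode(int[] result) {
        StringBuilder bits = new StringBuilder(result.length);
        for (int r : result) {
            if (r != 0 && r != 1) {
                throw new IllegalArgumentException("result must contain only 0 or 1");
            }
            bits.append(r);
        }
        // Parse the bit string in base 2, then render it in base 10.
        return new BigInteger(bits.toString(), 2).toString();
    }

    public static void main(String[] args) {
        int[] result = {1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0};
        System.out.println(encode(result)); // prints 1470
    }
}
```

Leading zeros in the array are dropped by the conversion, which is harmless here because the comparison is done on the numeric value.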

Now, the query would look like this

POST test/_search 
{
  "query": {
    "script": {
      "script": {
        "source": """
        // 1. create a BigInt out of the resultLong value we just computed
        def value = new BigInteger(doc['resultLong'].value.toString());

        // 2. create a bitset filled with 1's except for those positions specified in the ignore parameters array
        def comp = IntStream.range(1, 12).mapToObj(i -> params.ignore.contains(i - 1) ? "0" : "1").collect(Collectors.joining());

        // 3. create a BigInt out of the number we've just created
        def compare = new BigInteger(comp, 2);

        // 4. compare both using a bitwise AND operation
        return value.and(compare).equals(compare);
        """,
        "params": {
          "ignore": [1, 4, 10]
        }
      }
    }
  }
}

Step 2 builds a string of length 11 that contains a 0 at every position listed in the params.ignore array and a 1 everywhere else. With the ignore list above, we end up with the string "10110111110".

Step 3 then creates a BigInteger out of that string (in base 2).

Step 4 combines both numbers with a bitwise AND and checks that the outcome equals the mask (compare), i.e. the document is only returned if it has a 1 at every position where the mask has a 1. The ignored positions, where the mask has a 0, can hold either value.
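
Steps 2 to 4 can be tried outside Elasticsearch. The following is a sketch in plain Java that mirrors the script's logic, assuming 0-based positions in the ignore list as in the query above. The class and method names are hypothetical:

```java
import java.math.BigInteger;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical stand-alone version of the script's steps 2 to 4.
class MaskCheck {
    // Steps 2 and 3: a mask with 0 at every ignored (0-based) position
    // and 1 everywhere else, parsed as a base-2 number.
    static BigInteger mask(int length, List<Integer> ignore) {
        String bits = IntStream.range(0, length)
                .mapToObj(i -> ignore.contains(i) ? "0" : "1")
                .collect(Collectors.joining());
        return new BigInteger(bits, 2);
    }

    // Step 4: the value matches if it has a 1 wherever the mask has a 1.
    static boolean matches(String resultLong, int length, List<Integer> ignore) {
        BigInteger value = new BigInteger(resultLong);
        BigInteger compare = mask(length, ignore);
        return value.and(compare).equals(compare);
    }

    public static void main(String[] args) {
        List<Integer> ignore = Arrays.asList(1, 4, 10);
        System.out.println(mask(11, ignore));            // prints 1470
        System.out.println(matches("1470", 11, ignore)); // prints true
        System.out.println(matches("2047", 11, ignore)); // prints true (all bits set)
        System.out.println(matches("1468", 11, ignore)); // prints false (position 9 is 0)
    }
}
```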

Note: for arrays of length 200, you need to use IntStream.range(1, 201) instead.
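
With 200 positions the encoded value can need up to 200 bits, far more than a signed 64-bit long can hold. That is presumably why the example stores resultLong as a string and parses it with BigInteger, despite the field's name. A small check of that size claim, using a hypothetical helper:

```java
import java.math.BigInteger;

// Hypothetical helper that illustrates the size of a 200-bit result value.
class WidthCheck {
    // Returns the number whose binary form is `length` consecutive 1 bits.
    static BigInteger allOnes(int length) {
        return BigInteger.ONE.shiftLeft(length).subtract(BigInteger.ONE);
    }

    public static void main(String[] args) {
        BigInteger full = allOnes(200);
        System.out.println(full.bitLength()); // prints 200
        // The value is larger than the biggest long, so it cannot be stored as one.
        System.out.println(full.compareTo(BigInteger.valueOf(Long.MAX_VALUE)) > 0); // prints true
    }
}
```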
