使用Collections.binarySearch()进行谓词搜索(即不完全匹配) [英] Using Collections.binarySearch() for predicate search (i.e., not complete match)

查看:63
本文介绍了使用Collections.binarySearch()进行谓词搜索(即不完全匹配)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个按升序排列的时间戳列表:

I have a list of timestamps sorted in ascending order:

List<Instant> timestamps = ...; // note: sorted in ascending order

现在,给定输入时间戳记Instant inputTs,我想在timestamps中找到满足t.isBefore(inputTs) && inputTs.isBefore(t.plusMillis(SOME_CONSTANT))的条目t,即,我只是在寻找t以便inputTs位于从t开始的某些固定长度持续时间的范围.我承认理论上可以有多个这样的t,因此允许搜索在这些之间任意选择.

Now, given an input timestamp Instant inputTs, I want to find an entry t in timestamps that satisfies t.isBefore(inputTs) && inputTs.isBefore(t.plusMillis(SOME_CONSTANT)), i.e., I am simply looking for a t such that inputTs lies within the bounds of some fixed-length duration starting at t. I acknowledge that there can theoretically be multiple such ts, so the search is allowed to arbitrarily choose between these.

Collections.binarySearch(...)重载期望有一个键,指示通用/预期用例将搜索完全匹配"/相同的条目(缺少更好的用词,很抱歉).但是,在我的情况下,inputTstimestamps中出现的条目不同,因为预计inputTs是在timestamps中某些条目t之后不久的一个时间点.

The Collections.binarySearch(...) overloads expect a key, indicating that the common/intended usecase is to search for a "complete match"/identical entry (in lack of better words, sorry). However, in my case inputTs will be different from the entries present in timestamps as inputTs is expected to be a point in time shortly after some entry t in timestamps.

我的想法是简单地使谓词成立时,我提供给Collections.binarySearch(...)Comparator<Instant>返回0:

My idea is to simply make the Comparator<Instant> that I provide to Collections.binarySearch(...) return 0 when the predicate holds:

public class TimestampFinder {
    private static final long SOME_CONSTANT = 10_000;
    private List<Instant> timestamps = ... ; // initialize and sort

    public Instant find(Instant inputTs) {
        int index = Collections.binarySearch(timestamps, inputTs, (t, key) -> {
            if (t.isBefore(key) && key.isBefore(t.plusMillis(SOME_CONSTANT))) {
                // inputTs is part of the duration after t
                // return 0 to indicate that we've found a match
                return 0;
            }
            // inputTs not in interval
            // use Instant's implementation of Comparator to indicate to binarySearch if it should continue the search in the upper or lower half of the list
            return t.compareTo(key);
        });
        return index >= 0 ? timestamps.get(index) : null;
    }
} 

这是解决此问题的适当(有效)方法,还是我忽略了更好的替代方法?请注意,对find(Instant)的调用数量将大大超过timestamps中的元素数量,这就是为什么我认为对timestamps进行排序会产生开销的原因.

Is this a proper (efficient) way to solve this problem, or is there a better alternative that I've overlooked? Note that the number of calls to find(Instant) will vastly outnumber the number of elements in timestamps, which is why I consider the overhead incurred by sorting timestamps to be warranted.

推荐答案

Collections.binarySearch没有必须用于精确匹配.根据文档中的指定,如果找不到完全匹配的内容,它将返回-1 - i,其中i是列表中下一个较高元素的索引.

Collections.binarySearch doesn't have to be used for exact matches. As specified in the documentation, if an exact match isn't found, it'll return -1 - i where i is the index of the next-higher element in the list.

只需使用自然顺序搜索inputTs.如果找不到,则可以从inputTs导出下一个更高的Instant的索引(只需执行-1 - resultOfBinarySearch).如果该索引处的Instantbefore(inputTs.plusMillis(CONSTANT)),则说明您已完成,否则,不存在这样的Instant.

Just do a search for inputTs with the natural ordering. If it isn't found, then you can derive the index of the next-higher Instant from inputTs (just do -1 - resultOfBinarySearch). If the Instant at that index is before(inputTs.plusMillis(CONSTANT)), then you're done, otherwise, no such Instant exists.

愿意认为您现有的解决方案滥用了有关价值的binarySearch.

I do think your existing solution abuses binarySearch in concerning ways, for what it's worth.

这篇关于使用Collections.binarySearch()进行谓词搜索(即不完全匹配)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆