在范围列表中搜索数字的最快方法 [英] Fastest way to search a number in a list of ranges

查看:40
本文介绍了在范围列表中搜索数字的最快方法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有以下代码可以在范围列表中查找数字的匹配项.

I have the following code to find a match for a number in a list of ranges.

public class RangeGroup
{
    public uint RangeGroupId { get; set; }
    public uint Low { get; set; }
    public uint High { get; set; }
    // More properties related with the range here
}

public class RangeGroupFinder
{
    private static readonly List<RangeGroup> RangeGroups=new List<RangeGroup>();

    static RangeGroupFinder()
    {
        // Populating the list items here
        RangeGroups.Add(new RangeGroup { RangeGroupId = 0, Low = 1023238144, High = 1023246335 });
        RangeGroups.Add(new RangeGroup { RangeGroupId = 0, Low = 1023246336, High = 1023279103 });
        RangeGroups.Add(new RangeGroup { RangeGroupId = 0, Low = 1023279104, High = 1023311871 });
        RangeGroups.Add(new RangeGroup { RangeGroupId = 0, Low = 1023311872, High = 1023328255 });
        RangeGroups.Add(new RangeGroup { RangeGroupId = 0, Low = 1023328256, High = 1023344639 });
        RangeGroups.Add(new RangeGroup { RangeGroupId = 0, Low = 1023344640, High = 1023410175 });
        RangeGroups.Add(new RangeGroup { RangeGroupId = 0, Low = 1023410176, High = 1023672319 });
        RangeGroups.Add(new RangeGroup { RangeGroupId = 0, Low = 1023672320, High = 1023688703 });
        RangeGroups.Add(new RangeGroup { RangeGroupId = 0, Low = 1023692800, High = 1023696895 });
       // There are many more and the groups are not sequential as it can seen on last 2 groups
    }

    public static RangeGroup Find(uint number)
    {
        return RangeGroups.FirstOrDefault(rg => number >= rg.Low && number <= rg.High);
    }
}

RangeGroup 的列表包含大约 5000000 个项目,Find() 方法将被大量使用,所以我正在寻找一种更快的方法来进行搜索.以任何方式更改数据结构或拆分数据都没有问题.

The list of the RangeGroup consists about 5000000 items and the Find() method will be used a lot, so I'm looking for a faster way to make the search. It's no problem to change the structure of the data or split it in any way.

所有范围都是唯一的,并按低的顺序添加,并且它们不重叠.

All ranges are unique and added by in order of Low and they don't overlap.

结果:

使用 ikh 的代码,结果比我的代码快大约 7000 倍.可以在此处查看测试代码和结果.

Did a test using ikh's code and the result is approximately 7000 times faster than my code. The test code and results can be seen here.

推荐答案

既然您指出 RangeGroup 是按照 RangeGroup.Low 的顺序添加的,而它们不是重叠,你不需要做任何进一步的预处理.您可以对 RangeGroups 列表进行二分搜索以找到范围(警告:未完全测试,您需要检查一些边缘条件):

Since you indicated that RangeGroups are added in order of RangeGroup.Low and that they do not overlap, you don't need to do any further pre-processing. You can do binary search on the RangeGroups list to find the range (warning: not fully tested, you'd need to check some edge conditions):

public static RangeGroup Find(uint number) {
    int position = RangeGroups.Count / 2;
    int stepSize = position / 2;

    while (true) {
        if (stepSize == 0) {
            // Couldn't find it.
            return null;
        }

        if (RangeGroups[position].High < number) {
            // Search down.
            position -= stepSize;

        } else if (RangeGroups[position].Low > number) {
            // Search up.
            position += stepSize;

        } else {
            // Found it!
            return RangeGroups[position];
        }

        stepSize /= 2;
    }
}

最坏情况下的运行时间应该在 O(log(N)) 左右,其中 N 是 RangeGroup 的数量.

The worst-case run time should be around O(log(N)), where N is the number of RangeGroups.

这篇关于在范围列表中搜索数字的最快方法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆