Datastructures, C#: ~O(1) lookup with range keys?


Problem description


I have a dataset. This dataset will serve as a lookup table. Given a number, I should be able to look up the corresponding value for that number.

The dataset (let's say it's CSV) has a few caveats though. Instead of:

1,ABC
2,XYZ
3,LMN

The numbers are ranges (- being "through", not minus):

1-3,ABC     // 1, 2, and 3 = ABC
4-8,XYZ     // 4, 5, 6, 7, 8 = XYZ
11-11,LMN   // 11 = LMN

All the numbers are signed ints. No range overlaps another range. There are some gaps: some numbers aren't covered by any range in the dataset (like 9 and 10 in the last snippet above).

How might I model this dataset in C# so that I have the most-performant lookup while keeping my in-memory footprint low?

The only option I've come up with suffers from overconsumption of memory. Let's say my dataset is:

1-2,ABC
4-6,XYZ

Then I create a Dictionary<int,string>() whose key/values are:

1/ABC
2/ABC
4/XYZ
5/XYZ
6/XYZ

Now I have hash-table lookup performance, but tons of wasted space in the hash table.

Any ideas? Maybe just use PLINQ instead and hope for good performance? ;)

Solution

You can create a doubly-indirected lookup:

Dictionary<int, int> keys;
Dictionary<int, string> values;

Then store the data like this:

keys.Add(1, 1);
keys.Add(2, 1);
keys.Add(3, 1);
//...
keys.Add(11, 3);

values.Add(1, "ABC");
//...
values.Add(3, "LMN");

And then look the data up:

return values[keys[3]];  //returns "ABC"

I'm not sure how much memory this will save with trivial strings, but once you get beyond "ABC" it should help.
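A minimal sketch of the doubly-indirected lookup, populated from ranges instead of hand-written Add calls. The sample ranges, the 1-based range ids, and the TryLookup helper are assumptions made for illustration; they are not part of the original answer.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sample data: (first, last, value) for each inclusive range.
var ranges = new (int First, int Last, string Value)[]
{
    (1, 3, "ABC"),
    (4, 8, "XYZ"),
    (11, 11, "LMN"),
};

var keys = new Dictionary<int, int>();       // number -> range id
var values = new Dictionary<int, string>();  // range id -> value

for (int index = 0; index < ranges.Length; index++)
{
    var (first, last, value) = ranges[index];
    int rangeId = index + 1;                 // 1-based ids, mirroring the snippet above
    values.Add(rangeId, value);

    // A long counter avoids overflow if a range ends at int.MaxValue.
    for (long n = first; n <= last; n++)
    {
        keys.Add((int)n, rangeId);
    }
}

// Hypothetical helper: returns false for numbers that fall in a gap.
bool TryLookup(int number, out string value)
{
    if (keys.TryGetValue(number, out var rangeId))
    {
        value = values[rangeId];
        return true;
    }
    value = "";
    return false;
}

Console.WriteLine(TryLookup(3, out var found) ? found : "(none)");   // ABC
Console.WriteLine(TryLookup(9, out var missing) ? missing : "(none)"); // (none)
```

Using TryGetValue rather than the indexer matters here because the dataset has gaps; values[keys[9]] would throw a KeyNotFoundException.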

EDIT

After Dan Tao's comment below, I went back and checked on what he was asking about. The following code:

var abc = "ABC";
var def = "ABC";
Console.WriteLine(ReferenceEquals(abc, def));

will write "True" to the console. This means that either the compiler or the runtime (clarification?) is maintaining a single reference to "ABC" and assigning it as the value of both variables.

After reading up some more on Interned strings, if you're using string literals to populate the dictionary, or Interning computed strings, it will in fact take more space to implement my suggestion than the original dictionary would have taken. If you're not using Interned strings, then my solution should take less space.
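The difference between literals and computed strings can be checked directly. The sketch below is only an illustration with made-up variable names: it compares two identical literals, a string built at run time, and the result of string.Intern on that computed string.

```csharp
using System;

string literalA = "ABC";
string literalB = "ABC";

// Built from a char array at run time, so it is a separate object
// even though its contents equal the literal.
string computed = new string(new[] { 'A', 'B', 'C' });

// Identical literals in the same assembly refer to one string object.
bool literalsShared = object.ReferenceEquals(literalA, literalB);

// Equal contents, but a different instance.
bool computedShared = object.ReferenceEquals(literalA, computed);

// string.Intern returns the intern pool's instance for these contents;
// under the default runtime behavior that is the same object as the literal.
string pooled = string.Intern(computed);
bool pooledShared = object.ReferenceEquals(literalA, pooled);

Console.WriteLine($"literals: {literalsShared}, computed: {computedShared}, interned: {pooledShared}");
```

Strings parsed from a CSV file are computed strings, so they are not shared automatically; they share storage only if the loading code reuses one reference per range or interns them.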

FINAL EDIT

If you're treating your strings correctly, there should be no excess memory usage from the original Dictionary<int, string>, because you can assign each string to a variable and then store that same reference as the value for every key in its range (or, if you need to, because you can Intern them).

Just make sure your assignment code includes an intermediate variable assignment:

while (thereAreStringsLeftToAssign)
{
    var theString = theStringToAssign;
    foreach (var i in range)
    {
        strings.Add(i, theString);
    }
}
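A runnable version of that loop might look like the following sketch. It assumes input lines in the question's "first-last,value" format; the sample lines and the parsing details are assumptions, not something the original answer specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical input in the question's "first-last,value" format.
string[] lines = { "1-3,ABC", "4-8,XYZ", "11-11,LMN" };

var strings = new Dictionary<int, string>();

foreach (var line in lines)
{
    var parts = line.Split(',');

    // Search for the separator from index 1 so a leading minus sign
    // on the first number is not mistaken for it.
    int dash = parts[0].IndexOf('-', 1);
    int first = int.Parse(parts[0].Substring(0, dash), CultureInfo.InvariantCulture);
    int last = int.Parse(parts[0].Substring(dash + 1), CultureInfo.InvariantCulture);

    // One string instance per range; every key below stores this same reference.
    var theString = parts[1];

    // A long counter avoids overflow if a range ends at int.MaxValue.
    for (long i = first; i <= last; i++)
    {
        strings.Add((int)i, theString);
    }
}

Console.WriteLine(object.ReferenceEquals(strings[4], strings[8]));  // True
Console.WriteLine(strings.ContainsKey(9));                          // False
```

This only avoids duplicating the string data. The dictionary still holds one entry per number in every range, so the per-key overhead the question describes remains.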
