What is the fastest way to generate a unique set in .NET 2?

Question

I have what is essentially a jagged array of name/value pairs, and I need to generate a set of unique name/value pairs from it. The jagged array is approximately 86,000 x 11 values. It does not matter to me how I store a name/value pair (a single string "name=value", or a specialised class such as KeyValuePair).

Additional info: there are 40 distinct names and a larger number of distinct values, probably in the region of 10,000 values.

I am using C# and .NET 2.0 (and the performance is so poor that I am thinking it may be better to push my entire jagged array into a SQL database and do a SELECT DISTINCT from there).

Below is the code I'm currently using:

List<List<KeyValuePair<string,string>>> vehicleList = retriever.GetVehicles();
this.statsLabel.Text = "Unique Vehicles: " + vehicleList.Count;

Dictionary<KeyValuePair<string, string>, int> uniqueProperties = new Dictionary<KeyValuePair<string, string>, int>();
foreach (List<KeyValuePair<string, string>> vehicle in vehicleList)
{
    foreach (KeyValuePair<string, string> property in vehicle)
    {
        if (!uniqueProperties.ContainsKey(property))
        {
            uniqueProperties.Add(property, 0);
        }
    }
}
this.statsLabel.Text += "\rUnique Properties: " + uniqueProperties.Count;


Recommended answer

I have it running in 0.34 seconds, down from 9+ minutes.

The problem lies in comparing the KeyValuePair structs. I worked around it by writing a comparer object and passing an instance of it to the Dictionary.

From what I can determine, KeyValuePair.GetHashCode() returns the hash code of its Key object (in this example, the least unique part of the pair).

As the dictionary adds (and checks for the existence of) each item, it uses both the Equals and GetHashCode functions, but it has to fall back on Equals whenever many items share the same hash code.

By providing a more unique GetHashCode function, it exercises the Equals function far less often. I also optimised the Equals function to compare the more unique Values before the less unique Keys.
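The effect described above can be made visible with a small experiment. The following is only a sketch, not the answerer's measurement code: the class names, the "colour" property name, and the `EqualsCalls` counter are all hypothetical. It inserts pairs that share one name into a Dictionary and counts how often the comparer's Equals is called when the hash covers only the name versus the name and value together.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration: counts how often Equals is called.
abstract class CountingPairComparer : IEqualityComparer<KeyValuePair<string, string>>
{
    public long EqualsCalls;

    public bool Equals(KeyValuePair<string, string> x, KeyValuePair<string, string> y)
    {
        EqualsCalls++;
        return x.Value == y.Value && x.Key == y.Key;
    }

    public abstract int GetHashCode(KeyValuePair<string, string> obj);
}

// Hashes on the name only, so every pair sharing a name gets the same hash code.
class KeyOnlyHashComparer : CountingPairComparer
{
    public override int GetHashCode(KeyValuePair<string, string> obj)
    {
        return obj.Key.GetHashCode();
    }
}

// Hashes on the name and value together.
class KeyAndValueHashComparer : CountingPairComparer
{
    public override int GetHashCode(KeyValuePair<string, string> obj)
    {
        return (obj.Key + obj.Value).GetHashCode();
    }
}

static class HashCollisionDemo
{
    // Inserts `count` pairs that share one name but have distinct values,
    // then returns how many times the comparer's Equals was invoked.
    public static long CountEqualsCalls(CountingPairComparer comparer, int count)
    {
        Dictionary<KeyValuePair<string, string>, bool> set =
            new Dictionary<KeyValuePair<string, string>, bool>(comparer);
        for (int i = 0; i < count; i++)
        {
            set[new KeyValuePair<string, string>("colour", "value" + i)] = true;
        }
        return comparer.EqualsCalls;
    }
}
```

Calling `HashCollisionDemo.CountEqualsCalls` with a `KeyOnlyHashComparer` should report far more Equals calls than with a `KeyAndValueHashComparer`, because every new pair has to be compared against all existing pairs that share its hash code.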

86,000 * 11 items with 10,000 unique properties runs in 0.34 seconds using the comparer object below (without the comparer object it takes 9 minutes 22 seconds).

Hope this helps :)

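    using System.Collections.Generic;

    // Compares name/value pairs on both fields and hashes on both,
    // so that pairs sharing a name no longer collide in the Dictionary.
    class StringPairComparer
        : IEqualityComparer<KeyValuePair<string, string>>
    {
        public bool Equals(KeyValuePair<string, string> x, KeyValuePair<string, string> y)
        {
            // Compare the more unique Value first so mismatches are rejected early.
            return x.Value == y.Value && x.Key == y.Key;
        }
        public int GetHashCode(KeyValuePair<string, string> obj)
        {
            // Hash the name and value together, not the name alone.
            return (obj.Key + obj.Value).GetHashCode();
        }
    }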
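For completeness, here is one way the comparer might be plugged into the loop from the question. This is a sketch under assumptions: `UniquePropertyCounter` and `Collect` are hypothetical names, and the comparer is repeated only so that the block compiles on its own. The one real change is passing a comparer instance to the Dictionary constructor. The code avoids `var` and LINQ so that it stays within C# 2.0.

```csharp
using System;
using System.Collections.Generic;

// Same comparer as in the answer, repeated so this sketch compiles on its own.
class StringPairComparer : IEqualityComparer<KeyValuePair<string, string>>
{
    public bool Equals(KeyValuePair<string, string> x, KeyValuePair<string, string> y)
    {
        return x.Value == y.Value && x.Key == y.Key;
    }

    public int GetHashCode(KeyValuePair<string, string> obj)
    {
        return (obj.Key + obj.Value).GetHashCode();
    }
}

static class UniquePropertyCounter
{
    // Hypothetical helper: collects the distinct name/value pairs across all vehicles.
    public static Dictionary<KeyValuePair<string, string>, int> Collect(
        List<List<KeyValuePair<string, string>>> vehicleList)
    {
        // Passing the comparer here is what replaces the default struct hashing.
        Dictionary<KeyValuePair<string, string>, int> uniqueProperties =
            new Dictionary<KeyValuePair<string, string>, int>(new StringPairComparer());

        foreach (List<KeyValuePair<string, string>> vehicle in vehicleList)
        {
            foreach (KeyValuePair<string, string> property in vehicle)
            {
                // The indexer adds or overwrites with a single lookup,
                // so no separate ContainsKey check is needed.
                uniqueProperties[property] = 0;
            }
        }
        return uniqueProperties;
    }
}
```

With the question's data, `UniquePropertyCounter.Collect(vehicleList).Count` would give the number of unique properties.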

EDIT: If it was just one string (instead of a KeyValuePair, where string = Name+Value) it would be approximately twice as fast. It's a nice, interesting problem, and I have spent faaaaaar too much time on it (I learned quite a bit though).
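The single-string variant mentioned in the EDIT could look roughly like the following. It is a sketch, not the answerer's code: `UniquePropertyStrings` and `Collect` are hypothetical names, and the "=" separator follows the "name=value" form from the question, on the assumption that names never contain "=". A Dictionary with a throwaway bool value stands in for a set, because HashSet<T> was only added in .NET 3.5.

```csharp
using System;
using System.Collections.Generic;

static class UniquePropertyStrings
{
    // Hypothetical helper: builds the set of distinct "name=value" strings.
    public static Dictionary<string, bool> Collect(
        List<List<KeyValuePair<string, string>>> vehicleList)
    {
        // String keys use the built-in string hashing and equality,
        // so no custom comparer is required.
        Dictionary<string, bool> unique = new Dictionary<string, bool>();

        foreach (List<KeyValuePair<string, string>> vehicle in vehicleList)
        {
            foreach (KeyValuePair<string, string> property in vehicle)
            {
                // Assumption: names do not contain '=', so the joined key is unambiguous.
                unique[property.Key + "=" + property.Value] = true;
            }
        }
        return unique;
    }
}
```

The keys of the returned dictionary are the unique properties; splitting a key on the first "=" recovers the name and value if they are needed separately.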
