不可变字典VS字典VS C5 VS F# - 性能 [英] Immutable Dictionary Vs Dictionary Vs C5 Vs F# - performance

查看:164
本文介绍了不可变字典VS字典VS C5 VS F# - 性能的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我们的应用程序使用大量的具有多层次的查找不经常改变的字典。我们正在调查,在把一些关键的code表示不使用字典使用备用的结构进行了大量的查找 - 更快的查找,光对内存/ GC。这让我们比较一下各种字典/可用库 - 词典(System.Collections.Generics.Dictionary - SCGD),ImmutableDictionary,C5.HashDictionary,FSharpMap。运行下面的程序与各个项目数 - 100,1000,10000,100000 - 表明,字典仍然是赢家最多范围。第一行表示收集的物品。 MS /蜱会抽出时间进行随机x查找(code是如下图)。

产品 - 100
SCGD - 0 MS - 50蜱
C5 - 1 MS - 1767蜱
入境事务处 - 4 MS - 5951蜱
FS - 0 MS - 240蜱

产品 - 1000
SCGD - 0 MS - 230蜱
C5 - 0 MS - 496蜱
入境事务处 - 0 MS - 1046蜱
FS - 1 MS - 1870年蜱

产品 - 10000
SCGD - 3 MS - 4722蜱
C5 - 4 MS - 6370蜱
入境事务处 - 9 MS - 13119蜱
FS - 15 MS - 22050蜱

产品 - 100000
SCGD - 62 MS - 89276蜱
C5 - 72 MS - 103658蜱
入境事务处 - 172 MS - 246247蜱
FS - 249 MS - 356176蜱

这是否意味着,我们已经用最快的,并且没有改变?我有presumed是不可变的结构应在表的顶部,但它不是这样的。难道我们做错了比较,还是我失去了一些东西?抱着对这个问题,只是觉得,它能够更好地问。任何链接或票据,或任何引用抛出一些光会很大。

完成code测试 -

命名空间CollectionsTest {     使用系统;     使用System.Collections.Generic;     使用System.Collections.Immutable;     使用System.Diagnostics程序;     使用System.Linq的;     使用System.Text;     使用System.Runtime;     使用Microsoft.FSharp.Collections;     ///<总结>     ///     ///< /总结>     类节目     {         静态程序()         {             ProfileOptimization.SetProfileRoot(@\日新。);             ProfileOptimization.StartProfile(Startup.Profile);         }         ///<总结>         ///主电源的使用ARGS。         ///< /总结>         ///< PARAM NAME =的args>在ARGS< /参数>         静态无效的主要(字串[] args)         {             // INIT测试数据--------------------------------------------- -------------------------------------------------- -             的foreach(INT MAXITEMS在新的INT [] {100,1000,10000,100000})             {                 Console.WriteLine(\ N# - {0},MAXITEMS);                 名单<字符串> accessIndex =新的名单,其中,串>(MAXITEMS);                 名单< KeyValuePair<字符串,对象>> listofkvps =新的名单,其中,KeyValuePair<字符串,对象>>();                 名单<元组LT;字符串,对象>> listoftuples =新名单,其中元组LT;字符串,对象>>();                 的for(int i = 0; I< MAXITEMS;我++)                 {                     listoftuples.Add(新行<字符串,对象>(i.ToString(),I));                     listofkvps.Add(新KeyValuePair<字符串,对象>(i.ToString(),I));                     accessIndex.Add(i.ToString());                 }                 //随机化进行查找                 随机R =新的随机(Environment.TickCount);                 名单<字符串> randomIndexesList =新的名单,其中,串>(MAXITEMS);                 而(accessIndex.Count大于0)                 {                     INT指数= r.Next(accessIndex.Count);                     字符串值= accessIndex [指数]                     accessIndex.RemoveAt(指数);                     randomIndexesList.Add(值);                 }                 //转换为阵列的最佳PERF                 字符串[] randomIndexes = randomIndexesList.ToArray();                 // 加载 - - - - - - - - - - - - - - - - - - - - - - - - -------------------------------------------------                 // IMMU                 ImmutableDictionary<字符串,对象> IDictionary的= ImmutableDictionary.Create<字符串,对象>(listofkvps);                 //Console.WriteLine(idictionary.Count);                 // SCGD                 字典<字符串,对象>词典=新词典<字符串,对象>();                 的for(int i = 0; I< MAXITEMS;我++)                 {                     dictionary.Add(i.ToString(),I);                 }                 //Console.WriteLine(dictionary.Count);                 // C5                 C5.HashDictionary<字符串,对象> c5dictionary =新C5.HashDictionary<字符串,对象>();                 的for(int i = 0; I< MAXITEMS;我++)                 {                     c5dictionary.Add(i.ToString(),I);                 }                 //Console.WriteLine(c5dictionary.Count);                 //如何更改为只读?                 // F#                 FSharpMap<字符串,对象> fdictionary =新FSharpMap<字符串,对象>(listoftuples);                 //Console.WriteLine(fdictionary.Count);                 // 测试 - - - - - - - - - - - - - - - - - - - - - - - - -------------------------------------------------                 秒表SW = Stopwatch.StartNew();                 对于(INT指数= 0,indexMax = randomIndexes.Length;指数< indexMax;指数++)                 {                     串I = randomIndexes [指数]                     对象的值;                     dictionary.TryGetValue(我,超时值);                 }                 sw.Stop();                 Console.WriteLine(SCGD - {0,10} MS - {1,10}蜱,sw.ElapsedMilliseconds,sw.ElapsedTicks);                 秒表c5sw = Stopwatch.StartNew();                 对于(INT指数= 0,indexMax = randomIndexes.Length;指数< indexMax;指数++)                 {                     字符串键= randomIndexes [指数]                     对象的值;                     c5dictionary.Find(REF键,超时值);                 }                 c5sw.Stop();                 Console.WriteLine(C5 - {0,10} MS - {1,10}蜱,c5sw.ElapsedMilliseconds,c5sw.ElapsedTicks);                 秒表ISW = Stopwatch.StartNew();                 对于(INT指数= 0,indexMax = randomIndexes.Length;指数< indexMax;指数++)                 {                     串I = randomIndexes [指数]                     对象的值;                     idictionary.TryGetValue(我,超时值);                 }                 isw.Stop();                 Console.WriteLine(入境处 - {0,10} MS - {1,10}蜱,isw.ElapsedMilliseconds,isw.ElapsedTicks);                 秒表FSW = Stopwatch.StartNew();                 对于(INT指数= 0,indexMax = randomIndexes.Length;指数< indexMax;指数++)                 {                     串I = randomIndexes [指数]                     fdictionary.TryFind(ⅰ);                 }                 fsw.Stop();                 Console.WriteLine(FS - {0,10} MS - {1,10}蜱,fsw.ElapsedMilliseconds,fsw.ElapsedTicks);             }         }     } }

解决方案

标准字典已经是相当不错的优化。它确实做到了,当你做一个查找是计算所提供的密钥(其速度取决于键的类型以及它是如何实现的哈希 GetHash code ),对散列值的模操作以找到合适的桶,然后将其通过桶的内容迭代,直到找到合适的值(其速度取决于 GetHash $的质量C $ç的功能,因此,如果桶是很好的平衡,并且不包含太多项目,而等于法的速度类型)。

所有的一切,它不会做那么多的查找,所以我不认为你能够找到一个显著更快的通用数据结构。然而,这是可能的,这取决于你打算如何使用字典,你能够建立一个更加的专业的解决方案。例如,我需要一个真正的快速查找,其中的关键是一个类型。而不是使用字典和做词典[typeof运算(T)] ,我做了一个泛型类,如下所示:

 类ValueStore< T>
{
  公共静态的T值;
}
 

所以,我可能只是做 ValueStore< T> .value的以pretty的很多零查找开销。

无论你是否可以做同样的事情(以及它是否是值得的)真的取决于你的用例;多少项目的结构将保持多久它被读取和写入,是否需要是线程,多么重要写入速度等。例如,如果需要的话写入速度没有在所有问题,但线程安全,你需要做一个副本上写的,其中数据结构不会被写入,而是复制,保证线程安全和无锁(是这样的:快)读取,在写入速度为代价。专业它可以尽可能重新排序上写的结构去优化它,因此散列桶不包含N多的项目。

PS:如果你真的绝望了速度,但不能建立一个更专业的数据结构,那么你也许可以得到复制小的收益词典< TKEY的,TValue> 和消除各种完整性检查(null检查和等)和虚拟/接口方法的调用。不过,我怀疑这会给你任何超过20%的涨幅,如果这一点。

Our application uses plenty of dictionaries which have multi level lookup that are not frequently changing. We are investigating at converting some of the critical code that does a lot of lookup using dictionaries to use alternate structures for - faster lookup, light on the memory/gc. This made us compare various dictionaries/libraries available - Dictionary(System.Collections.Generics.Dictionary-SCGD), ImmutableDictionary, C5.HashDictionary, FSharpMap. Running the following program with various items count - 100, 1000, 10000, 100000 - shows that Dictionary is still the winner at most ranges. First row indicates items in collection. MS/Ticks would the time taken to perform x lookups randomized (code is below).

Items - 100
SCGD - 0 MS - 50 Ticks
C5 - 1 MS - 1767 Ticks
Imm - 4 MS - 5951 Ticks
FS - 0 MS - 240 Ticks

Items - 1000
SCGD - 0 MS - 230 Ticks
C5 - 0 MS - 496 Ticks
Imm - 0 MS - 1046 Ticks
FS - 1 MS - 1870 Ticks

Items - 10000
SCGD - 3 MS - 4722 Ticks
C5 - 4 MS - 6370 Ticks
Imm - 9 MS - 13119 Ticks
FS - 15 MS - 22050 Ticks

Items - 100000
SCGD - 62 MS - 89276 Ticks
C5 - 72 MS - 103658 Ticks
Imm - 172 MS - 246247 Ticks
FS - 249 MS - 356176 Ticks

Does this mean, we are already using the fastest and don't have to change? I had presumed that immutable structures should be at the top of table, but it wasn't that way. Are we doing wrong comparison or am I missing something? Was holding on to this question, but felt, its better to ask. Any link or notes or any references that throws some light will be great.

Complete code for test -

namespace CollectionsTest
{
    using System;
    using System.Collections.Generic;
    using System.Collections.Immutable;
    using System.Diagnostics;
    using System.Linq;
    using System.Text;
    using System.Runtime;
    using Microsoft.FSharp.Collections;

    /// <summary>
    /// 
    /// </summary>
    class Program
    {
        static Program()
        {
            ProfileOptimization.SetProfileRoot(@".\Jit");
            ProfileOptimization.StartProfile("Startup.Profile");
        }

        /// <summary>
        /// Mains the specified args.
        /// </summary>
        /// <param name="args">The args.</param>
        static void Main(string[] args)
        {
            // INIT TEST DATA ------------------------------------------------------------------------------------------------

            foreach (int MAXITEMS in new int[] { 100, 1000, 10000, 100000 })
            {
                Console.WriteLine("\n# - {0}", MAXITEMS);

                List<string> accessIndex = new List<string>(MAXITEMS);
                List<KeyValuePair<string, object>> listofkvps = new List<KeyValuePair<string, object>>();
                List<Tuple<string, object>> listoftuples = new List<Tuple<string, object>>();
                for (int i = 0; i < MAXITEMS; i++)
                {
                    listoftuples.Add(new Tuple<string, object>(i.ToString(), i));
                    listofkvps.Add(new KeyValuePair<string, object>(i.ToString(), i));
                    accessIndex.Add(i.ToString());
                }

                // Randomize for lookups
                Random r = new Random(Environment.TickCount);
                List<string> randomIndexesList = new List<string>(MAXITEMS);
                while (accessIndex.Count > 0)
                {
                    int index = r.Next(accessIndex.Count);
                    string value = accessIndex[index];
                    accessIndex.RemoveAt(index);

                    randomIndexesList.Add(value);
                }

                // Convert to array for best perf
                string[] randomIndexes = randomIndexesList.ToArray();

                // LOAD ------------------------------------------------------------------------------------------------

                // IMMU
                ImmutableDictionary<string, object> idictionary = ImmutableDictionary.Create<string, object>(listofkvps);
                //Console.WriteLine(idictionary.Count);

                // SCGD
                Dictionary<string, object> dictionary = new Dictionary<string, object>();
                for (int i = 0; i < MAXITEMS; i++)
                {
                    dictionary.Add(i.ToString(), i);
                }
                //Console.WriteLine(dictionary.Count);

                // C5
                C5.HashDictionary<string, object> c5dictionary = new C5.HashDictionary<string, object>();
                for (int i = 0; i < MAXITEMS; i++)
                {
                    c5dictionary.Add(i.ToString(), i);
                }
                //Console.WriteLine(c5dictionary.Count);
                // how to change to readonly?

                // F#
                FSharpMap<string, object> fdictionary = new FSharpMap<string, object>(listoftuples);
                //Console.WriteLine(fdictionary.Count);

                // TEST ------------------------------------------------------------------------------------------------
                Stopwatch sw = Stopwatch.StartNew();
                for (int index = 0, indexMax = randomIndexes.Length; index < indexMax; index++)
                {
                    string i = randomIndexes[index];
                    object value;
                    dictionary.TryGetValue(i, out value);
                }
                sw.Stop();
                Console.WriteLine("SCGD - {0,10} MS - {1,10} Ticks", sw.ElapsedMilliseconds, sw.ElapsedTicks);

                Stopwatch c5sw = Stopwatch.StartNew();
                for (int index = 0, indexMax = randomIndexes.Length; index < indexMax; index++)
                {
                    string key = randomIndexes[index];
                    object value;
                    c5dictionary.Find(ref key, out value);
                }
                c5sw.Stop();
                Console.WriteLine("C5   - {0,10} MS - {1,10} Ticks", c5sw.ElapsedMilliseconds, c5sw.ElapsedTicks);

                Stopwatch isw = Stopwatch.StartNew();
                for (int index = 0, indexMax = randomIndexes.Length; index < indexMax; index++)
                {
                    string i = randomIndexes[index];
                    object value;
                    idictionary.TryGetValue(i, out value);
                }
                isw.Stop();
                Console.WriteLine("Imm  - {0,10} MS - {1,10} Ticks", isw.ElapsedMilliseconds, isw.ElapsedTicks);


                Stopwatch fsw = Stopwatch.StartNew();
                for (int index = 0, indexMax = randomIndexes.Length; index < indexMax; index++)
                {
                    string i = randomIndexes[index];
                    fdictionary.TryFind(i);
                }
                fsw.Stop();
                Console.WriteLine("FS   - {0,10} MS - {1,10} Ticks", fsw.ElapsedMilliseconds, fsw.ElapsedTicks);
            }
        }
    }
}

解决方案

The standard dictionary is already quite well optimized. All it really does when you do a lookup is calculate the hash of the provided key (the speed of which depends on the type of the key and how it implements GetHashCode), a modulo operation on the hash value to find the right bucket and then it iterates through the contents of the bucket until it finds the right value (the speed of which depends on the quality of the GetHashCode function, so if the buckets are well balanced and don't contain too many items, and the speed of the Equals method for the type).

All in all, it doesn't do that much for lookups, so I don't think you'll be able to find a significantly faster generic data structure. However, it's possible that depending on how you plan to use the dictionaries, you're able to build a more specialized solution. For example, I needed a really fast lookup where the key was a type. Instead of using a dictionary and doing dictionary[typeof(T)], I made a generic class like so:

class ValueStore<T> 
{
  public static T Value;
}

So I could just do ValueStore<T>.Value with pretty much zero lookup overhead.

Whether or not you could do something similar (and whether it's worth it) really depends on your usecase; how many items the structure would hold, how often it's being read and written to, whether it needs to be threadsafe, how important write speed is, etc. For example, if write speed didn't matter at all but if thread safety was required, you would need to do a copy-on-write, where the data structure is never written to but instead copied, ensuring thread safety and lockless (thus: fast) reads, at the cost of write speeds. Specializing it could go as far as reordering the structure on writes to optimize it so the hash buckets don't contain more than N items.

PS: if you were really desperate for speed but couldn't build a more specialized data structure then you could perhaps get small gains from copying Dictionary<TKey,TValue> and removing the various sanity checks (null checks and such) and virtual/interface method calls. However, I doubt this would give you any more than 20% gain, if that.

这篇关于不可变字典VS字典VS C5 VS F# - 性能的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆