Azure Service Fabric Reliable Dictionary LINQ query very slow

Problem description

I have a Reliable Dictionary in a Service Fabric stateful service, and I am running a simple LINQ expression against it.
I am using the Ix-Async package to build an async enumerable.

using (ITransaction tx = this.StateManager.CreateTransaction())
{
    var result = (await customers.CreateLinqAsyncEnumerable(tx))
        .Where(x => x.Value.NameFirst != null && x.Value.NameFirst.EndsWith(n, StringComparison.InvariantCultureIgnoreCase))
        .Select(y => y.Value);

    return await result.ToList();
}

The data is organized into 2 partitions with around 75,000 records in each partition. I am using an Int64 range as the partition key. In the above code, "result.ToList()" takes around 1 minute to execute for each partition. Another weird thing is that the actual result is empty! The same query run in SQL Server returns rows with customer first names ending with "c". But this is beside the point. My biggest concern is the performance of the "ReliableDictionary" LINQ query.
Regards

Recommended answer

Reliable Dictionary periodically removes least recently used values from memory. This is to enable:

  • Large Reliable Dictionaries
  • Higher density: higher density of Reliable Collections per replica and higher density of replicas per node.

The trade-off is that this can increase read latencies: disk IO is required to retrieve values that are not cached in memory.

There are a couple of options to get lower latency on enumerations.

1) Key Filtered Enumeration: You can move the fields that you would like to use in your query into the TKey of the ReliableDictionary (NameFirst in the above example). This would allow you to use the CreateEnumerableAsync overload that takes a key filter. The key filter allows Reliable Dictionary to avoid retrieving values from disk for keys that do not match your query. One limitation of this approach is that TKey (and hence the fields inside it) cannot be updated.
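As a rough sketch of option 1, the query from the question could be rewritten against a key that carries NameFirst. CustomerKey, CustomerQueries and their members are hypothetical names introduced here for illustration; the sketch assumes the CreateEnumerableAsync(tx, filter, EnumerationMode) overload and the Service Fabric async enumerator from the Microsoft.ServiceFabric.Data packages, and it has not been measured against the 75,000-record partitions described above.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.Serialization;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.ServiceFabric.Data;
using Microsoft.ServiceFabric.Data.Collections;

// Hypothetical composite key: the queried field lives in the key, not the value.
[DataContract]
public sealed class CustomerKey : IComparable<CustomerKey>, IEquatable<CustomerKey>
{
    [DataMember] public long Id { get; private set; }
    [DataMember] public string NameFirst { get; private set; }

    public CustomerKey(long id, string nameFirst)
    {
        Id = id;
        NameFirst = nameFirst;
    }

    // Identity is the Id alone; NameFirst is carried along only for filtering.
    public int CompareTo(CustomerKey other) => other is null ? 1 : Id.CompareTo(other.Id);
    public bool Equals(CustomerKey other) => other != null && Id == other.Id;
    public override bool Equals(object obj) => Equals(obj as CustomerKey);
    public override int GetHashCode() => Id.GetHashCode();
}

public static class CustomerQueries
{
    // Looks only at the key, so it can run without loading any value.
    public static bool FirstNameEndsWith(CustomerKey key, string suffix) =>
        key.NameFirst != null &&
        key.NameFirst.EndsWith(suffix, StringComparison.InvariantCultureIgnoreCase);

    public static async Task<List<TValue>> FindByFirstNameSuffixAsync<TValue>(
        IReliableStateManager stateManager,
        IReliableDictionary<CustomerKey, TValue> customers,
        string suffix,
        CancellationToken cancellationToken)
    {
        var matches = new List<TValue>();
        using (ITransaction tx = stateManager.CreateTransaction())
        {
            // Assumed overload: keys rejected by the filter are skipped
            // before their values are fetched.
            var enumerable = await customers.CreateEnumerableAsync(
                tx, key => FirstNameEndsWith(key, suffix), EnumerationMode.Unordered);

            using (var enumerator = enumerable.GetAsyncEnumerator())
            {
                while (await enumerator.MoveNextAsync(cancellationToken))
                {
                    matches.Add(enumerator.Current.Value);
                }
            }
        }
        return matches;
    }
}
```

Because the key cannot be updated, renaming a customer in this layout would mean removing the old entry and adding a new one under a new key.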

2) In-memory Secondary Index using Notifications: Reliable Dictionary Notifications can be used to build any number of secondary indices. You could build a secondary index that keeps all of the values in memory, trading memory resources for lower read latency. Furthermore, since you have full control over the secondary index, you can keep it ordered (e.g. by the reverse of NameFirst in your example).
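To illustrate option 2, here is a minimal sketch of an index ordered by the reverse of NameFirst, so that an "ends with" query becomes a prefix range scan. ReversedNameIndex and its methods are hypothetical; the class uses only the base class library, and the wiring is an assumption: a Reliable Dictionary notification handler would call AddOrUpdate for added and updated items and Remove for removed ones. Upper-casing with ToUpperInvariant only approximates the culture-aware comparison in the question, and the character-wise reversal ignores surrogate pairs.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory secondary index: reversed first name -> dictionary keys.
public sealed class ReversedNameIndex<TKey>
{
    private readonly object gate = new object();
    private readonly SortedSet<string> reversedNames =
        new SortedSet<string>(StringComparer.Ordinal);
    private readonly Dictionary<string, HashSet<TKey>> keysByReversedName =
        new Dictionary<string, HashSet<TKey>>(StringComparer.Ordinal);
    private readonly Dictionary<TKey, string> reversedNameByKey =
        new Dictionary<TKey, string>();

    // Intended to be called from an item-added or item-updated notification.
    public void AddOrUpdate(TKey key, string nameFirst)
    {
        lock (gate)
        {
            RemoveCore(key);
            if (nameFirst == null) return;

            string reversed = Reverse(nameFirst);
            if (!keysByReversedName.TryGetValue(reversed, out HashSet<TKey> keys))
            {
                keys = new HashSet<TKey>();
                keysByReversedName.Add(reversed, keys);
                reversedNames.Add(reversed);
            }
            keys.Add(key);
            reversedNameByKey[key] = reversed;
        }
    }

    // Intended to be called from an item-removed notification.
    public void Remove(TKey key)
    {
        lock (gate) { RemoveCore(key); }
    }

    // "Ends with suffix" is "starts with reversed suffix" in this ordering.
    public List<TKey> FindBySuffix(string suffix)
    {
        string prefix = Reverse(suffix);
        var result = new List<TKey>();
        lock (gate)
        {
            if (reversedNames.Count == 0 ||
                StringComparer.Ordinal.Compare(prefix, reversedNames.Max) > 0)
            {
                return result;
            }
            // Matching names are contiguous in ordinal order, starting at the prefix.
            foreach (string name in reversedNames.GetViewBetween(prefix, reversedNames.Max))
            {
                if (!name.StartsWith(prefix, StringComparison.Ordinal)) break;
                result.AddRange(keysByReversedName[name]);
            }
        }
        return result;
    }

    private void RemoveCore(TKey key)
    {
        if (!reversedNameByKey.TryGetValue(key, out string reversed)) return;
        reversedNameByKey.Remove(key);
        HashSet<TKey> keys = keysByReversedName[reversed];
        keys.Remove(key);
        if (keys.Count == 0)
        {
            keysByReversedName.Remove(reversed);
            reversedNames.Remove(reversed);
        }
    }

    private static string Reverse(string value)
    {
        char[] chars = value.ToUpperInvariant().ToCharArray();
        Array.Reverse(chars);
        return new string(chars);
    }
}
```

The index stores keys rather than values to keep the sketch small; storing the values themselves, as the answer suggests, would avoid the follow-up reads at the cost of more memory. The index would also need to be rebuilt from the dictionary contents when a replica starts or is restored.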

We are also considering making Reliable Dictionary's in-memory TValue sweep policy configurable. With this, you will be able to configure the Reliable Dictionary to keep all values in memory if read latency is a priority for you.

Since in your scenario most of the enumeration time is spent on disk IO, you can also benefit from using a custom serializer, which can reduce the disk and network footprint.
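A compact binary encoding for the value type might look like the sketch below. The Customer type, its fields, and CompactCustomerSerializer are hypothetical, since the question does not show the real type. The Read and Write methods are shaped like the pair a Service Fabric state serializer is expected to provide, to the best of my understanding; registering the serializer with the state manager is deliberately left out.

```csharp
using System.IO;

// Hypothetical value type; the real Customer in the question is not shown.
public sealed class Customer
{
    public long Id;
    public string NameFirst;
    public string NameLast;
}

// Writes only the raw field data, with no type or member names in the payload.
public sealed class CompactCustomerSerializer
{
    public void Write(Customer value, BinaryWriter writer)
    {
        writer.Write(value.Id);
        WriteNullable(writer, value.NameFirst);
        WriteNullable(writer, value.NameLast);
    }

    public Customer Read(BinaryReader reader)
    {
        return new Customer
        {
            Id = reader.ReadInt64(),
            NameFirst = ReadNullable(reader),
            NameLast = ReadNullable(reader)
        };
    }

    // BinaryWriter.Write(string) rejects null, so a presence flag goes first.
    private static void WriteNullable(BinaryWriter writer, string text)
    {
        writer.Write(text != null);
        if (text != null) writer.Write(text);
    }

    private static string ReadNullable(BinaryReader reader)
    {
        return reader.ReadBoolean() ? reader.ReadString() : null;
    }
}
```

Whether this helps in practice depends on how large the serialized values are today; comparing payload sizes before and after on a sample of real records would show how much there is to gain.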

Thank you for your question.
