.NET GC Stalling Desktop Application - Performance Issue

Question

I am working on a large Windows desktop application that stores a large amount of data in the form of a project file. We have a custom ORM and serialization layer to efficiently load the object data from CSV format. This task is performed by multiple threads that process multiple files in parallel. Our large projects can contain a million or more objects, with many relationships between them.

Recently I was tasked with improving project-open performance, which had deteriorated for very large projects. Profiling showed that most of the time spent can be attributed to garbage collection (GC).

My theory is that, due to the large number of very fast allocations, the GC is starved and postponed for a very long time, and when it finally kicks in it takes a very long time to do the job. That idea was further supported by two seemingly contradictory observations:

  1. Optimizing the deserialization code to work faster only made things worse
  2. Inserting Thread.Sleep calls at strategic places made the load go faster

An example of a slow load showed 7 generation 2 collections and a huge percentage of time in GC (profiler screenshot not included).

An example of a fast load, with sleep periods in the code to allow the GC some time, showed 19 generation 2 collections and more than double the number of generation 0 and generation 1 collections (profiler screenshot not included).
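The collection counts quoted above came from a profiler, but similar numbers can be sampled from code to compare runs. The following is a minimal sketch, not the asker's setup: `GcCounters` and `Measure` are hypothetical names, while `GC.CollectionCount` and `Stopwatch` are the standard .NET APIs. It assumes a compiler and runtime with value tuple support.

```csharp
using System;
using System.Diagnostics;

static class GcCounters
{
    // Hypothetical helper: runs `load` and reports how many collections of
    // each generation were counted while it ran, plus the elapsed time.
    public static (int Gen0, int Gen1, int Gen2, TimeSpan Elapsed) Measure(Action load)
    {
        int gen0Before = GC.CollectionCount(0);
        int gen1Before = GC.CollectionCount(1);
        int gen2Before = GC.CollectionCount(2);

        var stopwatch = Stopwatch.StartNew();
        load();
        stopwatch.Stop();

        return (GC.CollectionCount(0) - gen0Before,
                GC.CollectionCount(1) - gen1Before,
                GC.CollectionCount(2) - gen2Before,
                stopwatch.Elapsed);
    }
}
```

Wrapping the project-open call in `GcCounters.Measure(...)` before and after a change gives a quick way to see whether the number of generation 2 collections moved, without attaching a profiler each time.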

So, my question is: how do I prevent this GC starvation? Adding Thread.Sleep looks silly, and it is very difficult to guess the right number of milliseconds in the right place. My other idea would be to use GC.Collect, but that also poses the difficulty of how many calls to add and where to put them. Any other ideas?

Recommended answer

Based on the comments, I'd guess that you are doing a ton of String.Substring() operations as part of CSV parsing. Each of these creates a new string instance, which I'd bet you then throw away after parsing it further into an integer, a date, or whatever you need. You almost certainly need to start thinking about a different persistence mechanism (CSV has a lot of shortcomings that you are undoubtedly aware of), but in the meantime you will want to look into parsers that do not allocate substrings. If you dig into the code for Int32.TryParse, you'll find that it iterates over the characters to avoid allocating more strings. I'd bet that you could spend an hour writing a version that takes start and end parameters; then you can pass it the whole line with offsets and avoid a Substring call for each individual field value. Doing that will save you millions of allocations.
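As a rough illustration of the offset-based parser suggested above, here is a minimal sketch. `FieldParser`, `TryParseInt32`, and `SumFields` are hypothetical names, not part of the .NET class library, and the parser makes simplifying assumptions: ASCII digits only, an optional leading '-', no whitespace, no quoting, and no culture handling. On newer runtimes (.NET Core 2.1 and later), `int.TryParse` also has overloads that accept a `ReadOnlySpan<char>`, which may make a hand-written helper unnecessary.

```csharp
using System;

static class FieldParser
{
    // Hypothetical helper: parses the integer in line[start..end) in place,
    // so no substring is allocated for the field.
    public static bool TryParseInt32(string line, int start, int end, out int value)
    {
        value = 0;
        if (line == null || start < 0 || end > line.Length || start >= end)
            return false;

        int i = start;
        bool negative = line[i] == '-';
        if (negative && ++i == end)
            return false; // a lone '-' is not a number

        long acc = 0; // long accumulator keeps the overflow check simple
        for (; i < end; i++)
        {
            int digit = line[i] - '0';
            if (digit < 0 || digit > 9)
                return false;
            acc = acc * 10 + digit;
            if (acc > (long)int.MaxValue + 1)
                return false; // magnitude too large even for int.MinValue
        }

        if (negative)
            acc = -acc;
        else if (acc > int.MaxValue)
            return false;

        value = (int)acc;
        return true;
    }

    // Walks a comma-separated line by offsets and sums the integer fields,
    // skipping fields that do not parse. Assumes no quoted commas.
    public static long SumFields(string line)
    {
        long sum = 0;
        int start = 0;
        while (start <= line.Length)
        {
            int end = line.IndexOf(',', start);
            if (end < 0) end = line.Length;
            if (TryParseInt32(line, start, end, out int field))
                sum += field;
            start = end + 1;
        }
        return sum;
    }
}
```

`SumFields` only shows the calling pattern: the line is scanned once with `IndexOf`, and each field is parsed from its offsets, so the only string involved is the line itself.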
