How to avoid long-living strings causing generation 2 garbage collection


Problem description


I have an application that keeps log strings in circular buffers. Once a buffer is full, every new insert releases an old string for garbage collection, but by then that string has already been promoted to generation 2. Thus, eventually a generation 2 GC will happen, which I would like to avoid.

I tried to marshal the string into a struct. Surprisingly, I still get generation 2 GCs. It seems the struct still keeps a reference to the string. The complete console app is below. Any help is appreciated.

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Runtime.InteropServices;
using System.Text;
using System.Threading.Tasks;

namespace ConsoleApplication
{
    class Program
    {

        [StructLayout(LayoutKind.Sequential)]
        public struct FixedString
        {
            [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 256)]
            private string str;

            public FixedString(string str)
            {
                this.str = str;
            }
        }

        [StructLayout(LayoutKind.Sequential)]
        public struct UTF8PackedString
        {
            private int length;

            [MarshalAs(UnmanagedType.ByValArray, SizeConst = 256)]
            private byte[] str;

            public UTF8PackedString(int length)
            {
                this.length = length;
                str = new byte[length];
            }

            public static implicit operator UTF8PackedString(string str)
            {
                var obj = new UTF8PackedString(Encoding.UTF8.GetByteCount(str));
                var bytes = Encoding.UTF8.GetBytes(str);
                Array.Copy(bytes, obj.str, obj.length);
                return obj;
            }
        }

        const int BufferSize = 1000000;
        const int LoopCount = 10000000;

        static void Main(string[] args)
        {
            Console.WriteLine("{0}\t{1}\t{2}\t{3}\t{4}",
                "Type".PadRight(20), "Time", "GC(0)", "GC(1)", "GC(2)");
            Console.WriteLine();
            for (int i = 0; i < 5; i++)
            {
                TestPerformance<string>(s => s);
                TestPerformance<FixedString>(s => new FixedString(s));
                TestPerformance<UTF8PackedString>(s => s);
                Console.WriteLine();
            }
            Console.ReadKey();
        }

        private static void TestPerformance<T>(Func<string, T> func)
        {
            var buffer = new T[BufferSize];
            GC.Collect(2);
            Stopwatch stopWatch = new Stopwatch();
            var initialCollectionCounts = new int[] { GC.CollectionCount(0), GC.CollectionCount(1), GC.CollectionCount(2) };
            stopWatch.Reset();
            stopWatch.Start();
            for (int i = 0; i < LoopCount; i++)
                buffer[i % BufferSize] = func(i.ToString());
            stopWatch.Stop();
            Console.WriteLine("{0}\t{1}\t{2}\t{3}\t{4}",
                typeof(T).Name.PadRight(20),
                stopWatch.ElapsedMilliseconds,
                (GC.CollectionCount(0) - initialCollectionCounts[0]),
                (GC.CollectionCount(1) - initialCollectionCounts[1]),
                (GC.CollectionCount(2) - initialCollectionCounts[2])
            );
        }
    }
}

Edit: Updated code with UnsafeFixedString that does the required work:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Runtime.InteropServices;
using System.Text;
using System.Threading.Tasks;

namespace ConsoleApplication
{
    class Program
    {
        public unsafe struct UnsafeFixedString
        {
            private int length;

            private fixed char str[256];

            public UnsafeFixedString(int length)
            {
                this.length = length;
            }

            public static implicit operator UnsafeFixedString(string str)
            {
                var obj = new UnsafeFixedString(str.Length);
                for (int i = 0; i < str.Length; i++)
                    obj.str[i] = str[i];                
                return obj;
            }
        }

        const int BufferSize = 1000000;
        const int LoopCount = 10000000;

        static void Main(string[] args)
        {
            Console.WriteLine("{0}\t{1}\t{2}\t{3}\t{4}",
                "Type".PadRight(20), "Time", "GC(0)", "GC(1)", "GC(2)");
            Console.WriteLine();
            for (int i = 0; i < 5; i++)
            {
                TestPerformance(s => s);
                TestPerformance<UnsafeFixedString>(s => s);
                Console.WriteLine();
            }
            Console.ReadKey();
        }

        private static void TestPerformance<T>(Func<string, T> func)
        {
            var buffer = new T[BufferSize];
            GC.Collect(2);
            Stopwatch stopWatch = new Stopwatch();
            var initialCollectionCounts = new int[] { GC.CollectionCount(0), GC.CollectionCount(1), GC.CollectionCount(2) };
            stopWatch.Reset();
            stopWatch.Start();
            for (int i = 0; i < LoopCount; i++)
                buffer[i % BufferSize] = func(String.Format("{0}", i));
            stopWatch.Stop();
            Console.WriteLine("{0}\t{1}\t{2}\t{3}\t{4}",
                typeof(T).Name.PadRight(20),
                stopWatch.ElapsedMilliseconds,
                (GC.CollectionCount(0) - initialCollectionCounts[0]),
                (GC.CollectionCount(1) - initialCollectionCounts[1]),
                (GC.CollectionCount(2) - initialCollectionCounts[2])
            );
        }
    }
}

Output on my computer is:

Type                    Time    GC(0)   GC(1)   GC(2)

String                  5746    160     71      19
UnsafeFixedString       5345    418     0       0

Solution

It should not be a surprise that a struct with a string field makes no difference here: a string field is always simply a reference to an object on the managed heap, specifically a string object somewhere. The string will still exist and will still cause a gen-2 collection eventually.
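One way to see the difference is to ask the runtime whether a struct type contains any managed references. The sketch below is only an illustration: the two struct names are hypothetical, and it assumes a runtime that provides `RuntimeHelpers.IsReferenceOrContainsReferences<T>()` (.NET Core 2.0 or later, not the .NET Framework the question targets) and a project compiled with unsafe code enabled.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical struct: the field is a reference to a string object on the heap.
public struct WithStringField
{
    public string Str;
}

// Hypothetical struct: the characters live inline in the struct itself.
public unsafe struct WithFixedBuffer
{
    public fixed char Str[100];
}

public static class ReferenceCheck
{
    public static void Main()
    {
        // A struct with a string field still gives the GC a reference to trace.
        Console.WriteLine(RuntimeHelpers.IsReferenceOrContainsReferences<WithStringField>()); // True

        // A struct with only a fixed buffer contains no references at all.
        Console.WriteLine(RuntimeHelpers.IsReferenceOrContainsReferences<WithFixedBuffer>()); // False
    }
}
```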

The only way to "fix" this is to not have it as an object at all; and the only way to do that (without going completely outside of managed memory) is to use a fixed buffer:

public unsafe struct FixedString
{
    private fixed char str[100];
}

Here, every FixedString instance has 200 bytes reserved for the data (100 chars at 2 bytes each). str is simply a relative offset to the char* that marks the start of this reservation. However, working with this is tricky, and requires unsafe code throughout. Also note that every FixedString reserves the same amount of space regardless of whether you actually want to store 3 characters or 170. To avoid memory issues, you would either need to use null-terminators, or store the payload length separately.

Note that in .NET 4.5, the <gcAllowVeryLargeObjects> support makes it possible to have a decent-sized array of such values (a FixedString[], for example), but you don't want to copy the data very often. To avoid that, you would want to always allow spare space in the array (so you don't copy the entire array just to add one item), and work with individual items via ref, i.e.
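As a rough sketch of the "store the payload length separately" option, something like the following could work. The names `InlineString`, `Capacity` and `From` are hypothetical, truncating over-long input is an assumption of this sketch rather than anything stated above, and the project has to be compiled with unsafe code enabled.

```csharp
using System;

// Hypothetical sketch: a fixed-capacity string stored inline in the struct.
public unsafe struct InlineString
{
    public const int Capacity = 100;   // characters, i.e. 200 bytes of payload

    private int length;                // payload length, stored separately
    private fixed char str[Capacity];  // inline buffer, not a heap reference

    public static InlineString From(string value)
    {
        var result = new InlineString();
        // Assumption: truncate rather than overrun the buffer.
        result.length = Math.Min(value.Length, Capacity);
        for (int i = 0; i < result.length; i++)
            result.str[i] = value[i];  // 'result' is a local, so no pinning is needed
        return result;
    }

    public override string ToString()
    {
        // 'this' may live inside a heap array, so pin the buffer before reading it.
        fixed (char* p = str)
            return new string(p, 0, length);
    }
}

public static class InlineStringDemo
{
    public static void Main()
    {
        InlineString s = InlineString.From("hello");
        Console.WriteLine(s.ToString());  // prints "hello"
    }
}
```

Note that ToString allocates a new string, so it is only meant for the occasional read-out; the values sitting in the buffer stay reference-free.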

FixedString[] data = ...
int index = ...
ProcessItem(ref data[index]);

void ProcessItem(ref FixedString item) {
    // ...
}

Here item is talking directly to the element in the array - we have not copied the data out at any point.

Now we only have one object - the array itself.
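The ref pattern above can be tried out with any struct. In this minimal sketch, `Entry` is a hypothetical stand-in for FixedString (it avoids unsafe code so the example stays small), and `Touch` plays the role of ProcessItem.

```csharp
using System;

// Hypothetical stand-in for FixedString; any struct behaves the same way.
public struct Entry
{
    public int Length;
    public long Ticks;
}

public static class RefDemo
{
    // Mutates the array element in place; the struct is never copied.
    public static void Touch(ref Entry item, int length)
    {
        item.Length = length;
        item.Ticks++;
    }

    public static void Main()
    {
        var data = new Entry[4];
        Touch(ref data[2], 7);

        Entry copy = data[2];  // by contrast, this copies the value out
        copy.Length = 99;      // so the array element is unaffected

        Console.WriteLine(data[2].Length);  // 7
        Console.WriteLine(data[2].Ticks);   // 1
    }
}
```

With a 200-byte struct the difference matters more: passing by value would copy the whole buffer on every call, while ref passes only the address of the element.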
