C# Error: OutOfMemoryException - Reading a large text file and replacing from dictionary


Question

I'm new to C# and to object-oriented programming in general. I have an application that parses a text file.

The objective of the application is to read the contents of the provided text file and replace the matching values.

When a file of about 800 MB to 1.2 GB is provided as input, the application crashes with a System.OutOfMemoryException.

While researching, I came across a couple of answers that recommend changing the Target Platform to x64.

The same issue persists after changing the target platform.

Here is the code:

// Reading the text file
var _data = string.Empty;
using (StreamReader sr = new StreamReader(logF))
{
    _data = sr.ReadToEnd();
    sr.Dispose();
    sr.Close();
}

foreach (var replacement in replacements)
{
    _data = _data.Replace(replacement.Key, replacement.Value);
}

// Writing the text file
using (StreamWriter sw = new StreamWriter(logF))
{
    sw.WriteLine(_data);
    sw.Dispose();
    sw.Close();
}

The error points to:

_data = sr.ReadToEnd();

replacements is a dictionary. The Key contains the original word and the Value contains the word to replace it with.

Each Key is replaced with the Value of its KeyValuePair.

The approach being followed is to read the file, replace, and write.

I tried using a StringBuilder instead of a string, yet the application still crashed.

Can this be overcome by reading the file one line at a time, replacing, and writing? What would be an efficient and faster way of doing this?

Update: The system has 8 GB of memory, and while monitoring performance, memory usage spikes up to 100%.

@Tim Schmelter's answer works well.

However, memory utilization still spikes to over 90%. It could be due to the following code:

String[] arrayofLine = File.ReadAllLines(logF);
// Generating Replacement Information
Dictionary<int, string> _replacementInfo = new Dictionary<int, string>();
for (int i = 0; i < arrayofLine.Length; i++)
{
    foreach (var replacement in replacements.Keys)
    {
        if (arrayofLine[i].Contains(replacement))
        {
            arrayofLine[i] = arrayofLine[i].Replace(replacement, masking[replacement]);
            if (_replacementInfo.ContainsKey(i + 1))
            {
                _replacementInfo[i + 1] = _replacementInfo[i + 1] + "|" + replacement;
            }
            else
            {
                _replacementInfo.Add(i + 1, replacement);
            }
        }
    }
}

// Creating Replacement Information
StringBuilder sb = new StringBuilder();
foreach (var Replacement in _replacementInfo)
{
    foreach (var replacement in Replacement.Value.Split('|'))
    {
        sb.AppendLine(string.Format("Line {0}: {1} ---> \t\t{2}", Replacement.Key, replacement, masking[replacement]));
    }
}

// Writing the replacement information
if (sb.Length != 0)
{
    using (StreamWriter swh = new StreamWriter(logF_Rep.txt))
    {
        swh.WriteLine(sb.ToString());
        swh.Dispose();
        swh.Close();
    }
}
sb.Clear();

It finds the line numbers on which replacements were made. Can this be captured using Tim's code, in order to avoid loading the data into memory multiple times?

Recommended answer

If you have very large files, you should try MemoryMappedFile, which is designed for this purpose (files > 1 GB) and lets you read "windows" of a file into memory. But it's not easy to use.
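To make the "window" idea concrete, here is a minimal sketch that reads one slice of a file through a memory-mapped view. The class name MappedWindowDemo and the method ReadWindow are hypothetical names chosen for illustration, and the UTF-8 encoding is an assumption about the file. Note that a window boundary can fall in the middle of a line or of a multi-byte character, which is part of why this approach is harder to use for text replacement.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Text;

public static class MappedWindowDemo
{
    // Reads up to `length` bytes starting at byte `offset` and decodes them as UTF-8.
    // Only the requested window is mapped, not the whole file.
    public static string ReadWindow(string path, long offset, int length)
    {
        long fileLength = new FileInfo(path).Length;
        if (offset < 0 || offset >= fileLength || length <= 0)
            return string.Empty;

        // Clamp the window so it never extends past the end of the file.
        int count = (int)Math.Min(length, fileLength - offset);

        using (var mmf = MemoryMappedFile.CreateFromFile(
                   path, FileMode.Open, null, 0, MemoryMappedFileAccess.Read))
        using (var view = mmf.CreateViewStream(offset, count, MemoryMappedFileAccess.Read))
        {
            var buffer = new byte[count];
            int read = 0;
            while (read < count)
            {
                int n = view.Read(buffer, read, count - read);
                if (n == 0) break; // end of the view
                read += n;
            }
            return Encoding.UTF8.GetString(buffer, 0, read);
        }
    }
}
```

A caller would walk the file by advancing offset in fixed steps, handling any partial line at the end of each window before moving on.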

A simple optimization would be to read and replace line by line:

int lineNumber = 0;
var _replacementInfo = new Dictionary<int, List<string>>();

using (StreamReader sr = new StreamReader(logF))
{
    using (StreamWriter sw = new StreamWriter(logF_Temp))
    {
        while (!sr.EndOfStream)
        {
            string line = sr.ReadLine();
            lineNumber++;
            foreach (var kv in replacements)
            {
                bool contains = line.Contains(kv.Key);
                if (contains)
                {
                    List<string> lineReplaceList;
                    if (!_replacementInfo.TryGetValue(lineNumber, out lineReplaceList))
                        lineReplaceList = new List<string>();
                    lineReplaceList.Add(kv.Key);
                    _replacementInfo[lineNumber] = lineReplaceList;

                    line = line.Replace(kv.Key, kv.Value);
                }
            }
            sw.WriteLine(line);
        }
    }
}

At the end, you can use File.Copy(logF_Temp, logF, true); if you want to overwrite the old file.
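If the replacement report from the question's update is also needed, one option is to write each report entry as soon as a replacement is made, so neither the file contents nor the report are held in memory. The following is a sketch under stated assumptions, not a drop-in implementation: the class StreamingReplacer, the method ReplaceWithReport, and its parameter names are hypothetical, and every dictionary key is assumed to be a non-empty string. The report line format is copied from the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class StreamingReplacer
{
    // Streams inputPath to outputPath line by line, applying the replacements,
    // and appends one report line per (line, key) hit to reportPath.
    // Returns the number of report entries written.
    public static int ReplaceWithReport(
        string inputPath,
        string outputPath,
        string reportPath,
        IDictionary<string, string> replacements)
    {
        int lineNumber = 0;
        int entries = 0;

        using (var reader = new StreamReader(inputPath))
        using (var writer = new StreamWriter(outputPath))
        using (var report = new StreamWriter(reportPath))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                lineNumber++;
                foreach (var kv in replacements)
                {
                    if (!line.Contains(kv.Key))
                        continue;

                    line = line.Replace(kv.Key, kv.Value);
                    // Same format as the report in the question.
                    report.WriteLine("Line {0}: {1} ---> \t\t{2}", lineNumber, kv.Key, kv.Value);
                    entries++;
                }
                writer.WriteLine(line);
            }
        }
        return entries;
    }
}
```

Unlike the code in the question, this sketch always creates the report file, even when no replacement was made; the returned count can be used to delete it when it is zero. The output path must differ from the input path, since the input is still being read while the output is written.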
