C# - Parallelizing While Loop with StreamReader causing High CPU


Problem Description

SemaphoreSlim sm = new SemaphoreSlim(10);

using (FileStream fileStream = File.OpenRead("..."))
using (StreamReader streamReader = new StreamReader(fileStream, Encoding.UTF8, true, 4096))
{
    String line;
    while ((line = streamReader.ReadLine()) != null)
    {
        sm.Wait();
        new Thread(() =>
        {
            doSomething(line);
            sm.Release();
        }).Start();
    }
}
MessageBox.Show("This should only show once doSomething() has done its LAST line.");

So, I have an extremely large file, and I want to execute code on every single line.

I want to do it in parallel, but with at most 10 lines being processed at a time.

My solution was to use SemaphoreSlim: wait before starting each thread, and release when the thread finishes. (Since the function is synchronous, the placement of .Release() works.)

The issue is that the code uses a LOT of CPU. Memory behaves as expected: instead of loading over 400 MB, it just goes up and down by a few MB every few seconds.

But CPU goes crazy: most of the time it is locked at 100% for a good 30 seconds, then it dips slightly and goes back up.

Since I don't want to load every line into memory, and want to run the code as the file is read, what's the best solution here?

I changed from new Thread(()=>{}).Start(); to Task.Factory.StartNew(()=>{}); as suggested in the comments. It seems that thread creation and destruction was causing the performance drop, and that appears to be right. After I moved to Task.Factory.StartNew, it runs at the same speed with the semaphore still in place, and its CPU usage is exactly like that of my Parallel.ForEach version of the code.
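The task-based variant described in this edit could look roughly like the sketch below. It is an illustration under stated assumptions, not the asker's actual code: the class name `ThrottledLineProcessor`, the method `ProcessLines`, and its parameters are hypothetical, and `Task.Run` is used in place of `Task.Factory.StartNew` for brevity. The sketch also does two things the original snippet does not. It copies `line` into a local before the lambda captures it, because `line` is declared outside the loop and a worker could otherwise observe a later line. And it re-acquires every semaphore slot after the loop, so the method returns only once the last line has been processed.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

static class ThrottledLineProcessor
{
    // Hypothetical helper: runs `work` on every line of `path`,
    // with at most `maxParallel` lines in flight at any time.
    public static void ProcessLines(string path, int maxParallel, Action<string> work)
    {
        var gate = new SemaphoreSlim(maxParallel);

        using (var reader = new StreamReader(path, Encoding.UTF8, true, 4096))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                gate.Wait();            // block until one of the slots is free
                string current = line;  // private copy for this work item
                Task.Run(() =>          // thread-pool thread instead of a new Thread
                {
                    try { work(current); }
                    finally { gate.Release(); } // free the slot even if work throws
                });
            }
        }

        // Taking back every slot means no worker is still running.
        for (int i = 0; i < maxParallel; i++)
        {
            gate.Wait();
        }
    }
}
```

Under these assumptions, calling `ThrottledLineProcessor.ProcessLines("...", 10, doSomething);` before `MessageBox.Show(...)` would show the message only after the last line has been handled.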

Recommended Answer

Your code creates a huge number of threads, which is inefficient. C# has easier ways of handling your scenario. One approach is:

File.ReadLines(path, Encoding.UTF8)
    .AsParallel().WithDegreeOfParallelism(10)
    .ForAll(doSomething);

  • File.ReadLines does not read the whole file up front; it reads it line by line.
  • Use WithDegreeOfParallelism to set the maximum number of concurrently executing tasks.
  • Use ForAll to invoke a method on each line.
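For reference, here is a self-contained sketch of the approach above, next to a `Parallel.ForEach` equivalent of the kind the question mentions. The class and method names (`ParallelLineProcessor`, `WithPlinq`, `WithParallelForEach`) and their parameters are hypothetical wrappers added for illustration; only the framework calls themselves come from .NET.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

static class ParallelLineProcessor
{
    // PLINQ form, as in the answer: lines are streamed from the file
    // and handed to at most `maxParallel` concurrent workers.
    public static void WithPlinq(string path, int maxParallel, Action<string> work)
    {
        File.ReadLines(path, Encoding.UTF8)
            .AsParallel()
            .WithDegreeOfParallelism(maxParallel)
            .ForAll(work); // returns only after every line has been processed
    }

    // Parallel.ForEach form with the same concurrency cap.
    public static void WithParallelForEach(string path, int maxParallel, Action<string> work)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };
        Parallel.ForEach(File.ReadLines(path, Encoding.UTF8), options, work);
    }
}
```

Both calls block until all lines are done, so a statement placed after them (such as the `MessageBox.Show` in the question) runs only once the last line has been processed. Neither form guarantees that lines are processed in file order.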
