Bad multithreaded performance of C++ runtime streams (ofstream, ifstream, stringstream) on Windows XP and multi-core processors


Question

Hello everybody,

The C++ runtime streams perform badly when several threads read from or write to (separate) streams at the same time. I am using Visual Studio 2008 SP1.

As an example, the attached code produces output like this:

String stream sequentially:
Thread Id: 4940 StringStream 0.27
Thread Id: 4940 StringStream 0.268735
Thread Id: 4940 StringStream 0.268111
Thread Id: 4940 Total 0.815061
String stream parallel:
Thread Id: 6036 StringStream 8.37365
Thread Id: 2536 StringStream 9.92566
Thread Id: 5128 StringStream 9.93738
Finished

on an Intel Core 2 Duo 2.4 GHz machine running Windows XP SP3. That is, performing the accesses sequentially is about 10 times faster than performing them in parallel. On a quad core it is even worse (a 100 times slowdown). This is true not only for stringstream but also for ofstream and ifstream.

I tried the suggestions in http://msdn.microsoft.com/en-us/library/ms235505(VS.80).aspx (defining _CRT_DISABLE_PERFCRIT_LOCKS and calling _configthreadlocale(_ENABLE_PER_THREAD_LOCALE);), but it didn't change anything.

Analyzing the performance with a profiler shows that the CRT uses a global lock in the locale implementation to protect the reference counting:

  _CRTIMP2_PURE void __CLR_OR_THIS_CALL _Incref()
   { // safely increment the reference count
   _BEGIN_LOCK(_LOCK_LOCALE)
    if (_Refs < (size_t)(-1))
     ++_Refs;
   _END_LOCK()
   }

This causes a huge number of context switches on XP, which is the reason for the bad performance.
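
The contention pattern described above can be reproduced in isolation. The following is only a minimal sketch, not the CRT's code. It assumes a C++11 compiler (newer than Visual Studio 2008), and the names g_lock, count_with_global_lock and count_with_atomic are made up for illustration. It contrasts a counter guarded by one process-wide mutex, similar in spirit to what _Incref does with _LOCK_LOCALE, with a lock-free atomic increment:

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the CRT's process-wide locale lock.
static std::mutex g_lock;

// Every increment takes the same global mutex, so all threads serialize on it.
std::size_t count_with_global_lock(unsigned threads, std::size_t perThread)
{
    assert(threads > 0);
    std::size_t refs = 0;
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&refs, perThread] {
            for (std::size_t i = 0; i < perThread; ++i) {
                std::lock_guard<std::mutex> guard(g_lock);
                ++refs;
            }
        });
    }
    for (std::thread& th : pool)
        th.join();
    return refs;
}

// Same work with an atomic counter: no mutex is taken, so no thread
// ever has to block waiting for another one.
std::size_t count_with_atomic(unsigned threads, std::size_t perThread)
{
    assert(threads > 0);
    std::atomic<std::size_t> refs(0);
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&refs, perThread] {
            for (std::size_t i = 0; i < perThread; ++i)
                refs.fetch_add(1, std::memory_order_relaxed);
        });
    }
    for (std::thread& th : pool)
        th.join();
    return refs.load();
}
```

Calling count_with_global_lock(3, 100000) and count_with_atomic(3, 100000) both return 300000. Timing the two calls, for example with the CPerfTimer class from the attached code, gives a rough idea of what the shared lock costs on a given machine. How large the gap is depends on the OS scheduler and the hardware.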

On Vista, the performance is the same for sequential and parallel reads. So there is no speed-up, but at least no dramatic slowdown. I think this is due to the better thread scheduler on Vista.

Any suggestions?

Regards,
Bernd

// StringStream.cpp : Defines the entry point for the console application.
//

#define _CRT_DISABLE_PERFCRIT_LOCKS

#include "stdafx.h"

#include <Windows.h>
#include <stdio.h>
#include <tchar.h>
#include <iostream>
#include <fstream>
#include <sstream>
#include <vector>

class CPerfTimer
{
 const char* m_strMsg;
 DWORD m_dwThreadId;
 LARGE_INTEGER m_liStart;
public:

 CPerfTimer(const char* strMsg)
 {
  m_dwThreadId = ::GetCurrentThreadId();
  m_strMsg = strMsg;
  ::QueryPerformanceCounter(&m_liStart);
 }

 ~CPerfTimer(void)
 {
  LARGE_INTEGER liStop;
  ::QueryPerformanceCounter(&liStop);
  LARGE_INTEGER liFrequency;
  ::QueryPerformanceFrequency(&liFrequency);

  std::stringstream strStream;
  strStream << "Thread Id: " << m_dwThreadId << " " << m_strMsg << " " << double(liStop.QuadPart - m_liStart.QuadPart) / liFrequency.QuadPart << std::endl;
  std::cout << strStream.str();
 }
};

struct StateStringStream
{
 StateStringStream(int count)
  : Count(count)
 {
  Event = ::CreateEvent(NULL, FALSE, FALSE, NULL);
 }

 int Count;
 HANDLE Event;
};

DWORD WINAPI StringStream(LPVOID pParameter)
{
 const StateStringStream* state = static_cast<const StateStringStream*>(pParameter);

 try
 {
  CPerfTimer cPerfTimer("StringStream");

  std::stringstream stream;

  for (int i = 0; i < state->Count; ++i)
   stream << i << std::endl;

  stream.seekg(0, std::ios_base::beg);

  for (int i = 0; i < state->Count; ++i)
  {
   int j;
   stream >> j;
   if (i != j)
    throw std::exception("unexpected string stream read");
  }
 }
 catch (std::exception& ex)
 {
  std::cout << "exception: " << ex.what() << std::endl;
 }

 ::SetEvent(state->Event);
 return 0;
}

int _tmain(int argc, _TCHAR* argv[])
{
 // Configure per-thread locale to cause all subsequently created
 // threads to have their own locale.
 _configthreadlocale(_ENABLE_PER_THREAD_LOCALE);

 StateStringStream statesSS[3] = { StateStringStream(100000), StateStringStream(100000), StateStringStream(100000) };
 HANDLE handlesSS[3] = { statesSS[0].Event, statesSS[1].Event, statesSS[2].Event };

 {
  CPerfTimer cPerfTimer("Total");
  std::cout << "String stream sequentially: " << std::endl;
  for (int i = 0; i < 3; ++i)
   StringStream(&statesSS[i]);
 }

 std::cout << "String stream parallel: " << std::endl;
 for (int i = 0; i < 3; ++i)
 {
  ::ResetEvent(statesSS[i].Event);
  // WT_EXECUTEDEFAULT does not start several threads on WinXP but on Vista!
  ::QueueUserWorkItem(&StringStream, &statesSS[i], WT_EXECUTELONGFUNCTION);
 }

 ::WaitForMultipleObjects(3, handlesSS, TRUE, INFINITE);

 std::cout << "Finished" << std::endl;

 return 0;
}





 

Recommended answer

Yup, the CRT contains lots of fine-grained locks. There isn't much you can do about that; it dates from an era when library writers thought it was important to protect the programmer from getting it wrong. Modern libraries, notably .NET, moved away from that baby-sitting attitude and leave it entirely up to the programmer to get the locks right. That is partly for performance reasons, but also because fine-grained locks are typically not good enough to solve all concurrency problems.

I haven't tried your code, but do note that there is also a global lock on the heap allocator.  It could explain why Vista works better.
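
If the streams are only used to format and parse numbers, one possible way to sidestep the locale lock is to avoid iostreams in the hot loop altogether. The sketch below is an assumption-laden illustration, not a fix for the CRT: it requires a C++17 toolchain (std::to_chars and std::from_chars do not exist in Visual Studio 2008), and the function names write_numbers and read_numbers are invented here. Both conversion functions are specified to be locale-independent. It mirrors the write-then-read loop of the test program:

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <string>
#include <system_error>

// Append the integers [0, count) to a string, one per line, without iostreams.
std::string write_numbers(int count)
{
    std::string out;
    char buf[16];  // large enough for any 32-bit int, including the sign
    for (int i = 0; i < count; ++i) {
        std::to_chars_result r = std::to_chars(buf, buf + sizeof buf, i);
        assert(r.ec == std::errc());
        out.append(buf, static_cast<std::size_t>(r.ptr - buf));
        out.push_back('\n');
    }
    return out;
}

// Parse the text back and check that it holds 0, 1, ..., count - 1 in order.
bool read_numbers(const std::string& text, int count)
{
    const char* p = text.data();
    const char* end = p + text.size();
    for (int i = 0; i < count; ++i) {
        int j = 0;
        std::from_chars_result r = std::from_chars(p, end, j);
        if (r.ec != std::errc() || j != i)
            return false;
        p = r.ptr;
        if (p != end && *p == '\n')
            ++p;  // from_chars does not skip whitespace, so step over the separator
    }
    return true;
}
```

For example, read_numbers(write_numbers(100000), 100000) performs the same round trip as the StringStream function above. Note that growing the std::string still allocates from the heap, so the heap lock mentioned above can still play a role. Whether this helps in practice would have to be measured.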

