printf performance issue in OpenMP


Problem description

I have been told not to use printf in OpenMP programs because it degrades the performance of a parallel simulation program.

I want to know what the substitute is, that is, how to display a program's output without using printf.

I have the following AES-128 simulation problem using OpenMP, which needs further comments: "Parallel simulation of AES in C using Openmp".

How can I output the ciphertext without degrading the simulation performance?

Thanks.

Recommended answer

You cannot have your cake and eat it too. Decide whether you want great parallel performance or whether it is important to see the algorithm's output while the parallel loop is running.

The obvious offline solution is to store the plaintexts, keys and ciphertexts in arrays. In your case that would require 119 MiB (= 650000*(3*4*16) bytes) in the original case and only 12 MiB in the case with 65000 trials. That is nothing a modern machine with GiBs of RAM cannot handle. The latter case even fits in the last-level cache of some server-class CPUs.
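The arithmetic behind those figures can be checked with a small sketch. The function and macro names below are hypothetical, and the 4 bytes per element is an assumption matching the `3*4*16` factor above (three arrays, 4-byte `int`, 16 elements per block).

```c
#include <stddef.h>

/* Assumptions for illustration: three arrays (plaintext, key, ciphertext),
   16 elements per block, 4 bytes per element (a typical int). */
#define ARRAYS_PER_TRIAL   3
#define ELEMENTS_PER_BLOCK 16
#define BYTES_PER_ELEMENT  4

/* Total bytes needed to buffer the plaintext, key and ciphertext of every trial. */
size_t storage_bytes(size_t trials)
{
    return trials * ARRAYS_PER_TRIAL * BYTES_PER_ELEMENT * ELEMENTS_PER_BLOCK;
}

/* Convert a byte count to MiB (1 MiB = 1024 * 1024 bytes). */
double bytes_to_mib(size_t bytes)
{
    return (double)bytes / (1024.0 * 1024.0);
}
```

Calling `bytes_to_mib(storage_bytes(650000))` gives roughly 119 MiB, and the same call with 65000 trials gives roughly 12 MiB.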

#define TRIALS 65000

int (*key)[16];
int (*pt)[16];
int (*ct)[16];

double timer;

key = malloc(TRIALS * sizeof(*key));
pt = malloc(TRIALS * sizeof(*pt));
ct = malloc(TRIALS * sizeof(*ct));

timer = -omp_get_wtime();

#pragma omp parallel for private(rnd,j)
for(i = 0; i < TRIALS; i++)
{
   ...

   for(j = 0; j < 4; j++)
   {
      key[i][4*j]   = (rnd[j] & 0xff);
      pt[i][4*j]    = key[i][4*j];
      key[i][4*j+1] = ((rnd[j] >> 8)  & 0xff) ; 
      pt[i][4*j+1]  = key[i][4*j+1];
      key[i][4*j+2] = ((rnd[j] >> 16) & 0xff) ;
      pt[i][4*j+2]  = key[i][4*j+2];
      key[i][4*j+3] = ((rnd[j] >> 24) & 0xff) ;
      pt[i][4*j+3]  = key[i][4*j+3];
   }

   encrypt(key[i],pt[i],ct[i]);
}

timer += omp_get_wtime();
printf("Encryption took %.6f seconds\n", timer);

// Now display the results serially
for (i = 0; i < TRIALS; i++)
{
    // display pt[i], key[i] -> ct[i]
}

free(key); free(pt); free(ct);
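The serial display step is left as pseudocode above. One possible way to fill it in is sketched below; `format_block` and `print_trial` are hypothetical helpers, not part of the original program, and the hex layout is only an assumption about how one might want the output to look.

```c
#include <stdio.h>

/* Hypothetical helper: write the 16 byte values of one block as 32 lowercase
   hex digits into out, which must have room for at least 33 chars. */
void format_block(const int block[16], char out[33])
{
    for (int k = 0; k < 16; k++)
        snprintf(out + 2 * k, 3, "%02x", (unsigned)(block[k] & 0xff));
}

/* Hypothetical helper: print one trial on a single line as
   "plaintext, key -> ciphertext". */
void print_trial(FILE *stream, const int pt[16], const int key[16],
                 const int ct[16])
{
    char p[33], k[33], c[33];

    format_block(pt, p);
    format_block(key, k);
    format_block(ct, c);
    fprintf(stream, "%s, %s -> %s\n", p, k, c);
}
```

Inside the serial loop, a call such as `print_trial(stdout, pt[i], key[i], ct[i]);` would then replace the pseudocode line. Because it runs after the parallel loop has finished, it does not interfere with the encryption threads.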

To see the speed-up, you have to measure only the time spent in the parallel region. If you also measure the time it takes to display the results, you will be back to where you started.
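A minimal sketch of that measurement pattern follows. Here `work` is a placeholder standing in for the real per-trial encryption, and `now_seconds` and `timed_compute` are hypothetical names. The fallback to `clock()` is only there so the sketch also compiles without OpenMP; in that case it reports CPU time rather than wall-clock time.

```c
#include <stddef.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Current time in seconds: omp_get_wtime() when built with OpenMP,
   otherwise CPU time from clock(). */
static double now_seconds(void)
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Placeholder for the real per-trial work (encrypt() in the answer above). */
static unsigned work(unsigned i)
{
    return i * 2654435761u;
}

/* Fill results[0..n-1], in parallel if OpenMP is enabled, and return the
   time spent in the loop only. Nothing is printed inside the timed section. */
double timed_compute(unsigned *results, size_t n)
{
    long i;
    double t = -now_seconds();

#ifdef _OPENMP
    #pragma omp parallel for
#endif
    for (i = 0; i < (long)n; i++)
        results[i] = work((unsigned)i);

    t += now_seconds();
    return t;
}
```

Any printing of `results` would happen after `timed_compute` returns, so the reported time reflects the computation alone.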
