How to output a fixed buffer as fast as possible?


Problem description

Sample code:

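#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sched.h>
#include <pthread.h>

int
main (int argc, char **argv)
{
  unsigned char buffer[128];
  char buf[0x4000];

  /* Placeholder contents; the real program fills the buffer with non-zero data. */
  memset (buffer, 0, sizeof buffer);

  /* Fully buffer stdout so the 128-byte writes are batched into 16 KiB flushes. */
  setvbuf (stdout, buf, _IOFBF, sizeof buf);

  /* Two forks yield four processes, all writing to the same stdout. */
  fork ();
  fork ();

  /* Request the highest round-robin real-time priority (needs privileges). */
  struct sched_param params;
  params.sched_priority = sched_get_priority_max (SCHED_RR);
  int err = pthread_setschedparam (pthread_self (), SCHED_RR, &params);
  if (err != 0)
    fprintf (stderr, "pthread_setschedparam: %s\n", strerror (err));

  while (1)
    {
      fwrite (buffer, sizeof buffer, 1, stdout);
    }
}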

This program forks into 4 processes, each of which writes the contents of "buffer" to stdout; the buffer is 128 bytes, or 16 long ints on a 64-bit CPU.
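
To make the buffering in the sample concrete: 0x4000 is 16384 bytes, which is exactly 128 copies of the 128-byte buffer, so with full buffering stdio should hand the kernel roughly one 16 KiB block per 128 fwrite calls. Below is a minimal sketch of the same batching done by hand with write(2). The helper names fill_block and write_all are hypothetical and not part of the original program, and whether this is faster than the stdio version was not measured here.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Fill dst (dst_len bytes) with back-to-back copies of pattern.
   Hypothetical helper: dst_len must be a multiple of pattern_len. */
static void fill_block(unsigned char *dst, size_t dst_len,
                       const unsigned char *pattern, size_t pattern_len)
{
    assert(pattern_len > 0 && dst_len % pattern_len == 0);
    for (size_t off = 0; off < dst_len; off += pattern_len)
        memcpy(dst + off, pattern, pattern_len);
}

/* Write all len bytes to fd, retrying on short writes and EINTR.
   Returns 0 on success, -1 on error (errno is left set by write). */
static int write_all(int fd, const unsigned char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing; retry */
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}
```

A caller would prepare an unsigned char block[0x4000] once with fill_block and then loop on write_all(STDOUT_FILENO, block, sizeof block), which skips the per-call copy into the stdio buffer.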

If I then run:

./writetest | pv -ptebaSs 800G >/dev/null

I get a speed of about 7.5 GB/s.

Incidentally, that is the same speed I get if I do:

$ mkfifo out
$ dd if=/dev/zero bs=16384 >out &
$ dd if=/dev/zero bs=16384 >out &
$ dd if=/dev/zero bs=16384 >out &
$ dd if=/dev/zero bs=16384 >out &
$ pv <out -ptebaSs 800G >/dev/null

Is there any way to make this faster? Note: the buffer in the real program is not filled with zeroes.

My curiosity is to understand how much data a single program (multithreaded or multiprocess) can output.

It looks like 4 people didn't understand this simple question. I even put the reason for the question in bold.

Recommended answer

Well, it seems that the Linux scheduler and I/O priorities played a big role in the slowdown.

Also, Spectre and other CPU vulnerability mitigations came into play.

After further optimization, to achieve a faster speed I had to tune these things:

1) program nice level (nice -n -20)
2) program ionice level (ionice -c 1 -n 7)
3) pipe size increased 8 times
4) CPU mitigations disabled by adding "pti=off spectre_v2=off l1tf=off" to the kernel command line
5) Linux scheduler tuning:

echo -n -1 >/proc/sys/kernel/sched_rt_runtime_us
echo -n -1 >/proc/sys/kernel/sched_rt_period_us
echo -n -1 >/proc/sys/kernel/sched_rr_timeslice_ms
echo -n 0 >/proc/sys/kernel/sched_tunable_scaling
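
The answer does not say how the nice level and pipe size were changed, so the following is only a sketch of how items 1 and 3 might be done from inside the program on Linux, assuming stdout is the pipe to pv. setpriority(2) and the fcntl(2) commands F_GETPIPE_SZ / F_SETPIPE_SZ are real Linux interfaces; the helper names raise_nice and grow_pipe are hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* exposes F_SETPIPE_SZ / F_GETPIPE_SZ (Linux only) */
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/resource.h>
#include <unistd.h>

/* Fallback in case the feature macro took effect too late; Linux values. */
#ifndef F_SETPIPE_SZ
#define F_SETPIPE_SZ 1031
#define F_GETPIPE_SZ 1032
#endif

/* In-process counterpart of "nice -n -20" for the calling process.
   Returns 0 on success, -1 on error (typically lacking privileges). */
static int raise_nice(void)
{
    return setpriority(PRIO_PROCESS, 0, -20);
}

/* Try to multiply the capacity of the pipe behind fd by factor.
   Returns the capacity the kernel actually set, or -1 on error, for
   example when fd is not a pipe or the request exceeds
   /proc/sys/fs/pipe-max-size for an unprivileged process. */
static int grow_pipe(int fd, int factor)
{
    assert(factor > 0);
    int current = fcntl(fd, F_GETPIPE_SZ);
    if (current < 0)
        return -1;
    return fcntl(fd, F_SETPIPE_SZ, current * factor);
}
```

In the sample program this would amount to calling raise_nice() and grow_pipe(STDOUT_FILENO, 8) once before the write loop and checking both return values, since either call can be refused without sufficient privileges.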

Now the program outputs (on the same PC) 8.00 GB/s!

If you have other ideas, you're welcome to contribute.
