C program spends 77% of the time in _platform_memmove$VARIANT$Haswell

Question

I'm profiling some numerical code written in C (the profiler is Instruments and the compiler is clang, on Mac OS X 10.11.6). As much as 77.3% of the running time is spent in _platform_memmove$VARIANT$Haswell.

In the assembly output, that function is called through DYLD-STUB$$memcpy. However, there are no memcpy calls in my C code (I do have some malloc calls, though).

Going deeper, the assembly instruction rep seems to be responsible for most of that time. From this post, it seems that rep is not doing anything useful. Why does the compiler insert it? And where do the memcpy calls come from?

I also tried compiling with -g, but then _platform_memmove$VARIANT$Haswell no longer takes up almost all of the running time.

Recommended answer

After a bit more searching, I found the problem: I was passing a struct to a function by value, so the whole struct was copied on every call, hence the memcpy.

I changed the function to accept a pointer to the struct, which sped up my code by a factor of 5.
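A minimal sketch of that change, for illustration only. The struct `Samples`, its size `N_SAMPLES`, and the two function names are hypothetical stand-ins, since the original code is not shown. The assumption is that the struct is large enough for the compiler to copy it with a call to memcpy rather than with a few register moves.

```c
#include <assert.h>
#include <stddef.h>

#define N_SAMPLES 4096  /* hypothetical size; large enough that a copy is costly */

/* Hypothetical struct standing in for the one in the question. */
struct Samples {
    double values[N_SAMPLES];
    size_t count;
};

/* Pass by value: the whole struct (about 32 KB here) is copied for every
   call. Compilers commonly implement such a copy with a call to memcpy. */
double sum_by_value(struct Samples s)
{
    double total = 0.0;
    for (size_t i = 0; i < s.count; i++)
        total += s.values[i];
    return total;
}

/* Pass by pointer: only an address is passed, so nothing is copied.
   The const qualifier documents that the function does not modify the data. */
double sum_by_pointer(const struct Samples *s)
{
    assert(s != NULL);
    double total = 0.0;
    for (size_t i = 0; i < s->count; i++)
        total += s->values[i];
    return total;
}
```

A caller would switch from `sum_by_value(data)` to `sum_by_pointer(&data)`. Both return the same result; only the second avoids the per-call copy, which matters most when the function is called in a hot loop.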
