How does memchr() work under the hood?

Question

Background: I'm trying to create a pure D language implementation of functionality that's roughly equivalent to C's memchr but uses arrays and indices instead of pointers. The reason is so that std.string will work with compile-time function evaluation. For those of you unfamiliar with D, functions can be evaluated at compile time if certain restrictions are met. One restriction is that they can't use pointers. Another is that they can't call C functions or use inline assembly language. Having the string library work at compile time is useful for some compile-time code gen hacks.

Question: How does memchr work under the hood to perform as fast as it does? On Win32, anything that I've been able to create in pure D using simple loops is at least 2x slower, even with obvious optimization techniques such as disabling bounds checking, loop unrolling, etc. What kinds of non-obvious tricks are available for something as simple as finding a character in a string?

Recommended answer

I would suggest taking a look at GNU libc's source. As with most functions, it contains both a generic optimized C version of the function and optimized assembly language versions for as many supported architectures as possible, taking advantage of machine-specific tricks.
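To give a feel for what a "generic optimized C version" can look like, here is a minimal sketch of the word-at-a-time idea: test eight bytes per iteration with integer arithmetic, and fall back to a byte loop only once a word is known to contain a match. This is not glibc's code. The name memchr_swar is made up for illustration, and the sketch uses memcpy for unaligned loads where a tuned library version would more likely align the pointer first.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name; a sketch of the word-at-a-time technique,
   not the actual glibc implementation. */
static const void *memchr_swar(const void *s, int c, size_t n)
{
    const unsigned char *p = s;
    const unsigned char ch = (unsigned char)c;
    const uint64_t ones    = 0x0101010101010101ULL;
    const uint64_t highs   = 0x8080808080808080ULL;
    const uint64_t pattern = ones * ch;     /* target byte copied into all 8 lanes */

    while (n >= sizeof(uint64_t)) {
        uint64_t w;
        memcpy(&w, p, sizeof w);            /* unaligned load without undefined behavior */
        uint64_t x = w ^ pattern;           /* bytes equal to ch become 0x00 */
        if ((x - ones) & ~x & highs)        /* nonzero iff some byte of x is zero */
            break;                          /* match is somewhere in these 8 bytes */
        p += sizeof w;
        n -= sizeof w;
    }

    /* Byte loop: locates the match inside the flagged word,
       and also handles the tail shorter than one word. */
    for (; n > 0; p++, n--) {
        if (*p == ch)
            return p;
    }
    return NULL;
}
```

The zero-byte test uses only XOR, subtraction, and AND on integers, with no pointers needed inside the loop body, so the same idea may carry over to an array-and-index version in D if the data can be read a word at a time.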

The x86-64 SSE2 version combines the results from pcmpeqb on a whole cache-line of data at once (four 16B vectors), to amortize the overhead of the early-exit pmovmskb/test/jcc.
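The same structure can be sketched with SSE2 intrinsics in C. This is only an illustration of the "compare four vectors, OR the results, branch once" idea described above, not a transcription of glibc's assembly. The name memchr_sse2 is made up, unaligned loads are used for simplicity where the real code would presumably work on aligned cache lines, and a plain byte loop locates the exact position once a 64-byte block is known to contain a match.

```c
#include <emmintrin.h>   /* SSE2 intrinsics; x86 / x86-64 only */
#include <stddef.h>

/* Hypothetical name; a simplified sketch, not glibc's implementation. */
static const void *memchr_sse2(const void *s, int c, size_t n)
{
    const unsigned char *p = s;
    const unsigned char ch = (unsigned char)c;
    const __m128i needle = _mm_set1_epi8((char)ch);   /* ch in all 16 lanes */

    while (n >= 64) {
        /* Compare four 16-byte vectors (64 bytes) against the needle. */
        __m128i e0 = _mm_cmpeq_epi8(_mm_loadu_si128((const __m128i *)(p +  0)), needle);
        __m128i e1 = _mm_cmpeq_epi8(_mm_loadu_si128((const __m128i *)(p + 16)), needle);
        __m128i e2 = _mm_cmpeq_epi8(_mm_loadu_si128((const __m128i *)(p + 32)), needle);
        __m128i e3 = _mm_cmpeq_epi8(_mm_loadu_si128((const __m128i *)(p + 48)), needle);

        /* OR the four pcmpeqb results so that one movemask and one
           branch cover all 64 bytes. */
        __m128i any = _mm_or_si128(_mm_or_si128(e0, e1), _mm_or_si128(e2, e3));
        if (_mm_movemask_epi8(any) != 0)
            break;                  /* a match lies within the next 64 bytes */

        p += 64;
        n -= 64;
    }

    /* Byte loop: finds the match inside the flagged block, and
       handles the tail shorter than 64 bytes. */
    for (; n > 0; p++, n--) {
        if (*p == ch)
            return p;
    }
    return NULL;
}
```

The point of combining the compares is that the early-exit check costs one movemask and one conditional branch per 64 bytes instead of one per 16 bytes, or one per byte in a simple loop.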

gcc and clang are currently incapable of auto-vectorizing loops with if() break early-exit conditions, so they make naive byte-at-a-time asm from the obvious C implementation.
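For reference, the "obvious C implementation" presumably looks something like the sketch below (memchr_naive is a made-up name). It uses an index rather than pointer arithmetic in the loop, which is close to what an array-and-index version in D would do. The return inside the loop is the early-exit condition that the answer says blocks auto-vectorization.

```c
#include <stddef.h>

/* Hypothetical name; the straightforward byte-at-a-time search. */
static const void *memchr_naive(const void *s, int c, size_t n)
{
    const unsigned char *p = s;
    const unsigned char ch = (unsigned char)c;

    for (size_t i = 0; i < n; i++) {
        if (p[i] == ch)
            return p + i;   /* early exit on the first match */
    }
    return NULL;
}
```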
