strlen 性能实现 [英] strlen performance implementation

查看：76 发布时间：2021/6/15 19:06:10 c string performance strlen

本文介绍了strlen 性能实现的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

这是一个多用途问题:

这与 glibc strlen 实现相比如何?立>
有没有更好的方法来解决这个问题，也可以用于自动向量化.

#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Todo: Document */
#define WORD_ONES_LOW   ((size_t)-1 / UCHAR_MAX)
#define WORD_ONES_HIGH  (((size_t)-1 / UCHAR_MAX) << (CHAR_BIT - 1))

/*@doc
 * @desc: see if an arch word has a zero
 * #param: w - string aligned to word size
 */
static inline bool word_has_zero(const size_t *w)
{
    return ((*w - WORD_ONES_LOW) & ~*w & WORD_ONES_HIGH);
}

/*@doc
 * @desc: see POSIX strlen()
 * @param: s - string
 */
size_t strlen(const char *s)
{
    const char *z = s;

    /* Align to word size */
    for (; ((uintptr_t)s & (sizeof(size_t) - 1)) && *s != '\0'; s++);

    if (*s != '\0') {
        const size_t *w;

        for (w = (const size_t *)s; !word_has_zero(w); w++);
        for (s = (const char *)w; *s != '\0'; s++);
    }

    return (s - z);
}

推荐答案

嗯，这个实现基于几乎相同的技巧 (确定一个单词是否具有零字节) 作为您链接的 glibc 实现.他们几乎做同样的事情，除了在 glibc 版本中一些循环被展开并且位掩码被明确地拼写出来.您发布的代码中的 ONES 和 HIGHS 正是 himagic = 0x80808080L 和 lomagic = 0x01010101L 形成 glibc 版本.

Well, this implementation is based on virtually the same trick (Determine if a word has a zero byte) as the glibc implementation you linked. They do pretty much the same thing, except that in glibc version some loops are unrolled and bit masks are spelled out explicitly. The ONES and HIGHS from the code you posted is exactly himagic = 0x80808080L and lomagic = 0x01010101L form glibc version.

我看到的唯一区别是 glibs 版本使用稍微不同的标准来检测零字节

The only difference I see is that glibs version uses a slightly different criterion for detecting a zero byte

if ((longword - lomagic) & himagic)

不做... &~longword(与示例中的 HASZERO(x) 宏相比，它与 x 做同样的事情，但还包括 ~(x) 成员).显然 glibc 的作者认为这个较短的公式更有效.然而，它可能导致误报.所以他们检查 if 下的误报.

without doing ... & ~longword (compare to HASZERO(x) macro in your example, which does the same thing with x, but also includes ~(x) member). Apparently glibc authors believed this shorter formula is more efficient. Yet it can result in false positives. So they check for false positives under that if.

这确实是一个有趣的问题，哪个更有效:单阶段精确测试(您的代码)或两阶段测试，首先进行粗略的不精确检查，然后在必要时进行精确的第二次检查(glibc 代码)).

It is indeed an interesting question, what is more efficient: a single-stage precise test (your code) or a two-stage test that begins with rough imprecise check followed, if necessary, by a precise second check (glibc code).

如果您想查看它们在实际性能方面的比较情况 - 在您的平台和数据上对它们进行计时.没有别的办法了.

If you want to see how they compare in terms of actual performance - time them on your platform and your data. There's no other way.

这篇关于strlen 性能实现的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

strlen 性能实现 [英] strlen performance implementation

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

strlen 性能实现 [英] strlen performance implementation

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

登录关闭