如果在 C 中为 null 检查大量数据的最快方法? [英] Fastest way to check mass data if null in C?

查看:59
本文介绍了如果在 C 中为 null 检查大量数据的最快方法?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有大量数据,可能有 4MB.现在想检查是否所有位都是 0.

I have a mass of data, maybe 4MB. Now want to check if all bits in it are 0.

例如:数据如下:

void* data = malloc(4*1024*1024);
memset(data, 0, 4*1024*1024);

检查其中的所有位是否为 0.这是我的解决方案不够快:

Check if all bits in it are 0. Here is my solution which is not fast enough:

int dataisnull(char* data, int length)
{
    int i = 0;
    while(i<length){
        if (data[i]) return 0;
        i++;
    }
    return 1;
}

此代码可能有一些性能需要改进.例如,在 32/64 位机器中,一次检查 4/8 个字节可能会更快.

This code might have some things to improve in performance. For example, in 32/64 bits machine, checking 4/8 bytes at a time may be faster.

所以我想知道最快的方法是什么?

So I wonder what is the fastest way to do it?

推荐答案

您可以一次处理多个字节并展开循环:

You can handle multiple bytes at a time and unroll the loop:

int dataisnull(const void *data, size_t length) {
    /* assuming data was returned by malloc, thus is properly aligned */
    size_t i = 0, n = length / sizeof(size_t);
    const size_t *pw = data;
    const unsigned char *pb = data;
    size_t val;
#define UNROLL_FACTOR  8
#if UNROLL_FACTOR == 8
    size_t n1 = n - n % UNROLL_FACTOR;
    for (; i < n1; i += UNROLL_FACTOR) {
        val = pw[i + 0] | pw[i + 1] | pw[i + 2] | pw[i + 3] |
              pw[i + 4] | pw[i + 5] | pw[i + 6] | pw[i + 7];
        if (val)
            return 0;
    }
#endif
    val = 0;
    for (; i < n; i++) {
        val |= pw[i];
    }
    for (i = n * sizeof(size_t); i < length; i++) {
        val |= pb[i];
    }
    return val == 0;
}

根据您的具体问题,提前或延迟检测非零值可能更有效:

Depending on your specific problem, it might be more efficient to detect non zero values early or late:

  • 如果全零情况是最常见的,您应该将所有位计算到 val 累加器中,并仅在最后进行测试.
  • 如果全零的情况很少见,您应该更频繁地检查非零值.
  • If the all zero case is the most common, you should compute cumulate all bits into the val accumulator and test only at the end.
  • If the all zero case is rare, you should check for non zero values more often.

上面的展开版本是一种折衷方案,它根据 size_t 的大小每 64 或 128 字节测试一次非零值.

The unrolled version above is a compromise that tests for non zero values every 64 or 128 bytes depending on the size of size_t.

根据您的编译器和处理器,您可能会通过更少或更多展开来获得更好的性能.您还可以使用适用于您的特定架构的内在函数来利用向量类型,但它的可移植性较差.

Depending on your compiler and processor, you might get better performance by unrolling less or more. You could also use intrinsic functions available for your particular architecture to take advantage of vector types, but it would be less portable.

请注意,代码不会验证 data 指针是否正确对齐:

Note that the code does not verify proper alignment for the data pointer:

  • 它无法移植.
  • 它假定数据是通过 malloc 或类似方法分配的,因此对于任何类型都正确对齐.
  • it cannot be done portably.
  • it assumes the data was allocated via malloc or similar, hence properly aligned for any type.

一如既往,对不同的解决方案进行基准测试,看看它是否有真正的不同.这个函数可能根本不是瓶颈,编写一个复杂的函数来优化一个罕见的情况会适得其反,它使代码可读性降低,更可能包含错误,并且更不易维护.例如,如果您更改内存分配方案或使用静态数组,则数据对齐的假设可能不成立,那么该函数可能会调用未定义的行为.

As always, benchmark different solutions to see if it makes a real difference. This function might not be a bottleneck at all, writing a complex function to optimize a rare case is counterproductive, it makes the code less readable, more likely to contain bugs and much less maintainable. For example, the assumption on data alignment may not hold if you change memory allocation scheme or if you use static arrays, the function may invoke undefined behavior then.

这篇关于如果在 C 中为 null 检查大量数据的最快方法?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆