什么是用于word_at_a_time.h中的has_zero和find_zero [英] what is has_zero and find_zero in word_at_a_time.h used for

查看:247
本文介绍了什么是用于word_at_a_time.h中的has_zero和find_zero的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

在linux内核中,inlucde / linux / word_at_a_time.h有两个函数:
$ b $ pre $ static inline long find_zero(unsigned长掩码)
{
long byte = 0;
#ifdef CONFIG_64BIT
if(mask>> 32)
mask>> = 32;
else
byte = 4;
#endif
if(mask>> 16)
mask>> = 16;
else
byte + = 2;
return(掩码>> 8)?字节:字节+ 1;


static inline bool has_zero(unsigned long val,
unsigned long * data,
const struct word_at_a_time * c)
{
unsigned long rhs = val | C-> low_bits;
* data = rhs;
return(val + c-> high_bits)& 〜RHS;
}

它用在散列函数中,在git log中说:


   -  has_zero():取一个单词,并确定它是否包含零字节。 
它获取单词,指向常量池的指针和一个指向
的指针,它可以设置一个中间的数据字段。


但我仍然没有收到

(1)这个函数做什么?和

(2)它们是怎么做的?

解决方案

假设位从最低有效位(LSB)(编号0)到最高有效位(MSB)编号。

它是什么?



函数 find_zero()搜索 first N (<= 7)个字节,其值为零,使用类似于divide and conquer的技术。

它是怎么做的?



假设一个64位系统,其中 CONFIG_64BIT 被定义,然后执行一段代码:

  #ifdef CONFIG_64BIT 
if(mask> ;> 32)//左边32位中的任何位= 1
掩码>> = 32;
else
byte = 4; //< - 不,第4个字节为零

第一个注释掩码无符号,所以>> 是无符号右边筛选。 (像Java的>>>



它首先检查掩码的值是否超过2 32(/或我们可以说如果在无符号长掩码位编号 32 63 之间的任何位。是一)。



掩码>> 32 会产生一个纯值,即掩码右移零 0 扩展名由 32 位,并使 32 高位包含零。


$ b例如$ b

,如果 maks 位如下:

  63 56 55 48 47 40 39 32 31 24 23 16 15 7 0 
▼▼▼▼▼▼▼$
0000 1000 0000 0000 0000 0000 0000 0000 0100 0000 0000 0000 0000 0000 0000 0001 0101
▲▲
MSB LSM

在本例中,位数 34 59 是1,所以(掩码>> 32) == true (或者说非零!0 )。因此,对于这个例子 if(mask>> 32) == if(!0)

在一般情况下,如果位数 32 63 中的任何一位为1,则 mask 会更新为 mask>> = 32; == mask = mask>> 32; (如果true和if语句执行)。这导致高位 32位由于右移而变为低位 32 位(并且位32至63变为<$在上面的例子中, mask 变为: 0 )。 p>

 移位OLD位编号----> 63 45 32 
63 47 32 31 15 0
▼▼b $ b 0000 0000 0000 0000 0000 0000 0000 0000 0000 1000 0000 0000 0000 0000 0000 0100
▲▲
MSB LSM
| ------------------------------------- |
|全部为零|

注意:从32到63的位是 0 和从 32 复制的 0 31 从上面到 63



这里需要:

案例1:

如果 32 63 中的任何一位是原始 > mask
,然后如果执行true并且屏蔽更新。 (如我上面解释的)。而变量长字节仍然是 0 。然后在下一个 if(掩码>> 16)中,函数 find_zero()将继续搜索一个字节在位范围内的值为零 48 63 (In if(mask>> 16),位编号 48 63 将检查是否有位为1 )。

情况2:

如果 32 63 在原始掩码中为零,然后 if 32) == if(0)),则c> condition变为false(即)和 mask 不更新,变量 long byte 变为 4 ,并且这表示在掩码中高四个字节是 0 。在 if(掩码>> 16)之后,函数 find_zero()将搜索一个零位范围 16 31


同样,在第二个 if()中:

  if(mask>> 16)
mask>> = 16;
else
byte + = 2;

它会检查位号 16到31 是一个或不是。如果所有位都是零,那么 byte 将在其他部分中增加 2 ,if-part掩码将被更新无符号右移16位。

注意:if mask 更新 mask code> then实际上这是 if(掩码>> 16)检查 48到63 是一个,否则在原始掩码中位 16 31

随后,最后一个条件运算符以类似的方式工作,以检查bit麻木 8 15 之间的任何位是一个或不是。

 长字节= 0; 
64位`mask`
|
|

if(从32到62的任何一位是1)--- +
| |
| False:if(0)| True:if(!0)
32到64之间的所有位| A字节=零未找到
为零有些位是位32- 63
| |
| |
byte = 4 byte = 0,
64 bit`mask`< - Orignal`mask`更新为`mask>> = 32;`
| |
| |
如果(从16到31的任何位都是1)if(从48到63的任何位都是1)
|由于高位是零|注意由于>> ;所有高位
|它检查编号为16-31的位|位变为零。
| |第二位检查16-31
|位|原来是
| |编号为48至63
| |
▼▼
Case False:if(0)Case False:if(0)
所有位从16-31开始零位从49-63
掩码不会更新并且掩码不会更新并且
字节= 6字节= 2

情况为真:if(!0)情况为假:if(!0)
有些位是从16 -31有些位是一个从49-63
掩码更新掩码更新
字节= 4字节= 0
| |
| |
▼▼
更多四个个案更多四个个案

共有8个不同的从0到7的不同答案输出

所以函数 find_zero() search first N 从左侧数值为零的字节。输出字节值可以从 0 7 ,并且不考虑最右边的字节( 8)。

(*注意:long是64位长= 8 * 8位= 8字节。*)

  byte NUMBER():
12345678
63 56 55 48 47 40 39 32 31 24 23 16 15 7 0
▼▼▼▼▼▼▼
0000 1000 0000 0000 0000 0000 0000 0000 01 0000 0000 0000 0000 0000 0000 0000 0001 0101
▲▲
MSB LSM

byte = 0 - >表示第一个字节不为零

byte = 1 - >高位字节为0,位编号 56 63

byte = 2 - >两个高位字节为0,位编号为 48 63

byte = 3 - >三个高位字节为零,位编号 40 63





同样,byte = 7 - >左边的七个字节是0,(或全部为0)。


流程图






代码



下面我写了一个小代码,它用不同的值调用 find_zero()函数,可以帮助您理解函数:

  int main(){
printf(\ nmask = 0x0,所有字节为零,结果=%ld,find_zero (0x0L)); //不打印8
printf(\ nmask = 0xff,剩下的7个字节为零,结果=%ld,find_zero(0xffL));
printf(\ nmask = 0xffff,left six bytes are zero,result =%ld,find_zero(0xffffL));
printf(\ nmask = 0xffffff,剩下的5个字节为零,结果=%ld,find_zero(0xffffffL));
printf(\ nmask = 0xffffffff,left four bytes are zero,result =%ld,find_zero(0xffffffffL));
printf(\ nmask = 0xffffffffff,左边三个字节为零,结果=%ld,find_zero(0xffffffffffL));
printf(\ nmask = 0xffffffffffff,left two bytes are zero,result =%ld,find_zero(0xffffffffffffL));
printf(\ nmask = 0xffffffffffffff,最左边一个字节为零,结果=%ld,find_zero(0xffffffffffffffL));
printf(\ nmask = 0xffffffffffffffff,无字节为零,结果=%ld,find_zero(0xffffffffffffffffL));
printf(\ nmask = 0x8000000000000000L,第一个字节不是零(所以不搜索),结果=%ld,find_zero(0x8000000000000000LL));
printf(\\\
);
返回1;
}

请观察输出以理解函数:

  mask = 0x0,所有字节为零,结果= 7 
掩码= 0xff,左边7个字节为零,结果= 7
掩码= 0xffff,左边六个字节为零,结果= 6
掩码= 0xffffff,剩下的五个字节为零,结果= 5
掩码= 0xffffffff,剩下的四个字节为零,结果= 4
掩码= 0xffffffffff,左边三个字节为零,结果= 3
掩码= 0xffffffffffff,左边两个字节为零,结果= 2
掩码= 0xffffffffffffff,最左边一个字节为零,结果= 1
掩码= 0xffffffffffffffff,没有字节为零,结果= 0
掩码= 0x8000000000000000L,第一个字节不为零(所以不搜索),结果= 0

注意:如果所有字节都为零,它会打印 7 不是8。

In linux kernel, inlucde/linux/word_at_a_time.h, there are two functions:

static inline long find_zero(unsigned long mask)
{
        long byte = 0;
#ifdef CONFIG_64BIT
        if (mask >> 32)
                mask >>= 32;
        else
                byte = 4;
#endif
        if (mask >> 16)
                mask >>= 16;    
        else
                byte += 2;
        return (mask >> 8) ? byte : byte + 1;
}

static inline bool has_zero(unsigned long val, 
                            unsigned long *data, 
                            const struct word_at_a_time *c)
{
        unsigned long rhs = val | c->low_bits;
        *data = rhs;
        return (val + c->high_bits) & ~rhs;
}

It's used in a hash function, in git log it says:

 - has_zero(): take a word, and determine if it has a zero byte in it.
   It gets the word, the pointer to the constant pool, and a pointer to
   an intermediate "data" field it can set.

But I still don't get

(1) What this function do?, and
(2) How they do it?

解决方案

Suppose bits are numbered from Least Significant Bit (LSB) (numbered 0) to Most Significant Bit (MSB).

What it does?

Function find_zero() search first N (<=7) bytes with value zero from left hand side using a technique similar to divide and conquer.

How it does?

Suppose a 64 bit-system where CONFIG_64BIT is defined, then following piece of code executes:

#ifdef CONFIG_64BIT
        if (mask >> 32)  //Any bit=1 in left 32 bits 
                mask >>= 32;
        else
                byte = 4;  //<--No, fist 4 bytes are zero

First note mask is unsigned, so >> is unsigned right sift. (like Java's >>>)

It first checks in if for mask value is more then 232 (or we can say if in unsigned long mask any bit between bit numbered 32 to 63 is one).

mask >> 32 will produces a pure value that is its mask right-shifted with zero 0 extension by 32 bits and causes the 32 high order bits to contain zero.

for example, if maks bits are as follows:

63     56 55     48 47     40 39     32 31     24 23     16 15        7       0
▼       ▼ ▼       ▼ ▼       ▼ ▼       ▼ ▼       ▼ ▼       ▼ ▼         ▼       ▼
0000 1000 0000 0000 0000 0000 0000 0100 0000 0000 0000 0000 0000 0000 0001 0101
▲                                                                             ▲
MSB                                                                         LSM

In this example, bit number 34 and 59 is one, so (mask >> 32) == true (or say non-zero !0). Hence for this example if (mask >> 32) == if(!0).
In genral, if any bit from bit number 32 to 63 is one then mask will be updated to mask >>= 32; == mask = mask >> 32; (as if true and if statement executes). This causes high order 32 bits becomes low order 32 bits due to right shift (and bits 32 to 63 becomes 0).

In above example mask becomes:

           shifted OLD bit number ----> 63                 45                32
63                  47               32 31                 15                 0
▼                   ▼                 ▼ ▼                   ▼                 ▼
0000 0000 0000 0000 0000 0000 0000 0000 0000 1000 0000 0000 0000 0000 0000 0100
▲                                                                             ▲
MSB                                                                         LSM
|-------------------------------------|
| All are zero                        |

Note: bits from 32 to 63 are 0 and bits from 0 to 31 copied from bits 32 to 63 from above.

Upto here:

Case-1:
If any bit from 32 to 63 is one in original mask then if executes true and mask updates. (as I explained above). And variable long byte remains 0. Then in next if(mask >> 16), function find_zero() will continue to search for a byte with value zero within bit range 48 to 63(In if(mask >> 16), bit numbered 48 to 63 will be checked for if any bit is 1).

Case-2:
If all bits from 32 to 63 are zero in original mask, then if condition becomes false (that is if(mask >> 32) == if(0)) and mask doesn't updates, and variable long byte becomes 4, and This indicates that high four bytes are 0 in mask. After this in if(mask >> 16), function find_zero() will search for a byte with zero in bit range 16 to 31.

Similarly, in Second if():

if (mask >> 16)
  mask >>= 16;    
else
  byte += 2; 

It will checks if any bit between bit numbered 16 to 31 is one or not. If all bits are zero then byte will incremented by 2 in else part, in if-part mask will be updated by unsigned shift right 16-bits.
(Note: if mask is updated mask then actually this if(mask >> 16) checking whether any bit between 48 to 63 is one, otherwise bits 16 to 31 in original mask)

Subsequently, last conditional operator works in similar fashion to check any bit between bit numbed 8 to 15 is one or not.

                  long byte = 0;
                  64 bit `mask`
                       |
                       |
                       ▼
            if(any bit from 32 to 62 is one)---+
            |                                  |
            |False: if(0)                      |True: if(!0)
      all bits between 32 to 64              |A byte=Zero NOT found
      are zero                         Some bits are one from bit 32-63
            |                                  |
            |                                  |
        byte = 4                          byte= 0, and
        64 bit `mask`  <--Orignal       `mask` updated as `mask >>= 32;`
           |                                        |
           |                                        |
           ▼                                        ▼
if(Any bit from 16 to 31 is one)       if(Any bit from 48 to 63 is one)
   |Because high order bits are Zero    |Note due to >> all high order
   |It check for bits numbered 16-31    |bits becomes zero.
   |                                    |2nd if checks for bit from 16-31
   |                                    |that were originally bit
   |                                    | numbered 48 to 63
   |                                    |
   ▼                                    ▼
Case False: if(0)                        Case False: if(0)
     All bits Zero from 16-31               All bits Zero from 49-63
     mask will NOT updated and              mask will NOT updated and
     byte = 6                                  byte = 2

Case True: if(!0)                         Case False: if(!0)
    Some bits are one from 16-31         Some bits are one from 49-63
    mask updated                            mask Updated
     byte = 4                                  byte = 0
   |                                        |
   |                                        |
   ▼                                        ▼
more four cases                          more four cases

Total 8 different answer outputs from `0` to `7`

So function find_zero() search first N bytes with value zero from left hand side. Output byte value can be from 0 to 7 and doesn't considers right most byte ("8").
(*note: long is 64 bits long = 8 * 8-bits = 8 bytes.*)

byte NUMBER ("):      
   "1"       "2"        "3"       "4"       "5"       "6"       "7"       "8"
63     56 55     48 47     40 39     32 31     24 23     16 15        7       0
▼       ▼ ▼       ▼ ▼       ▼ ▼       ▼ ▼       ▼ ▼       ▼ ▼         ▼       ▼
0000 1000 0000 0000 0000 0000 0000 0100 0000 0000 0000 0000 0000 0000 0001 0101
▲                                                                             ▲
MSB                                                                         LSM

byte = 0 --> means first byte is not zero
byte = 1 --> high order byte is zero that is bit numbered 56 to 63
byte = 2 --> two high order bytes are zero that is bit numbered 48 to 63
byte = 3 --> three high order byte is zero that is bit numbered 40 to 63
:
:
Similarly, byte = 7 --> Seven bytes from left are 0, (or all 0).

Flow-diagram

Code

Below I have written a small code that calls find_zero() function with different values, will help you to understand the function:

int main(){ 
 printf("\nmask=0x0, all bytes are zero, result =%ld", find_zero(0x0L)); // not prints 8
 printf("\nmask=0xff, left seven bytes are zero, result =%ld", find_zero(0xffL));
 printf("\nmask=0xffff, left six bytes are zero, result =%ld", find_zero(0xffffL));
 printf("\nmask=0xffffff, left five bytes are zero, result =%ld", find_zero(0xffffffL));
 printf("\nmask=0xffffffff, left four bytes are zero, result =%ld", find_zero(0xffffffffL));
 printf("\nmask=0xffffffffff, left three bytes are zero, result =%ld", find_zero(0xffffffffffL));
 printf("\nmask=0xffffffffffff, left two bytes are zero, result =%ld", find_zero(0xffffffffffffL));
 printf("\nmask=0xffffffffffffff, left most one byte is zero, result =%ld", find_zero(0xffffffffffffffL));
 printf("\nmask=0xffffffffffffffff, No byte is zero, result =%ld", find_zero(0xffffffffffffffffL));
 printf("\nmask=0x8000000000000000L, first byte is NOT zero (so no search), result =%ld", find_zero(0x8000000000000000LL));  
    printf("\n");
    return 1;
}

Please observe the output to understand function:

mask=0x0, all bytes are zero, result =7
mask=0xff, left seven bytes are zero, result =7
mask=0xffff, left six bytes are zero, result =6
mask=0xffffff, left five bytes are zero, result =5
mask=0xffffffff, left four bytes are zero, result =4
mask=0xffffffffff, left three bytes are zero, result =3
mask=0xffffffffffff, left two bytes are zero, result =2
mask=0xffffffffffffff, left most one byte is zero, result =1
mask=0xffffffffffffffff, No byte is zero, result =0
mask=0x8000000000000000L, first byte is NOT zero (so no search), result =0

Note: if all bytes are zero it prints 7 not 8.

这篇关于什么是用于word_at_a_time.h中的has_zero和find_zero的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆