什么是用于word_at_a_time.h中的has_zero和find_zero [英] what is has_zero and find_zero in word_at_a_time.h used for
问题描述
$ b $ pre $
static inline long find_zero(unsigned长掩码)
{
long byte = 0;
#ifdef CONFIG_64BIT
if(mask>> 32)
mask>> = 32;
else
byte = 4;
#endif
if(mask>> 16)
mask>> = 16;
else
byte + = 2;
return(掩码>> 8)?字节:字节+ 1;
static inline bool has_zero(unsigned long val,
unsigned long * data,
const struct word_at_a_time * c)
{
unsigned long rhs = val | C-> low_bits;
* data = rhs;
return(val + c-> high_bits)& 〜RHS;
}
它用在散列函数中,在git log中说:
- has_zero():取一个单词,并确定它是否包含零字节。
它获取单词,指向常量池的指针和一个指向
的指针,它可以设置一个中间的数据字段。
但我仍然没有收到
(1)这个函数做什么?和
(2)它们是怎么做的?
假设位从最低有效位(LSB)(编号0)到最高有效位(MSB)编号。
它是什么?
函数 find_zero()
搜索 first N
(<= 7)个字节,其值为零,使用类似于divide and conquer的技术。
它是怎么做的?
假设一个64位系统,其中 CONFIG_64BIT
被定义,然后执行一段代码:
#ifdef CONFIG_64BIT
if(mask> ;> 32)//左边32位中的任何位= 1
掩码>> = 32;
else
byte = 4; //< - 不,第4个字节为零
第一个注释掩码
是无符号
,所以>>
是无符号右边筛选。 (像Java的>>> )
它首先检查掩码
的值是否超过2 32(/或我们可以说如果在无符号长掩码
位编号 32
到 63
之间的任何位。是一)。
掩码>> 32
会产生一个纯值,即掩码
右移零 0
扩展名由 32
位,并使 32
高位包含零。
$ b例如$ b
,如果 maks
位如下:
63 56 55 48 47 40 39 32 31 24 23 16 15 7 0
▼▼▼▼▼▼▼$
0000 1000 0000 0000 0000 0000 0000 0000 0100 0000 0000 0000 0000 0000 0000 0001 0101
▲▲
MSB LSM
在本例中,位数 34
和 59
是1,所以(掩码>> 32) ==
true
(或者说非零!0
)。因此,对于这个例子 if(mask>> 32)
== if(!0)
。
在一般情况下,如果位数 32
至 63
中的任何一位为1,则 mask
会更新为 mask>> = 32; ==
mask = mask>> 32;
(如果true和if语句执行)。这导致高位 32位由于右移而变为低位
)。 p> 32
位(并且位32至63变为<$在上面的例子中, mask
变为: 0
移位OLD位编号----> 63 45 32
63 47 32 31 15 0
▼▼b $ b 0000 0000 0000 0000 0000 0000 0000 0000 0000 1000 0000 0000 0000 0000 0000 0100
▲▲
MSB LSM
| ------------------------------------- |
|全部为零|
注意:从32到63的位是 0
和从 32
复制的 0
到 31
从上面到 63
。
这里需要:
案例1:
如果 32
到 63
中的任何一位是原始 > mask
,然后如果
执行true并且屏蔽更新。 (如我上面解释的)。而变量长字节
仍然是 0
。然后在下一个 if(掩码>> 16)
中,函数 find_zero()
将继续搜索一个字节在位范围内的值为零 48
到 63
(In if(mask>> 16)
,位编号 48
至 63
将检查是否有位为1 )。
情况2: 同样,在第二个 它会检查位号 随后,最后一个条件运算符以类似的方式工作,以检查bit麻木 所以函数 byte = 下面我写了一个小代码,它用不同的值调用 请观察输出以理解函数: 注意:如果所有字节都为零,它会打印 In linux kernel, inlucde/linux/word_at_a_time.h, there are two functions: It's used in a hash function, in git log it says:
But I still don't get (1) What this function do?, and Suppose bits are numbered from Least Significant Bit (LSB) (numbered 0) to Most Significant Bit (MSB). Function Suppose a 64 bit-system where First note It first checks in if for for example, if In this example, bit number In above example Note: bits from 32 to 63 are Upto here: Case-1: Case-2: Similarly, in Second It will checks if any bit between bit numbered Subsequently, last conditional operator works in similar fashion to check any bit between bit numbed So function byte =
如果 32
到 63
在原始掩码
中为零,然后 if(0)
),则c> condition变为false(即)和
mask
不更新,变量 long byte
变为 4
,并且这表示在掩码
中高四个字节是 0
。在 if(掩码>> 16)
之后,函数 find_zero()
将搜索一个零位范围 16
至 31
。
if()
中:
if(mask>> 16)
mask>> = 16;
else
byte + = 2;
16到31
是一个或不是。如果所有位都是零,那么 byte
将在其他部分中增加 2
,if-part掩码将被更新无符号右移16位。
(注意:if mask 更新
mask code> then实际上这是
if(掩码>> 16)
检查 48到63
是一个,否则在原始掩码中位 16
到 31
)
8
到 15
之间的任何位是一个或不是。
长字节= 0;
64位`mask`
|
|
▼
if(从32到62的任何一位是1)--- +
| |
| False:if(0)| True:if(!0)
32到64之间的所有位| A字节=零未找到
为零有些位是位32- 63
| |
| |
byte = 4 byte = 0,
64 bit`mask`< - Orignal`mask`更新为`mask>> = 32;`
| |
| |
如果(从16到31的任何位都是1)if(从48到63的任何位都是1)
|由于高位是零|注意由于>> ;所有高位
|它检查编号为16-31的位|位变为零。
| |第二位检查16-31
|位|原来是
| |编号为48至63
| |
▼▼
Case False:if(0)Case False:if(0)
所有位从16-31开始零位从49-63
掩码不会更新并且掩码不会更新并且
字节= 6字节= 2
情况为真:if(!0)情况为假:if(!0)
有些位是从16 -31有些位是一个从49-63
掩码更新掩码更新
字节= 4字节= 0
| |
| |
▼▼
更多四个个案更多四个个案
共有8个不同的从0到7的不同答案输出
find_zero()
search first N
从左侧数值为零的字节。输出字节值可以从 0
到 7
,并且不考虑最右边的字节( 8
)。
(*注意:long是64位长= 8 * 8位= 8字节。*)
byte NUMBER():
12345678
63 56 55 48 47 40 39 32 31 24 23 16 15 7 0
▼▼▼▼▼▼▼
0000 1000 0000 0000 0000 0000 0000 0000 01 0000 0000 0000 0000 0000 0000 0000 0001 0101
▲▲
MSB LSM
0
- >表示第一个字节不为零
byte = 1
- >高位字节为0,位编号 56
至 63
byte = 2
- >两个高位字节为0,位编号为 48
至 63
byte = 3
- >三个高位字节为零,位编号 40
到
63
:
:
同样,byte = 7
- >左边的七个字节是0,(或全部为0)。
流程图
代码
find_zero()
函数,可以帮助您理解函数:
int main(){
printf(\ nmask = 0x0,所有字节为零,结果=%ld,find_zero (0x0L)); //不打印8
printf(\ nmask = 0xff,剩下的7个字节为零,结果=%ld,find_zero(0xffL));
printf(\ nmask = 0xffff,left six bytes are zero,result =%ld,find_zero(0xffffL));
printf(\ nmask = 0xffffff,剩下的5个字节为零,结果=%ld,find_zero(0xffffffL));
printf(\ nmask = 0xffffffff,left four bytes are zero,result =%ld,find_zero(0xffffffffL));
printf(\ nmask = 0xffffffffff,左边三个字节为零,结果=%ld,find_zero(0xffffffffffL));
printf(\ nmask = 0xffffffffffff,left two bytes are zero,result =%ld,find_zero(0xffffffffffffL));
printf(\ nmask = 0xffffffffffffff,最左边一个字节为零,结果=%ld,find_zero(0xffffffffffffffL));
printf(\ nmask = 0xffffffffffffffff,无字节为零,结果=%ld,find_zero(0xffffffffffffffffL));
printf(\ nmask = 0x8000000000000000L,第一个字节不是零(所以不搜索),结果=%ld,find_zero(0x8000000000000000LL));
printf(\\\
);
返回1;
}
mask = 0x0,所有字节为零,结果= 7
掩码= 0xff,左边7个字节为零,结果= 7
掩码= 0xffff,左边六个字节为零,结果= 6
掩码= 0xffffff,剩下的五个字节为零,结果= 5
掩码= 0xffffffff,剩下的四个字节为零,结果= 4
掩码= 0xffffffffff,左边三个字节为零,结果= 3
掩码= 0xffffffffffff,左边两个字节为零,结果= 2
掩码= 0xffffffffffffff,最左边一个字节为零,结果= 1
掩码= 0xffffffffffffffff,没有字节为零,结果= 0
掩码= 0x8000000000000000L,第一个字节不为零(所以不搜索),结果= 0
7
不是8。static inline long find_zero(unsigned long mask)
{
long byte = 0;
#ifdef CONFIG_64BIT
if (mask >> 32)
mask >>= 32;
else
byte = 4;
#endif
if (mask >> 16)
mask >>= 16;
else
byte += 2;
return (mask >> 8) ? byte : byte + 1;
}
static inline bool has_zero(unsigned long val,
unsigned long *data,
const struct word_at_a_time *c)
{
unsigned long rhs = val | c->low_bits;
*data = rhs;
return (val + c->high_bits) & ~rhs;
}
- has_zero(): take a word, and determine if it has a zero byte in it.
It gets the word, the pointer to the constant pool, and a pointer to
an intermediate "data" field it can set.
(2) How they do it?What it does?
find_zero()
search first N
(<=7) bytes with value zero from left hand side using a technique similar to divide and conquer. How it does?
CONFIG_64BIT
is defined, then following piece of code executes: #ifdef CONFIG_64BIT
if (mask >> 32) //Any bit=1 in left 32 bits
mask >>= 32;
else
byte = 4; //<--No, fist 4 bytes are zero
mask
is unsigned
, so >>
is unsigned right sift. (like Java's >>>)mask
value is more then 232 (or we can say if in unsigned long mask
any bit between bit numbered 32
to 63
is one).mask >> 32
will produces a pure value that is its mask
right-shifted with zero 0
extension by 32
bits and causes the 32
high order bits to contain zero.maks
bits are as follows: 63 56 55 48 47 40 39 32 31 24 23 16 15 7 0
▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼
0000 1000 0000 0000 0000 0000 0000 0100 0000 0000 0000 0000 0000 0000 0001 0101
▲ ▲
MSB LSM
34
and 59
is one, so (mask >> 32)
== true
(or say non-zero !0
). Hence for this example if (mask >> 32)
== if(!0)
.
In genral, if any bit from bit number 32
to 63
is one then mask
will be updated to mask >>= 32;
== mask = mask >> 32;
(as if true and if statement executes). This causes high order 32
bits becomes low order 32
bits due to right shift (and bits 32 to 63 becomes 0
).mask
becomes: shifted OLD bit number ----> 63 45 32
63 47 32 31 15 0
▼ ▼ ▼ ▼ ▼ ▼
0000 0000 0000 0000 0000 0000 0000 0000 0000 1000 0000 0000 0000 0000 0000 0100
▲ ▲
MSB LSM
|-------------------------------------|
| All are zero |
0
and bits from 0
to 31
copied from bits 32
to 63
from above.
If any bit from 32
to 63
is one in original mask
then if
executes true and mask updates. (as I explained above). And variable long byte
remains 0
. Then in next if(mask >> 16)
, function find_zero()
will continue to search for a byte with value zero within bit range 48
to 63
(In if(mask >> 16)
, bit numbered 48
to 63
will be checked for if any bit is 1).
If all bits from 32
to 63
are zero in original mask
, then if
condition becomes false (that is if(mask >> 32)
== if(0)
) and mask
doesn't updates, and variable long byte
becomes 4
, and This indicates that high four bytes are 0
in mask
. After this in if(mask >> 16)
, function find_zero()
will search for a byte with zero in bit range 16
to 31
. if()
: if (mask >> 16)
mask >>= 16;
else
byte += 2;
16 to 31
is one or not. If all bits are zero then byte
will incremented by 2
in else part, in if-part mask will be updated by unsigned shift right 16-bits.
(Note: if mask
is updated mask
then actually this if(mask >> 16)
checking whether any bit between 48 to 63
is one, otherwise bits 16
to 31
in original mask)8
to 15
is one or not. long byte = 0;
64 bit `mask`
|
|
▼
if(any bit from 32 to 62 is one)---+
| |
|False: if(0) |True: if(!0)
all bits between 32 to 64 |A byte=Zero NOT found
are zero Some bits are one from bit 32-63
| |
| |
byte = 4 byte= 0, and
64 bit `mask` <--Orignal `mask` updated as `mask >>= 32;`
| |
| |
▼ ▼
if(Any bit from 16 to 31 is one) if(Any bit from 48 to 63 is one)
|Because high order bits are Zero |Note due to >> all high order
|It check for bits numbered 16-31 |bits becomes zero.
| |2nd if checks for bit from 16-31
| |that were originally bit
| | numbered 48 to 63
| |
▼ ▼
Case False: if(0) Case False: if(0)
All bits Zero from 16-31 All bits Zero from 49-63
mask will NOT updated and mask will NOT updated and
byte = 6 byte = 2
Case True: if(!0) Case False: if(!0)
Some bits are one from 16-31 Some bits are one from 49-63
mask updated mask Updated
byte = 4 byte = 0
| |
| |
▼ ▼
more four cases more four cases
Total 8 different answer outputs from `0` to `7`
find_zero()
search first N
bytes with value zero from left hand side. Output byte value can be from 0
to 7
and doesn't considers right most byte ("8"
).
(*note: long is 64 bits long = 8 * 8-bits = 8 bytes.*)byte NUMBER ("):
"1" "2" "3" "4" "5" "6" "7" "8"
63 56 55 48 47 40 39 32 31 24 23 16 15 7 0
▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼
0000 1000 0000 0000 0000 0000 0000 0100 0000 0000 0000 0000 0000 0000 0001 0101
▲ ▲
MSB LSM
0
--> means first byte is not zero
byte = 1
--> high order byte is zero that is bit numbered 56
to 63
byte = 2
--> two high order bytes are zero that is bit numbered 48
to 63
byte = 3
--> three high order byte is zero that is bit numbered 40
to 63
:
:
Similarly, byte = 7
--> Seven bytes from left are 0, (or all 0). Flow-diagram
Code
Below I have written a small code that calls find_zero()
function with different values, will help you to understand the function:
int main(){
printf("\nmask=0x0, all bytes are zero, result =%ld", find_zero(0x0L)); // not prints 8
printf("\nmask=0xff, left seven bytes are zero, result =%ld", find_zero(0xffL));
printf("\nmask=0xffff, left six bytes are zero, result =%ld", find_zero(0xffffL));
printf("\nmask=0xffffff, left five bytes are zero, result =%ld", find_zero(0xffffffL));
printf("\nmask=0xffffffff, left four bytes are zero, result =%ld", find_zero(0xffffffffL));
printf("\nmask=0xffffffffff, left three bytes are zero, result =%ld", find_zero(0xffffffffffL));
printf("\nmask=0xffffffffffff, left two bytes are zero, result =%ld", find_zero(0xffffffffffffL));
printf("\nmask=0xffffffffffffff, left most one byte is zero, result =%ld", find_zero(0xffffffffffffffL));
printf("\nmask=0xffffffffffffffff, No byte is zero, result =%ld", find_zero(0xffffffffffffffffL));
printf("\nmask=0x8000000000000000L, first byte is NOT zero (so no search), result =%ld", find_zero(0x8000000000000000LL));
printf("\n");
return 1;
}
Please observe the output to understand function:
mask=0x0, all bytes are zero, result =7
mask=0xff, left seven bytes are zero, result =7
mask=0xffff, left six bytes are zero, result =6
mask=0xffffff, left five bytes are zero, result =5
mask=0xffffffff, left four bytes are zero, result =4
mask=0xffffffffff, left three bytes are zero, result =3
mask=0xffffffffffff, left two bytes are zero, result =2
mask=0xffffffffffffff, left most one byte is zero, result =1
mask=0xffffffffffffffff, No byte is zero, result =0
mask=0x8000000000000000L, first byte is NOT zero (so no search), result =0
Note: if all bytes are zero it prints 7
not 8.
这篇关于什么是用于word_at_a_time.h中的has_zero和find_zero的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!