Go scanner - correctness for whitespace?
Problem description
The Go scanner package, in text/scanner/scanner.go, uses a trick to find whitespace:
const GoWhitespace = 1<<'\t' | 1<<'\n' | 1<<'\r' | 1<<' '
And then:
// skip white space
for s.Whitespace&(1<<uint(ch)) != 0 {
	ch = s.next()
}
Since character values can shift left by more than 31 bits, can't this produce false matches? I mean, if some character has the same value as tab modulo 32, will it be recognized as whitespace?
Solution
Full answer:
The spec explicitly says that for operations on unsigned integers the high bits are discarded on overflow, so programs can rely on "wrap around".
The reason it works is:
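- Scanner.Whitespace is actually a uint64, so the value of GoWhitespace fits in it completely.
- The operation s.Whitespace&(1<<uint(ch)) is performed on unsigned integers at run time, so the intermediate value of the shift can be arbitrarily large and is reduced modulo 2^64. Go does not reduce the shift count itself modulo the word size, so a character equal to tab modulo 32 or modulo 64 does not land on the tab bit. If, say, the character is 'a' (97), we have 1 << 97, which overflows, and modulo the size of a 64-bit integer it is 0.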
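The reasoning above can be checked with a small program. This is only a sketch: goWhitespace copies the constant quoted in the question, and isWhitespace is a hypothetical helper (not part of text/scanner) that repeats the expression from the scanner's loop. The test characters ')' (41) and 'I' (73) are chosen because they equal tab (9) modulo 32 and modulo 64 respectively.

```go
package main

import "fmt"

// goWhitespace mirrors the GoWhitespace constant quoted in the question,
// typed as uint64 like Scanner.Whitespace.
const goWhitespace uint64 = 1<<'\t' | 1<<'\n' | 1<<'\r' | 1<<' '

// isWhitespace is a hypothetical helper that repeats the scanner's test.
// The constant 1 takes the type uint64 from the other operand of &,
// so a shift count of 64 or more produces 0 rather than wrapping.
func isWhitespace(mask uint64, ch rune) bool {
	return mask&(1<<uint(ch)) != 0
}

func main() {
	// ')' is 9+32 and 'I' is 9+64: candidates for a false match
	// if the shift count were reduced modulo 32 or 64.
	for _, ch := range []rune{'\t', ' ', ')', 'I', 'a'} {
		fmt.Printf("%q (%d): %v\n", ch, ch, isWhitespace(goWhitespace, ch))
	}
}
```

Only the tab and the space are reported as whitespace. ')' selects bit 41, which is not set in the mask, and for 'I' and 'a' the shifted value is already 0.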