How to get a '&str' from a NUL-terminated byte slice if the NUL terminator isn't at the end of the slice?
Question
While CStr is typically used for FFI, I am reading from a &[u8] that is NUL-terminated and guaranteed to be valid UTF-8, so no checks are needed.
However, the NUL terminator isn't necessarily at the end of the slice. What's a good way to get this as a &str?
It was suggested to use CStr::from_bytes_with_nul, but this fails on an interior \0 character, that is, when the \0 isn't the last byte. It returns an error, which panics if unwrapped.
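To make the failure mode concrete, here is a minimal sketch of how CStr::from_bytes_with_nul treats different inputs. The helper name accepts_with_nul is hypothetical and exists only for this illustration. The function returns a Result, so a panic only appears once that Result is unwrapped.

```rust
use std::ffi::CStr;

/// Hypothetical helper: reports whether `CStr::from_bytes_with_nul`
/// accepts the given bytes.
fn accepts_with_nul(bytes: &[u8]) -> bool {
    CStr::from_bytes_with_nul(bytes).is_ok()
}

fn main() {
    // The only NUL is the last byte: accepted.
    assert!(accepts_with_nul(b"hello\0"));
    // A NUL followed by more data: rejected as an interior NUL.
    assert!(!accepts_with_nul(b"hello\0world"));
    // Padding NULs after the first one also count as an interior NUL.
    assert!(!accepts_with_nul(b"hi\0\0\0"));
    // No NUL at all: rejected as not NUL-terminated.
    assert!(!accepts_with_nul(b"hello"));
}
```

The third case is the one that matters here: a fixed-size buffer padded with zero bytes is rejected, even though the string before the first NUL is well formed.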
Recommended answer
I would use iterator adaptors to find the index of the first zero byte:
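pub unsafe fn str_from_u8_nul_utf8_unchecked(utf8_src: &[u8]) -> &str {
    // Index of the first NUL byte; defaults to the slice length if no `\0` is present.
    let nul_range_end = utf8_src
        .iter()
        .position(|&c| c == b'\0')
        .unwrap_or(utf8_src.len());
    // SAFETY: the caller guarantees the bytes before the first NUL are valid UTF-8.
    unsafe { ::std::str::from_utf8_unchecked(&utf8_src[0..nul_range_end]) }
}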
This has the major advantage of forcing you to handle every case explicitly, such as the array containing no zero byte.
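As a sketch of what handling every case means, the index search can be pulled out on its own and checked against the edge cases. The helper name nul_range_end is hypothetical, borrowed from the local variable in the answer.

```rust
/// Hypothetical helper: index of the first NUL byte, or the slice length
/// if the slice contains no NUL byte.
fn nul_range_end(bytes: &[u8]) -> usize {
    bytes
        .iter()
        .position(|&c| c == b'\0')
        .unwrap_or(bytes.len())
}

fn main() {
    // NUL in the middle: the string ends just before it.
    assert_eq!(nul_range_end(b"abc\0def"), 3);
    // No NUL: the whole slice is the string.
    assert_eq!(nul_range_end(b"abc"), 3);
    // NUL first: the string is empty.
    assert_eq!(nul_range_end(b"\0abc"), 0);
    // Empty slice: `position` returns `None`, so the length (0) is used.
    assert_eq!(nul_range_end(b""), 0);
}
```

Because position returns an Option, the missing-NUL case cannot be ignored by accident; the code has to state what should happen, here by falling back to the slice length.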
If you want the version that checks for well-formed UTF-8:
pub fn str_from_u8_nul_utf8(utf8_src: &[u8]) -> Result<&str, std::str::Utf8Error> {
    let nul_range_end = utf8_src
        .iter()
        .position(|&c| c == b'\0')
        .unwrap_or(utf8_src.len()); // default to length if no `\0` present
    ::std::str::from_utf8(&utf8_src[0..nul_range_end])
}
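A short usage sketch of the checked version follows. The function is repeated so the example is self-contained, and the input buffers are made up for illustration.

```rust
use std::str::Utf8Error;

pub fn str_from_u8_nul_utf8(utf8_src: &[u8]) -> Result<&str, Utf8Error> {
    let nul_range_end = utf8_src
        .iter()
        .position(|&c| c == b'\0')
        .unwrap_or(utf8_src.len()); // default to length if no `\0` present
    std::str::from_utf8(&utf8_src[..nul_range_end])
}

fn main() {
    // A zero-padded buffer: everything from the first NUL onward is ignored.
    assert_eq!(str_from_u8_nul_utf8(b"hello\0\0\0"), Ok("hello"));
    // No NUL at all: the whole slice is validated and returned.
    assert_eq!(str_from_u8_nul_utf8(b"hello"), Ok("hello"));
    // Invalid UTF-8 before the NUL is reported as an error.
    assert!(str_from_u8_nul_utf8(&[0xff, 0xfe, 0]).is_err());
    // Invalid bytes after the NUL are never inspected.
    assert_eq!(str_from_u8_nul_utf8(&[b'o', b'k', 0, 0xff]), Ok("ok"));
}
```

Only the bytes before the first NUL are validated, so garbage left over after the terminator in a reused buffer does not cause an error.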