Storing the data representations of multiple, differently typed objects in a single Data instance
Problem description
To my knowledge, Data is a struct that abstracts a byte buffer. It references a physical area in memory, in other words: a contiguous run of bytes. Now I want to efficiently store multiple values in memory (as raw data), where the values are not all of the same type.
My definition of efficient here ≔ store all those values without any unused buffer / gap bytes.
let a: UInt8 = 39
let b: Int32 = -20001
let string: String = "How awesome is this data?!"
Now I want to store the data of all those values sequentially in memory, without any type information.
let data = [a.asData, b.asData, string.asData].concatenated()
Imagine that the .asData property retrieves the byte representation of each instance as a [UInt8] array and then wraps it in a Data instance. The concatenated() method then just concatenates these 3 Data instances into a single Data instance as follows:
extension Collection where Element == Data {
    func concatenated() -> Data {
        reduce(into: Data()) { (result, nextDataChunk) in
            result.append(nextDataChunk)
        }
    }
}
Reading the data back from memory into the respective types
Let's assume this all worked great and I now have this single Data instance from which I want to restore the 3 original values (with their original types). This is what I do:
var cursor = 0

let a: UInt8 = data.withUnsafeBytes { pointer in
    pointer.load(fromByteOffset: cursor, as: UInt8.self)
}
cursor += MemoryLayout<UInt8>.size // +1

let b: Int32 = data.withUnsafeBytes { pointer in
    pointer.load(fromByteOffset: cursor, as: Int32.self)
}
cursor += MemoryLayout<Int32>.size // +4

let string: String = data.withUnsafeBytes { pointer in
    pointer.load(fromByteOffset: cursor, as: String.self)
}
cursor += MemoryLayout<String>.size // +16
The Problem
The problem is that this crashes at runtime with the following error:
Fatal error: load from misaligned raw pointer
I know exactly why: Int32 has an alignment of 4 (because it's 4 bytes long). In other words, when reading data with a raw pointer, the first byte of the Int32 must be at an index that is a multiple of 4. But as the first value is only a UInt8, the data bytes for the Int32 start at index 1, which is not a multiple of 4. Thus, I get the error.
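The alignment reasoning above can be checked with MemoryLayout. This is only a small sketch; the constant names are illustrative, and it assumes the buffer's base address is itself suitably aligned, so that only the byte offset decides whether the load is aligned.

```swift
// MemoryLayout reports the alignment each type expects.
let byteAlignment = MemoryLayout<UInt8>.alignment    // 1
let int32Alignment = MemoryLayout<Int32>.alignment   // 4

// In the packed buffer, the Int32 bytes start right after the single UInt8 byte.
let int32Offset = MemoryLayout<UInt8>.size           // 1

// An aligned load needs the offset to be a multiple of the type's alignment.
// (Assumption: the base address of the buffer is already 4-byte aligned.)
let isAligned = int32Offset % int32Alignment == 0    // false
```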
Can I somehow use the raw Data that represents instances of different types to recreate such instances without alignment errors? How?
And if this is not possible, is there a way to automatically align the Data chunks correctly when concatenating them in the first place?
Recommended answer
The misaligned data problem can be avoided by using Data's subdata(in:) method: it returns a new copy of the requested range, so each load starts at offset 0 of its own buffer. Besides that, you can create some helpers to make your life easier, as follows:
This converts any numeric type to Data:
import Foundation

extension Numeric {
    var data: Data {
        // Copy the value so its bytes can be passed by address
        var bytes = self
        return .init(bytes: &bytes, count: MemoryLayout<Self>.size)
    }
}
This converts any type that conforms to StringProtocol (String/Substring) to Data:
extension StringProtocol {
    var data: Data { .init(utf8) }
}
This converts any valid UTF-8 encoded sequence of bytes (UInt8) to a String:
extension DataProtocol {
    var string: String? { String(bytes: self, encoding: .utf8) }
}
These are generic methods to convert the bytes to an object or to a collection (array) of objects:
extension ContiguousBytes {
    // Only meaningful for trivial (plain-old-data) types such as integers
    func object<T>() -> T { withUnsafeBytes { $0.load(as: T.self) } }
    func objects<T>() -> [T] { withUnsafeBytes { .init($0.bindMemory(to: T.self)) } }
}
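And a simplified generic version to concatenate an array of data:
// `Element: DataProtocol` (a generic constraint) is required here;
// `Element == DataProtocol` would not make joined() available.
extension Collection where Element: DataProtocol {
    var data: Data { .init(joined()) }
}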
Usage:
let a: UInt8 = 39
let b: Int32 = -20001
let string: String = "How awesome is this data?!"
let data = [a.data, b.data, string.data].data
// just set the cursor (index) at the start position
var cursor = data.startIndex
// get the subdata from that position onwards
let loadedA: UInt8 = data.subdata(in: cursor..<data.endIndex).object() // 39
// advance your cursor for the next position
cursor = cursor.advanced(by: MemoryLayout<UInt8>.size)
// get your next object
let loadedB: Int32 = data.subdata(in: cursor..<data.endIndex).object() // -20001
// advance your position to the start of the string data
cursor = cursor.advanced(by: MemoryLayout<Int32>.size)
// load the subdata as string
let loadedString = data.subdata(in: cursor..<data.endIndex).string // "How awesome is this data?!"
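As an alternative to subdata, the bytes can be copied directly into an aligned local variable, which sidesteps the aligned-load requirement without creating an intermediate Data value. The following is a minimal sketch, not part of the original answer; loadInteger is a hypothetical helper name and is limited to fixed-width integers. On Swift 5.7 and later, the standard library's loadUnaligned(fromByteOffset:as:) on raw pointers serves a similar purpose.

```swift
import Foundation

// Hypothetical helper: copies MemoryLayout<T>.size bytes, starting at `offset`,
// into a properly aligned local variable instead of loading through the pointer.
func loadInteger<T: FixedWidthInteger>(_ type: T.Type, from data: Data, atOffset offset: Int) -> T {
    let start = data.startIndex + offset
    let end = start + MemoryLayout<T>.size
    precondition(offset >= 0 && end <= data.endIndex, "not enough bytes for \(T.self)")
    var value: T = 0
    withUnsafeMutableBytes(of: &value) { destination in
        // Byte-wise copy; the source offset does not need to be aligned.
        destination.copyBytes(from: data[start..<end])
    }
    return value
}

// One UInt8 followed directly by an Int32, with no padding in between.
var packed = Data([39])
withUnsafeBytes(of: Int32(-20001)) { packed.append(contentsOf: $0) }

let first = loadInteger(UInt8.self, from: packed, atOffset: 0)
let second = loadInteger(Int32.self, from: packed, atOffset: 1)
```

Because the value is written and read on the same machine, the native byte order is used on both sides; exchanging the bytes between platforms would need an explicit byte order.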
edit/update: Of course, loading the string only works because it is located at the end of your collection of bytes. Otherwise you would need to store its size first, using 8 bytes (the size of an Int on 64-bit platforms):
let a: UInt8 = 39
let b: Int32 = -20001
let string: String = "How awesome is this data?!"
let c: Int = .max
// Store the UTF-8 byte count (not the character count) ahead of the string bytes
let data = [a.data, b.data, string.utf8.count.data, string.data, c.data].data
var cursor = data.startIndex
let loadedA: UInt8 = data.subdata(in: cursor..<data.endIndex).object() // 39
print(loadedA)
cursor = cursor.advanced(by: MemoryLayout<UInt8>.size)
let loadedB: Int32 = data.subdata(in: cursor..<data.endIndex).object() // -20001
print(loadedB)
cursor = cursor.advanced(by: MemoryLayout<Int32>.size)
let stringCount: Int = data.subdata(in: cursor..<data.endIndex).object()
print(stringCount)
cursor = cursor.advanced(by: MemoryLayout<Int>.size)
let stringEnd = cursor.advanced(by: stringCount)
if let loadedString = data.subdata(in: cursor..<stringEnd).string { // "How awesome is this data?!"
    print(loadedString)
    cursor = stringEnd
    let loadedC: Int = data.subdata(in: cursor..<data.endIndex).object() // 9223372036854775807
    print(loadedC)
}
This will print:
39
-20001
26
How awesome is this data?!
9223372036854775807
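The cursor bookkeeping above can also be wrapped in a small reader type. This is a hedged sketch rather than the answer's own code: ByteReader, readInteger, readString and appendBytes are hypothetical names, integers are copied byte-wise instead of loaded through a pointer, and strings are assumed to be stored as an Int byte count followed by their UTF-8 bytes.

```swift
import Foundation

// Hypothetical reader that tracks the cursor while decoding packed values.
struct ByteReader {
    let data: Data
    private(set) var cursor: Data.Index

    init(_ data: Data) {
        self.data = data
        self.cursor = data.startIndex
    }

    // Reads one fixed-width integer; returns nil if too few bytes remain.
    mutating func readInteger<T: FixedWidthInteger>(_ type: T.Type) -> T? {
        let size = MemoryLayout<T>.size
        guard data.endIndex - cursor >= size else { return nil }
        let bytes = data[cursor..<(cursor + size)]
        var value: T = 0
        withUnsafeMutableBytes(of: &value) { $0.copyBytes(from: bytes) }
        cursor += size
        return value
    }

    // Reads a string stored as an Int byte count followed by UTF-8 bytes.
    mutating func readString() -> String? {
        guard let count = readInteger(Int.self), count >= 0,
              count <= data.endIndex - cursor else { return nil }
        let end = cursor + count
        guard let text = String(data: data[cursor..<end], encoding: .utf8) else { return nil }
        cursor = end
        return text
    }
}

// Hypothetical writer-side helper: appends the raw bytes of an integer.
func appendBytes<T: FixedWidthInteger>(of value: T, to data: inout Data) {
    withUnsafeBytes(of: value) { data.append(contentsOf: $0) }
}

let message = "How awesome is this data?!"
var payload = Data()
appendBytes(of: UInt8(39), to: &payload)
appendBytes(of: Int32(-20001), to: &payload)
appendBytes(of: message.utf8.count, to: &payload)  // length prefix in bytes
payload.append(contentsOf: message.utf8)
appendBytes(of: Int.max, to: &payload)

var reader = ByteReader(payload)
let decodedA = reader.readInteger(UInt8.self)
let decodedB = reader.readInteger(Int32.self)
let decodedText = reader.readString()
let decodedC = reader.readInteger(Int.self)
```

Each read returns nil instead of trapping when the buffer is too short, which makes truncated input easier to handle than a precondition failure.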