Idiomatic method of parsing swift(3) data streams
Problem description
I'm trying to do some simple BSON parsing of Swift3 Data objects. I feel like I'm fighting the system.
Let's start with some input and a scheme:
let input = Data(bytes: [2, 0x20, 0x21, 3, 0x30, 0x31, 0x32, 1, 0x10, 4, 0x40, 0x41, 0x42, 0x43])
This is just a simple data stream, the frivolous scheme being that a leading byte indicates how many bytes follow making up the next chunk. So in the above, the leading 2 indicates that 0x20, 0x21 are the first chunk, followed by a 3 byte chunk containing the bytes 0x30, 0x31, 0x32, etc.
Stream
My first thought is to do it with a stream (er, Generator, Iterator, whatever). So I end up with something like:
var iter = input.makeIterator()
func parse(_ stream: inout IndexingIterator<Data>) -> Data {
    var result = Data()
    if let count = stream.next() {
        for _ in 0..<count {
            result.append(Data(bytes: [stream.next()!]))
        }
    }
    return result
}
parse(&iter)
parse(&iter)
parse(&iter)
parse(&iter)
This leads to multiple questions/observations:
1) Why would anyone ever declare an iterator with let? The whole point of this thing is to keep track of an evolving position over a collection. I really struggle with why the Swift authors have chosen to send iterators down the "all hail the value semantics" rathole. It means I have to put inout on all my parse functions.
2) I feel like I'm over-specifying the argument type with IndexingIterator. Maybe I just need to get used to verbose generics?
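On the second point, one possible way to avoid naming the concrete iterator type is to make the function generic over any iterator that yields bytes. The following is only a sketch: parseChunk is a hypothetical name, and returning nil on truncated input (rather than force-unwrapping) is an assumption, not something the original code does.

```swift
import Foundation

// Hypothetical generic variant of parse(_:): accepts any iterator of UInt8,
// so IndexingIterator<Data> never has to be spelled out.
func parseChunk<I: IteratorProtocol>(_ stream: inout I) -> Data? where I.Element == UInt8 {
    // The leading byte is the chunk length; nil means the stream is exhausted.
    guard let count = stream.next() else { return nil }
    var result = Data()
    for _ in 0..<count {
        // Assumption: a truncated chunk is reported as nil instead of crashing.
        guard let byte = stream.next() else { return nil }
        result.append(byte)
    }
    return result
}

let input = Data([2, 0x20, 0x21, 3, 0x30, 0x31, 0x32, 1, 0x10, 4, 0x40, 0x41, 0x42, 0x43])
var iter = input.makeIterator()
var chunks: [Data] = []
while let chunk = parseChunk(&iter) {
    chunks.append(chunk)
}
```

The inout is still required, since the iterator remains a value type; the generic signature only removes the dependence on one specific iterator type.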
Python struct-ish
Frustrated with that approach, I thought I might emulate Python's struct.unpack() style, where a tuple is returned containing the parsed data as well as the unconsumed data, since Data values are supposedly magical and efficient as long as I don't mutate them. That turned out like:
func parse2(_ data: Data) -> (Data, Data) {
    let count = Int(data[0])
    return (data.subdata(in: 1..<count+1), data.subdata(in: count+1..<data.count))
}
var remaining = input
var chunk = Data()
(chunk, remaining) = parse2(remaining)
chunk
(chunk, remaining) = parse2(remaining)
chunk
(chunk, remaining) = parse2(remaining)
chunk
(chunk, remaining) = parse2(remaining)
chunk
I ran into two issues with this.
1) What I really wanted to return was data[1...count], data.subdata(in: count+1..<data.count). But this returns a MutableRandomAccessSlice, which seems to be a totally different kind of type? So I ended up using the more involved subdata.
2) One can subscript a Data with a closed range, but the subdata method will only take a half-open range. What's with that?
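For comparison, here is a sketch of the same tuple-returning idea written with slices. It assumes Swift 4 or later, where slicing a Data yields a Data rather than a MutableRandomAccessSlice; parseSlice is a hypothetical name. One thing to watch for under that assumption is that a slice keeps the indices of its parent, so the code works from startIndex and endIndex instead of 0 and count.

```swift
import Foundation

// Hypothetical slice-based variant of parse2.
// Assumes Swift 4 or later, where Data's SubSequence is Data itself.
func parseSlice(_ data: Data) -> (chunk: Data, rest: Data)? {
    // The leading byte is the chunk length; nil means nothing is left.
    guard let first = data.first else { return nil }
    // A slice keeps its parent's indices, so never assume the first index is 0.
    let start = data.startIndex + 1
    // Returns nil when the declared length runs past the end of the data.
    guard let end = data.index(start, offsetBy: Int(first), limitedBy: data.endIndex) else {
        return nil
    }
    return (data[start..<end], data[end..<data.endIndex])
}

let input = Data([2, 0x20, 0x21, 3, 0x30, 0x31, 0x32, 1, 0x10, 4, 0x40, 0x41, 0x42, 0x43])
var remaining = input
var chunks: [[UInt8]] = []
while let parsed = parseSlice(remaining) {
    chunks.append(Array(parsed.chunk))
    remaining = parsed.rest
}
```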
Open rebellion, old habits kick in
Now annoyed that this old Smalltalker can't seem to find happiness here, I just roll my own:
class DataStream {
    let data: Data
    var index = 0
    var atEnd: Bool {
        return index >= self.data.count
    }
    init(data: Data) {
        self.data = data
    }
    func next() -> UInt8 {
        let byte = self.data[self.index]
        self.index += 1
        return byte
    }
    func next(_ count: Int) -> Data {
        let subdata = self.data.subdata(in: self.index..<self.index + count)
        self.index += count
        return subdata
    }
}
func parse3(_ stream: DataStream) -> Data {
    let count = Int(stream.next())
    return stream.next(count)
}
let stream = DataStream(data: input)
parse3(stream)
parse3(stream)
parse3(stream)
parse3(stream)
This solution I'm happy with from an end use POV. I can flesh out DataStream to do all kinds of stuff. But... I'm now off the beaten path and feel like I'm not "getting it" (the Swiftish light bulb).
TL;DR version
After this playing around, I find myself curious what the most idiomatic way is to stream through Data structs, extracting data from them based on what is encountered in them.
Recommended answer
In the end, as mentioned in the comments, I went with a DataStream class including suggestions by MartinR. Here's the implementation I'm using today.
class DataStream {
    let data: Data
    var index = 0
    var atEnd: Bool {
        return index >= self.data.count
    }
    init(data: Data) {
        self.data = data
    }
    // Returns the next byte, or nil once the stream is exhausted.
    func next() -> UInt8? {
        guard !self.atEnd else { return nil }
        let byte = self.data[self.index]
        self.index += 1
        return byte
    }
    // Returns the next count bytes, or nil if fewer than count remain.
    func next(_ count: Int) -> Data? {
        guard self.index + count <= self.data.count else { return nil }
        let subdata = self.data.subdata(in: self.index..<self.index + count)
        self.index += count
        return subdata
    }
    // Returns the bytes up to (not including) marker and consumes the marker.
    // index(where:) is the Swift 3 spelling; Swift 4.2 and later call it firstIndex(where:).
    func upTo(_ marker: UInt8) -> Data? {
        if let end = (self.index..<self.data.count).index(where: { self.data[$0] == marker }) {
            let upTo = self.next(end - self.index)
            self.skip() // consume the marker
            return upTo
        }
        else {
            return nil
        }
    }
    func skip(_ count: Int = 1) {
        self.index += count
    }
    // Moves past the next occurrence of marker, or to the end if there is none.
    func skipThrough(_ marker: UInt8) {
        if let end = (self.index..<self.data.count).index(where: { self.data[$0] == marker }) {
            self.index = end + 1
        }
        else {
            self.index = self.data.count
        }
    }
}
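To show how a class like this might be driven against the length-prefixed input from the question, here is a minimal usage sketch. ByteStream is a hypothetical, trimmed stand-in that keeps only the two next() methods so the example is self-contained. Starting from data.startIndex instead of 0 is an assumption added here to tolerate a sliced Data.

```swift
import Foundation

// Hypothetical trimmed stand-in for the DataStream class: only the two next() methods.
final class ByteStream {
    private let data: Data
    private var index: Int

    init(data: Data) {
        self.data = data
        // Assumption: start at startIndex, since a sliced Data may not begin at 0.
        self.index = data.startIndex
    }

    var atEnd: Bool { return index >= data.endIndex }

    // Returns the next byte, or nil once the stream is exhausted.
    func next() -> UInt8? {
        guard !atEnd else { return nil }
        defer { index += 1 }
        return data[index]
    }

    // Returns the next count bytes, or nil if fewer than count remain.
    func next(_ count: Int) -> Data? {
        guard count >= 0, count <= data.endIndex - index else { return nil }
        defer { index += count }
        return data.subdata(in: index..<index + count)
    }
}

// Reads one length-prefixed chunk; nil at end of input or on a truncated chunk.
func parseChunk(_ stream: ByteStream) -> Data? {
    guard let count = stream.next() else { return nil }
    return stream.next(Int(count))
}

let input = Data([2, 0x20, 0x21, 3, 0x30, 0x31, 0x32, 1, 0x10, 4, 0x40, 0x41, 0x42, 0x43])
let stream = ByteStream(data: input)
var chunks: [Data] = []
while let chunk = parseChunk(stream) {
    chunks.append(chunk)
}
```

Because both methods return optionals, the parsing loop ends on its own at the end of the input instead of trapping on an out-of-range subscript.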