Read a file/URL line-by-line in Swift

Question

I am trying to read a file given in an NSURL and load it into an array, with items separated by a newline character (\n).

Here's the way I'm currently doing it:

var possList: NSString? = NSString.stringWithContentsOfURL(filePath.URL) as? NSString
if var list = possList {
    list = list.componentsSeparatedByString("\n") as NSString[]
    return list
}
else {
    //return empty list
}

I'm not very happy with this for a couple of reasons. First, I'm working with files that range from a few kilobytes to hundreds of MB in size. As you can imagine, working with strings this large is slow and unwieldy. Second, this freezes up the UI while it's executing; again, not good.

I've looked into running this code in a separate thread, but I've been having trouble with that, and besides, it still doesn't solve the problem of dealing with huge strings.

What I'd like to do is something along the lines of the following pseudocode:

var aStreamReader = new StreamReader(from_file_or_url)
while aStreamReader.hasNextLine == true {
    currentline = aStreamReader.nextLine()
    list.addItem(currentline)
}

How would I accomplish this in Swift?

A few notes about the files I'm reading from: All files consist of short (<255 chars) strings separated by either \n or \r\n. The length of the files ranges from ~100 lines to over 50 million lines. They may contain European characters, and/or characters with accents.

Recommended answer

(The code is for Swift 2.2/Xcode 7.3 now. Older versions can be found in the edit history if somebody needs them. An updated version for Swift 3 is provided at the end.)

The following Swift code is heavily inspired by the various answers to "How to read data from NSFileHandle line by line?". It reads from the file in chunks, and converts complete lines to strings.

The default line delimiter (\n), string encoding (UTF-8) and chunk size (4096) can be set with optional parameters.

class StreamReader  {

    let encoding : UInt
    let chunkSize : Int

    var fileHandle : NSFileHandle!
    let buffer : NSMutableData!
    let delimData : NSData!
    var atEof : Bool = false

    init?(path: String, delimiter: String = "\n", encoding : UInt = NSUTF8StringEncoding, chunkSize : Int = 4096) {
        self.chunkSize = chunkSize
        self.encoding = encoding

        if let fileHandle = NSFileHandle(forReadingAtPath: path),
            delimData = delimiter.dataUsingEncoding(encoding),
            buffer = NSMutableData(capacity: chunkSize)
        {
            self.fileHandle = fileHandle
            self.delimData = delimData
            self.buffer = buffer
        } else {
            self.fileHandle = nil
            self.delimData = nil
            self.buffer = nil
            return nil
        }
    }

    deinit {
        self.close()
    }

    /// Return next line, or nil on EOF.
    func nextLine() -> String? {
        precondition(fileHandle != nil, "Attempt to read from closed file")

        if atEof {
            return nil
        }

        // Read data chunks from file until a line delimiter is found:
        var range = buffer.rangeOfData(delimData, options: [], range: NSMakeRange(0, buffer.length))
        while range.location == NSNotFound {
            let tmpData = fileHandle.readDataOfLength(chunkSize)
            if tmpData.length == 0 {
                // EOF or read error.
                atEof = true
                if buffer.length > 0 {
                    // Buffer contains last line in file (not terminated by delimiter).
                    let line = NSString(data: buffer, encoding: encoding)

                    buffer.length = 0
                    return line as String?
                }
                // No more lines.
                return nil
            }
            buffer.appendData(tmpData)
            range = buffer.rangeOfData(delimData, options: [], range: NSMakeRange(0, buffer.length))
        }

        // Convert complete line (excluding the delimiter) to a string:
        let line = NSString(data: buffer.subdataWithRange(NSMakeRange(0, range.location)),
            encoding: encoding)
        // Remove line (and the delimiter) from the buffer:
        buffer.replaceBytesInRange(NSMakeRange(0, range.location + range.length), withBytes: nil, length: 0)

        return line as String?
    }

    /// Start reading from the beginning of file.
    func rewind() -> Void {
        fileHandle.seekToFileOffset(0)
        buffer.length = 0
        atEof = false
    }

    /// Close the underlying file. No reading must be done after calling this method.
    func close() -> Void {
        fileHandle?.closeFile()
        fileHandle = nil
    }
}

Usage:

if let aStreamReader = StreamReader(path: "/path/to/file") {
    defer {
        aStreamReader.close()
    }
    while let line = aStreamReader.nextLine() {
        print(line)
    }
}
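
If a file uses Windows-style \r\n line endings and the reader splits on the default \n, each returned line keeps a trailing \r. One possible way to handle both kinds of file with a single delimiter is to strip that character after reading. The sketch below is only an illustration: `strippingTrailingCR` is a hypothetical helper, not part of the StreamReader class, and the sample array stands in for lines that a reader splitting on \n might return.

```swift
/// Hypothetical helper (not part of StreamReader): removes one trailing
/// carriage return, so lines from "\r\n" and "\n" files look the same.
func strippingTrailingCR(_ line: String) -> String {
    // After splitting on "\n", a CRLF-terminated line ends in a lone "\r".
    guard line.last == "\r" else { return line }
    return String(line.dropLast())
}

// Stand-in for what a reader splitting on "\n" could return
// for a file with mixed line endings (assumed sample data).
let rawLines = ["alpha\r", "beta", "gamma\r"]
let cleanedLines = rawLines.map(strippingTrailingCR)
print(cleanedLines)
```

In the usage loop above, the same helper could be applied to each `line` before it is stored or printed.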

You can even use the reader with a for-in loop

for line in aStreamReader {
    print(line)
}

by implementing the SequenceType protocol (compare http://robots.thoughtbot.com/swift-sequences):

extension StreamReader : SequenceType {
    func generate() -> AnyGenerator<String> {
        return AnyGenerator {
            return self.nextLine()
        }
    }
}
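
One consequence of this design is worth spelling out: the iterator only forwards to nextLine(), so it shares the reader's position, and a second for-in loop yields nothing until rewind() is called. The following is a minimal sketch of that behaviour, written with the Swift 3+ names (Sequence and makeIterator instead of SequenceType and generate). `LineSource` is a hypothetical in-memory stand-in for the reader, used only so the example runs without a file.

```swift
/// Hypothetical stand-in for the reader: serves lines from an array
/// instead of a file, with the same nextLine()/rewind() contract.
final class LineSource: Sequence {
    private let lines: [String]
    private var position = 0

    init(lines: [String]) {
        self.lines = lines
    }

    /// Return next line, or nil when no lines are left.
    func nextLine() -> String? {
        guard position < lines.count else { return nil }
        defer { position += 1 }
        return lines[position]
    }

    /// Start again from the first line.
    func rewind() {
        position = 0
    }

    /// The iterator forwards to nextLine(), so it shares `position`
    /// with the object rather than keeping its own.
    func makeIterator() -> AnyIterator<String> {
        return AnyIterator { self.nextLine() }
    }
}

let source = LineSource(lines: ["one", "two", "three"])
let firstPass = Array(source)    // consumes every line
let secondPass = Array(source)   // already at the end, so empty
source.rewind()
let thirdPass = Array(source)    // full content again after rewind()
```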

Update for Swift 3/Xcode 8 beta 6: Also "modernized" to use guard and the new Data value type:

class StreamReader  {

    let encoding : String.Encoding
    let chunkSize : Int
    var fileHandle : FileHandle!
    let delimData : Data
    var buffer : Data
    var atEof : Bool

    init?(path: String, delimiter: String = "\n", encoding: String.Encoding = .utf8,
          chunkSize: Int = 4096) {

        guard let fileHandle = FileHandle(forReadingAtPath: path),
            let delimData = delimiter.data(using: encoding) else {
                return nil
        }
        self.encoding = encoding
        self.chunkSize = chunkSize
        self.fileHandle = fileHandle
        self.delimData = delimData
        self.buffer = Data(capacity: chunkSize)
        self.atEof = false
    }

    deinit {
        self.close()
    }

    /// Return next line, or nil on EOF.
    func nextLine() -> String? {
        precondition(fileHandle != nil, "Attempt to read from closed file")

        // Read data chunks from file until a line delimiter is found:
        while !atEof {
            if let range = buffer.range(of: delimData) {
                // Convert complete line (excluding the delimiter) to a string:
                let line = String(data: buffer.subdata(in: 0..<range.lowerBound), encoding: encoding)
                // Remove line (and the delimiter) from the buffer:
                buffer.removeSubrange(0..<range.upperBound)
                return line
            }
            let tmpData = fileHandle.readData(ofLength: chunkSize)
            if tmpData.count > 0 {
                buffer.append(tmpData)
            } else {
                // EOF or read error.
                atEof = true
                if buffer.count > 0 {
                    // Buffer contains last line in file (not terminated by delimiter).
                    let line = String(data: buffer, encoding: encoding)
                    buffer.count = 0
                    return line
                }
            }
        }
        return nil
    }

    /// Start reading from the beginning of file.
    func rewind() -> Void {
        fileHandle.seek(toFileOffset: 0)
        buffer.count = 0
        atEof = false
    }

    /// Close the underlying file. No reading must be done after calling this method.
    func close() -> Void {
        fileHandle?.closeFile()
        fileHandle = nil
    }
}

extension StreamReader : Sequence {
    func makeIterator() -> AnyIterator<String> {
        return AnyIterator {
            return self.nextLine()
        }
    }
}
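
The buffering step inside nextLine() can also be exercised without a file. The sketch below feeds a few in-memory chunks into a buffer and splits off complete lines as they become available. It makes several assumptions that are not in the code above: `popLine` is a hypothetical helper, the input is assumed to be UTF-8, invalid bytes are replaced rather than reported, and the chunk contents are made up for the illustration.

```swift
import Foundation

/// Hypothetical helper: removes and returns the first complete line in
/// `buffer`, or returns nil (leaving `buffer` unchanged) when the buffer
/// does not contain the delimiter yet. Assumes UTF-8 content.
func popLine(from buffer: inout Data, delimiter: Data) -> String? {
    guard let range = buffer.range(of: delimiter) else { return nil }
    // Bytes before the delimiter form the line.
    let lineBytes = buffer.subdata(in: buffer.startIndex..<range.lowerBound)
    // Drop the line and its delimiter from the buffer.
    buffer.removeSubrange(buffer.startIndex..<range.upperBound)
    // Lossy decoding: invalid UTF-8 becomes U+FFFD instead of failing.
    return String(decoding: lineBytes, as: UTF8.self)
}

// Simulated reads: chunk boundaries fall in the middle of lines.
let delimiter = Data("\n".utf8)
let chunks = [Data("first li".utf8), Data("ne\nsecond\nthi".utf8), Data("rd".utf8)]

var buffer = Data()
var lines: [String] = []
for chunk in chunks {
    buffer.append(chunk)
    while let line = popLine(from: &buffer, delimiter: delimiter) {
        lines.append(line)
    }
}
// Whatever remains at end of input is the last, unterminated line.
if !buffer.isEmpty {
    lines.append(String(decoding: buffer, as: UTF8.self))
}
print(lines)
```

Because decoding happens only once a whole line is in the buffer, a multi-byte UTF-8 character that straddles two chunks is still decoded correctly.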
