Swift Calculate MD5 Checksum for Large Files
Question
I'm working on creating the MD5 Checksum for large video files. I'm currently using the code:
extension NSData {
    func MD5() -> NSString {
        let digestLength = Int(CC_MD5_DIGEST_LENGTH)
        let md5Buffer = UnsafeMutablePointer<CUnsignedChar>.allocate(capacity: digestLength)
        // Free the digest buffer on exit; without this it leaks on every call.
        defer { md5Buffer.deallocate() }
        // Hashes the whole in-memory buffer in a single call.
        CC_MD5(bytes, CC_LONG(length), md5Buffer)
        let output = NSMutableString(capacity: Int(CC_MD5_DIGEST_LENGTH * 2))
        for i in 0..<digestLength {
            output.appendFormat("%02x", md5Buffer[i])
        }
        return NSString(format: output)
    }
}
But that creates a memory buffer, and for large video files would not be ideal. Is there a way in Swift to calculate the MD5 Checksum reading a file stream, so the memory footprint will be minimal?
Recommended answer
You can compute the MD5 checksum in chunks, as demonstrated, for example, in "Is there a MD5 library that doesn't require the whole input at the same time?".
Here is a possible implementation using Swift:
import Foundation
import CommonCrypto

func md5File(url: URL) -> Data? {
    let bufferSize = 1024 * 1024

    do {
        // Open file for reading:
        let file = try FileHandle(forReadingFrom: url)
        defer {
            file.closeFile()
        }

        // Create and initialize MD5 context:
        var context = CC_MD5_CTX()
        CC_MD5_Init(&context)

        // Read up to `bufferSize` bytes, until EOF is reached, and update MD5 context:
        while autoreleasepool(invoking: {
            let data = file.readData(ofLength: bufferSize)
            if data.count > 0 {
                // In Swift 5 the closure receives an UnsafeRawBufferPointer,
                // so pass its base address to CC_MD5_Update.
                data.withUnsafeBytes {
                    _ = CC_MD5_Update(&context, $0.baseAddress, numericCast(data.count))
                }
                return true // Continue
            } else {
                return false // End of file
            }
        }) { }

        // Compute the MD5 digest:
        var digest = [UInt8](repeating: 0, count: Int(CC_MD5_DIGEST_LENGTH))
        _ = CC_MD5_Final(&digest, &context)

        return Data(digest)
    } catch {
        print("Cannot open file:", error.localizedDescription)
        return nil
    }
}
The autorelease pool is needed to release the memory returned by file.readData(); without it, the entire (potentially huge) file would be loaded into memory. Thanks to Abhi Beckert for noticing that and providing an implementation.
If you need the digest from md5File as a hex-encoded string, then change the return type to String? and replace
return digest
by
let hexDigest = digest.map { String(format: "%02hhx", $0) }.joined()
return hexDigest
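If you would rather keep the Data return type and convert only where a string is needed, the same mapping can be wrapped in a small helper. This is a minimal sketch; the name `hexString` is hypothetical and not part of the original answer.

```swift
import Foundation

// Hypothetical helper: formats each byte as two lowercase hex digits.
func hexString(_ digest: Data) -> String {
    return digest.map { String(format: "%02hhx", $0) }.joined()
}

let sample = Data([0x00, 0x0f, 0xa5, 0xff])
print(hexString(sample)) // prints "000fa5ff"
```

A call site would then look something like `md5File(url: someURL).map(hexString)`, where `someURL` stands for whatever file URL you are hashing.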