Is there any way to implement the Send trait for ZipFile?

Problem description

I want to read a .zip file in a different thread by using the zip crate.

extern crate zip;

use zip::ZipArchive;
use zip::read::ZipFile;
use std::fs::File;
use std::io::BufReader;
use std::thread;

fn compute_hashes(mut file: ZipFile) {
    let reader_thread= thread::spawn(move || {
        let mut reader = BufReader::new(file);
        /* ... */
    });
}

fn main() {
    let mut file = File::open(r"facebook-JakubOnderka.zip").unwrap();
    let mut zip = ZipArchive::new(file).unwrap();

    for i in 0..zip.len() {
        let mut inside = zip.by_index(i).unwrap();

        if !inside.name().ends_with("/") { // skip directories
            println!("Filename: {}", inside.name());
            compute_hashes(inside);
        }
    }
}

But the compiler shows me this error:

error[E0277]: the trait bound `std::io::Read: std::marker::Send` is not satisfied
  --> src/main.rs:10:24
   |
10 |     let reader_thread= thread::spawn(move || {
   |                        ^^^^^^^^^^^^^ `std::io::Read` cannot be sent between threads safely
   |
   = help: the trait `std::marker::Send` is not implemented for `std::io::Read`
   = note: required because of the requirements on the impl of `std::marker::Send` for `&mut std::io::Read`
   = note: required because it appears within the type `std::io::Take<&mut std::io::Read>`
   = note: required because it appears within the type `zip::crc32::Crc32Reader<std::io::Take<&mut std::io::Read>>`
   = note: required because it appears within the type `zip::read::ZipFileReader<'_>`
   = note: required because it appears within the type `zip::read::ZipFile<'_>`
   = note: required because it appears within the type `[closure@src/main.rs:10:38: 13:6 file:zip::read::ZipFile<'_>]`
   = note: required by `std::thread::spawn`

But the same works for the type std::fs::File. Is it necessary to fix the zip crate or is there any other method?

Recommended answer

This is a limitation of the zip crate's API, and you can't really change anything from the outside. A ZipArchive is created by calling new and passing a reader, that is, something that implements Read and Seek. These are the only requirements on the reader (in particular, the reader doesn't need to be Clone), so the whole ZipArchive can only own one reader.

But the ZipArchive is able to produce ZipFiles which implement Read themselves. How does that work if the whole ZipArchive only has one reader? Through sharing! The single reader is shared between the archive and all of its files, but this sharing is not thread safe. Each ZipFile stores a mutable reference to the archive's reader (the &mut std::io::Read in the error message above), and that trait object carries no Send bound, so the compiler cannot allow the ZipFile to be moved to another thread.
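The role of the Send bound on a reader trait object can be illustrated with the standard library alone. This is a minimal sketch, not code from the zip crate: assert_send and sum_bytes_in_thread are hypothetical helper names, and a Cursor over a Vec<u8> stands in for a real reader.

```rust
use std::io::{Cursor, Read};
use std::thread;

// Compiles only when `T` is `Send`; does nothing at run time.
fn assert_send<T: Send>(_value: &T) {}

// Moves an owned reader into a new thread and sums its bytes there.
// The `Send + 'static` bounds are what `thread::spawn` demands of the closure.
fn sum_bytes_in_thread<R: Read + Send + 'static>(mut reader: R) -> u64 {
    thread::spawn(move || {
        let mut buf = Vec::new();
        reader.read_to_end(&mut buf).expect("read failed");
        buf.iter().map(|&b| u64::from(b)).sum::<u64>()
    })
    .join()
    .expect("worker thread panicked")
}

fn main() {
    let mut source = Cursor::new(vec![1u8, 2, 3]);
    {
        // A trait object that promises `Send` passes the check.
        let sendable: &mut (dyn Read + Send) = &mut source;
        assert_send(&sendable);

        // Without the `Send` bound the same check is rejected by the
        // compiler (E0277), as in the question:
        // let plain: &mut dyn Read = &mut source;
        // assert_send(&plain);
    }
    // An owned reader that is `Send` can simply be moved to another thread.
    println!("sum of bytes: {}", sum_bytes_in_thread(source));
}
```

A ZipFile cannot take the second path because it borrows the archive's reader instead of owning one.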

This is a known issue of the crate and is being discussed on the GitHub issue tracker.

So what can you do now? Not a whole lot, but a few possibilities (as mentioned by the library author) might be OK for your use case:

  • You could decompress the whole file into memory first, then send the raw data to another thread to do calculations on it. Something like:

use std::io::Read; // needed for `read_to_end`

let mut data = Vec::new(); // must be `mut` so it can be filled
BufReader::new(file).read_to_end(&mut data)?;
let reader_thread = thread::spawn(move || {
    // Do stuff with `data`
});

But if you just want to compute a cheap hash function on all files, loading the contents into memory is probably slower than computing the hash on the fly and might be infeasible if your files are big.

  • Create one ZipArchive for each thread. This might be very slow if you have many small files in your archive...
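The first option (buffer the contents, then hand the bytes to another thread) can be sketched end to end with only the standard library. The assumptions here are illustrative: a Cursor stands in for a ZipFile, since both are used only through Read, and read_all, toy_checksum and checksum_on_worker are hypothetical names. The checksum is a placeholder, not a real hash.

```rust
use std::io::{self, Cursor, Read};
use std::thread;

// Reads everything from a borrowed reader into an owned buffer.
// The reader itself does not need to be `Send`.
fn read_all(reader: &mut dyn Read) -> io::Result<Vec<u8>> {
    let mut data = Vec::new();
    reader.read_to_end(&mut data)?;
    Ok(data)
}

// Placeholder "hash": a wrapping byte sum, used only for illustration.
fn toy_checksum(data: &[u8]) -> u32 {
    data.iter().fold(0u32, |acc, &b| acc.wrapping_add(u32::from(b)))
}

// Buffers the reader on the calling thread, then hashes the bytes on a
// worker thread. Only the owned `Vec<u8>` crosses the thread boundary.
fn checksum_on_worker(reader: &mut dyn Read) -> io::Result<u32> {
    let data = read_all(reader)?;
    let handle = thread::spawn(move || toy_checksum(&data));
    Ok(handle.join().expect("worker thread panicked"))
}

fn main() -> io::Result<()> {
    // `Cursor` stands in for a `ZipFile` here (assumption for the sketch).
    let mut entry = Cursor::new(b"hello".to_vec());
    println!("checksum: {}", checksum_on_worker(&mut entry)?);
    Ok(())
}
```

Because the reader is only borrowed for the duration of read_all, the same archive can go on to produce the next entry while the worker is still hashing the previous one.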

A tiny hint: starting a thread costs time. You often don't want to start a thread for each unit of work, but rather maintain a fixed number of threads in a thread pool, manage work in a queue and assign work to idle worker threads. The threadpool crate might serve your needs.
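To make the thread pool idea concrete without pulling in a dependency, here is a rough sketch of a fixed set of workers fed from a queue, built on std::sync::mpsc. It is not the threadpool crate's API: checksum_all and toy_checksum are hypothetical names, and the jobs are assumed to be entry contents that were already read into memory.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

// Placeholder work: a wrapping byte sum standing in for a real hash.
fn toy_checksum(data: &[u8]) -> u32 {
    data.iter().fold(0u32, |acc, &b| acc.wrapping_add(u32::from(b)))
}

// Hashes every buffer on a fixed number of worker threads.
// Results are returned in the same order as `jobs`.
fn checksum_all(jobs: Vec<Vec<u8>>, workers: usize) -> Vec<u32> {
    let job_count = jobs.len();
    let (job_tx, job_rx) = mpsc::channel::<(usize, Vec<u8>)>();
    let (result_tx, result_rx) = mpsc::channel::<(usize, u32)>();
    // std's Receiver is single-consumer, so the workers share it behind a mutex.
    let job_rx = Arc::new(Mutex::new(job_rx));

    let handles: Vec<_> = (0..workers.max(1))
        .map(|_| {
            let job_rx = Arc::clone(&job_rx);
            let result_tx = result_tx.clone();
            thread::spawn(move || loop {
                // The lock is released before the job is processed.
                let next = job_rx.lock().unwrap().recv();
                match next {
                    Ok((index, data)) => result_tx.send((index, toy_checksum(&data))).unwrap(),
                    Err(_) => break, // queue closed and drained
                }
            })
        })
        .collect();
    drop(result_tx); // only the workers' clones keep the result channel open

    for (index, data) in jobs.into_iter().enumerate() {
        job_tx.send((index, data)).unwrap();
    }
    drop(job_tx); // closing the queue lets idle workers exit

    let mut results = vec![0u32; job_count];
    for (index, sum) in result_rx {
        results[index] = sum;
    }
    for handle in handles {
        handle.join().unwrap();
    }
    results
}

fn main() {
    let jobs = vec![b"a".to_vec(), b"bc".to_vec(), b"hello".to_vec()];
    println!("{:?}", checksum_all(jobs, 2));
}
```

The number of threads stays fixed no matter how many entries the archive has, which is the point of the hint; a dedicated crate would likely be the more robust choice in real code.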
