How can I asynchronously load data from large files in Qt?


Question


I'm using Qt 5.2.1 to implement a program that reads in data from a file (could be a few bytes to a few GB) and visualises that data in a way that's dependent on every byte. My example here is a hex viewer.

One object does the reading, and emits a signal dataRead() when it's read a new block of data. The signal carries a pointer to a QByteArray like so:

filereader.cpp

void FileReader::startReading()
{

    /* Object state code here... */

        {
            QFile inFile(fileName);

            if (!inFile.open(QIODevice::ReadOnly))
            {
                changeState(STARTED, State(ERROR, QString()));
                return;
            }

            while(!inFile.atEnd())
            {
                QByteArray *qa = new QByteArray(inFile.read(DATA_SIZE));
                qDebug() << "emitting dataRead()";
                emit dataRead(qa);
            }
        }

    /* Emit EOF signal */

}

The viewer has its loadData slot connected to this signal, and this is the function that displays the data:

hexviewer.cpp

void HexViewer::loadData(QByteArray *data)
{
    QString hexString = data->toHex();

    for (int i = 0; i < hexString.length(); i+=2)
    {
        _ui->hexTextView->insertPlainText(hexString.at(i));
        _ui->hexTextView->insertPlainText(hexString.at(i+1));
        _ui->hexTextView->insertPlainText(" ");
    }

    delete data;
}

The first problem is that if this is just run as-is, the GUI thread will become completely unresponsive. All of the dataRead() signals will be emitted before the GUI is ever redrawn.

(The full code is available at https://github.com/detly/qt-test and can be run; with a file bigger than about 1 kB you will see this behaviour.)

Going by the response to my forum post Non-blocking local file IO in Qt5 and the answer to another Stack Overflow question, How to do async file io in qt?, the answer is: use threads. But neither of these answers goes into any detail about how to shuffle the data itself around, or how to avoid common errors and pitfalls.

If the data were small (of the order of a hundred bytes) I'd just emit it with the signal. But if the file is gigabytes in size, or (edit) if the file is on a network-based filesystem, e.g. NFS or a Samba share, I don't want the UI to lock up just because reading the file blocks.

The second problem is that using new in the emitter and delete in the receiver seems a bit naive: I'm effectively using the entire heap as a cross-thread queue.

Question 1: Does Qt have a better/idiomatic way to move data across threads while limiting memory consumption? Does it have a thread-safe queue or other structures that can simplify this whole thing?

Question 2: Do I have to implement the threading etc. myself? I'm not a huge fan of reinventing wheels, especially regarding memory management and threading. Are there higher-level constructs that can already do this, like there are for network transport?

Solution

First of all, you don't have any multithreading in your app at all. Your FileReader class is a subclass of QThread, but it does not mean that all FileReader methods will be executed in another thread. In fact, all your operations are performed in the main (GUI) thread.

FileReader should be a QObject and not a QThread subclass. Then you create a basic QThread object and move your worker (reader) to it using QObject::moveToThread. You can read about this technique here.

Make sure you have registered FileReader::State type using qRegisterMetaType. This is necessary for Qt signal-slot connections to work across different threads.

An example:

HexViewer::HexViewer(QWidget *parent) :
    QMainWindow(parent),
    _ui(new Ui::HexViewer),
    _fileReader(new FileReader())
{
    qRegisterMetaType<FileReader::State>("FileReader::State");

    QThread *readerThread = new QThread(this);
    readerThread->setObjectName("ReaderThread");
    connect(readerThread, SIGNAL(finished()),
            _fileReader, SLOT(deleteLater()));
    _fileReader->moveToThread(readerThread);
    readerThread->start();

    _ui->setupUi(this);

    ...
}

void HexViewer::on_quitButton_clicked()
{
    _fileReader->thread()->quit();
    _fileReader->thread()->wait();

    qApp->quit();
}

Also it is not necessary to allocate data on the heap here:

while(!inFile.atEnd())
{
    QByteArray *qa = new QByteArray(inFile.read(DATA_SIZE));
    qDebug() << "emitting dataRead()";
    emit dataRead(qa);
}

QByteArray uses implicit sharing. It means that its contents are not copied again and again when you pass a QByteArray object across functions in a read-only mode.

Change the code above to this and forget about manual memory management:

while(!inFile.atEnd())
{
    QByteArray qa = inFile.read(DATA_SIZE);
    qDebug() << "emitting dataRead()";
    emit dataRead(qa);
}
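
For readers who want to try the by-value pattern outside Qt, here is a minimal sketch in standard C++. It is only an analogy: readInChunks is a hypothetical helper, not a Qt API, and std::string stands in for QByteArray (a std::string is really copied or moved, whereas QByteArray shares its buffer implicitly). What it shares with the loop above is that each block is handed over as a value, so nobody has to call delete.

```cpp
#include <cstddef>
#include <functional>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not part of Qt): reads `in` in blocks of at most
// `chunkSize` bytes and passes each block to `onChunk` by value.
// Returns the number of blocks delivered.
std::size_t readInChunks(std::istream &in, std::size_t chunkSize,
                         const std::function<void(std::string)> &onChunk)
{
    std::size_t chunks = 0;
    if (chunkSize == 0)
        return chunks;  // nothing sensible to read

    std::string buffer(chunkSize, '\0');
    // read() fails on the final short block, so gcount() is checked as well.
    while (in.read(&buffer[0], static_cast<std::streamsize>(chunkSize))
           || in.gcount() > 0) {
        // gcount() is the number of bytes actually read into the buffer.
        onChunk(buffer.substr(0, static_cast<std::size_t>(in.gcount())));
        ++chunks;
    }
    return chunks;
}
```

Calling it on a std::istringstream holding ten bytes with a chunk size of 4 delivers three blocks of 4, 4 and 2 bytes.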

But anyway, the main problem is not with multithreading. The problem is that QTextEdit::insertPlainText operation is not cheap, especially when you have a huge amount of data. FileReader reads file data pretty quickly and then floods your widget with new portions of data to display.
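
If the reader really does outrun the viewer, one generic way to cap memory use is a bounded queue that makes the producer wait once a fixed number of chunks is pending. The sketch below is an assumption-laden illustration in standard C++, not a Qt class: BoundedQueue and its member names are hypothetical, and the capacity is assumed to be at least 1.

```cpp
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>

// Hypothetical bounded queue: at most `capacity` items are held at a time.
template <typename T>
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {}

    // Producer side (reader thread): blocks while the queue is full.
    void push(T value)
    {
        std::unique_lock<std::mutex> lock(mutex_);
        notFull_.wait(lock, [this] { return items_.size() < capacity_; });
        items_.push_back(std::move(value));
    }

    // Consumer side (GUI thread): never blocks; returns false if empty.
    bool tryPop(T &out)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty())
            return false;
        out = std::move(items_.front());
        items_.pop_front();
        notFull_.notify_one();  // wake a producer waiting for space
        return true;
    }

    std::size_t size() const
    {
        std::lock_guard<std::mutex> lock(mutex_);
        return items_.size();
    }

private:
    const std::size_t capacity_;
    mutable std::mutex mutex_;
    std::condition_variable notFull_;
    std::deque<T> items_;
};
```

The consumer side is deliberately non-blocking so that a GUI thread could drain the queue from a timer or a queued slot without stalling its event loop. This bounds memory use but does not make the widget itself any faster.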

It must be noted that your implementation of HexViewer::loadData is very inefficient. You insert the text character by character, which makes QTextEdit constantly redraw its contents and freezes the GUI.

You should prepare the resulting hex string first (note that the data parameter is not a pointer anymore):

void HexViewer::loadData(QByteArray data)
{
    QString tmp = data.toHex();

    QString hexString;
    hexString.reserve(tmp.size() * 1.5);

    const int hexLen = 2;

    for (int i = 0; i < tmp.size(); i += hexLen)
    {
        hexString.append(tmp.mid(i, hexLen) + " ");
    }

    _ui->hexTextView->insertPlainText(hexString);
}
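
The same build-first, insert-once idea can be tried without Qt. The following is a minimal sketch in standard C++, where toSpacedHex is a hypothetical helper and std::vector<unsigned char> stands in for QByteArray. It reserves three characters per byte (two hex digits plus a space), which matches the tmp.size() * 1.5 estimate above, and uses lowercase digits as QByteArray::toHex does.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: formats every byte as two lowercase hex digits
// followed by a space, building the whole string in a single pass.
std::string toSpacedHex(const std::vector<unsigned char> &data)
{
    static const char digits[] = "0123456789abcdef";

    std::string out;
    out.reserve(data.size() * 3);  // 2 hex digits + 1 space per byte

    for (unsigned char byte : data) {
        out.push_back(digits[byte >> 4]);    // high nibble
        out.push_back(digits[byte & 0x0F]);  // low nibble
        out.push_back(' ');
    }
    return out;
}
```

The result would then be handed to the widget in one call rather than in three calls per byte.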

Anyway, the bottleneck of your application is not file reading but updating the QTextEdit. Loading data in chunks and then appending each chunk to the widget using QTextEdit::insertPlainText will not speed anything up. For files smaller than 1 MB it is faster to read the whole file at once and then set the resulting text on the widget in a single step.

I suppose you can't easily display huge texts larger than several megabytes using the default Qt widgets. This task requires some non-trivial approach that in general has nothing to do with multithreading or asynchronous data loading. It's all about creating a specialised widget which won't try to display its huge contents all at once.
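
The core of such a widget is usually a small piece of arithmetic: work out which bytes are currently on screen and load only those. Here is a minimal sketch of that calculation in standard C++. ByteRange and visibleRange are hypothetical names, and the row-based layout (a fixed number of bytes per row, scrolled by whole rows) is an assumption rather than anything taken from the code above.

```cpp
#include <algorithm>
#include <cstdint>

// Half-open byte range [begin, end) that the viewer needs to load.
struct ByteRange {
    std::uint64_t begin;
    std::uint64_t end;
};

// Hypothetical helper: maps the scroll position (index of the first visible
// row) to file offsets, clamped to the file size. Overflow of the
// multiplications is ignored for brevity.
ByteRange visibleRange(std::uint64_t fileSize, std::uint64_t firstRow,
                       std::uint64_t visibleRows, std::uint64_t bytesPerRow)
{
    const std::uint64_t begin = std::min(fileSize, firstRow * bytesPerRow);
    const std::uint64_t end =
        std::min(fileSize, begin + visibleRows * bytesPerRow);
    return ByteRange{begin, end};
}
```

For a 1000-byte file shown 16 bytes per row with 3 rows visible, scrolling to row 2 yields the range [32, 80). A custom widget, for example one derived from QAbstractScrollArea, could read and format just that range whenever the scroll position changes.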
