Verifying that an STL file is ASCII or binary

Problem description

After reading the specs on the STL file format, I want to write a few tests to ensure that a file is, in fact, a valid binary or ASCII file.

An ASCII-based STL file can be determined by finding the text "solid" at byte 0, followed by a space (hex value \x20), and then an optional text string, followed by a newline.

A binary STL file has a reserved 80-byte header, followed by a 4-byte unsigned integer (NumberOfTriangles), and then 50 bytes of data for each of the NumberOfTriangles facets specified.

Each triangle facet is 50 bytes in length: 12 single-precision (4-byte) floats followed by a 2-byte unsigned short integer.
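
To make the 50-byte layout concrete, here is a minimal sketch that decodes one facet record from raw bytes. The `StlFacet` struct and the helper names are hypothetical (they are not part of the STL format or of Qt), and the sketch assumes little-endian data and a 32-bit IEEE-754 `float` on the host.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Size of one binary STL facet record on disk: 12 floats + one 16-bit integer.
constexpr std::size_t kFacetBytes = 12 * 4 + 2;
static_assert(kFacetBytes == 50, "binary STL facets are 50 bytes each");

// Hypothetical in-memory form of a facet (illustration only).
struct StlFacet {
    std::array<float, 3> normal;
    std::array<std::array<float, 3>, 3> vertices;
    std::uint16_t attributeByteCount;
};

// Decode one little-endian 32-bit float from 4 raw bytes.
inline float readFloatLE(const unsigned char *p)
{
    static_assert(sizeof(float) == 4, "this sketch assumes a 32-bit float");
    const std::uint32_t bits = std::uint32_t(p[0]) | (std::uint32_t(p[1]) << 8) |
                               (std::uint32_t(p[2]) << 16) | (std::uint32_t(p[3]) << 24);
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

// Decode the 50-byte facet record starting at `p`.
inline StlFacet decodeFacet(const unsigned char *p)
{
    StlFacet f{};
    for (int i = 0; i < 3; ++i)
        f.normal[i] = readFloatLE(p + 4 * i);                  // bytes 0-11
    for (int v = 0; v < 3; ++v)
        for (int i = 0; i < 3; ++i)
            f.vertices[v][i] = readFloatLE(p + 12 + 12 * v + 4 * i); // bytes 12-47
    f.attributeByteCount = std::uint16_t(p[48] | (p[49] << 8));       // bytes 48-49
    return f;
}
```

Reading the fields byte by byte avoids relying on struct padding, which would otherwise make `sizeof(StlFacet)` differ from 50.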

If a binary file is exactly 84 + NumberOfTriangles*50 bytes long, it can typically be considered a valid binary file.
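
As a rough sketch of that size rule in plain C++ (the function names are made up for illustration), the triangle count can be read byte by byte so the result does not depend on host endianness or alignment, and the expected size can be computed in 64 bits so it cannot overflow for very large files:

```cpp
#include <cstdint>

// Read a little-endian 32-bit unsigned integer from 4 raw bytes,
// independent of the host byte order and of pointer alignment.
inline std::uint32_t readUint32LE(const unsigned char *p)
{
    return std::uint32_t(p[0]) | (std::uint32_t(p[1]) << 8) |
           (std::uint32_t(p[2]) << 16) | (std::uint32_t(p[3]) << 24);
}

// Size in bytes that a binary STL file with `numTriangles` facets must have:
// 80-byte header + 4-byte count + 50 bytes per facet, computed in 64 bits.
inline std::uint64_t expectedBinaryStlSize(std::uint32_t numTriangles)
{
    return 84u + std::uint64_t(numTriangles) * 50u;
}

// `first84` must point at the first 84 bytes of the file when fileSize >= 84.
inline bool sizeMatchesBinaryStl(const unsigned char *first84, std::uint64_t fileSize)
{
    if (fileSize < 84)
        return false; // too short to hold the header and the triangle count
    return fileSize == expectedBinaryStlSize(readUint32LE(first84 + 80));
}
```

For example, a file whose count field holds 2 passes the check only if it is exactly 184 bytes long.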


Unfortunately, binary files can contain the text "solid" starting at byte 0 in the contents of the 80-byte header. Therefore, a test for only that keyword cannot conclusively determine whether a file is ASCII or binary.
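
The ambiguity is easy to reproduce. The sketch below (helper names are hypothetical) builds a valid, empty binary STL in memory whose header happens to begin with "solid ", and shows that a prefix-only test accepts it as if it were ASCII:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// True when the buffer starts with the six characters "solid ".
inline bool startsWithSolid(const unsigned char *data, std::size_t size)
{
    static const char kPrefix[] = "solid ";
    const std::size_t n = sizeof(kPrefix) - 1; // exclude the terminating NUL
    return size >= n && std::memcmp(data, kPrefix, n) == 0;
}

// Build an 84-byte binary STL with zero triangles and the given header text
// (truncated to 80 bytes). Everything not covered by the text stays zero.
inline std::vector<unsigned char> makeEmptyBinaryStl(const std::string &headerText)
{
    std::vector<unsigned char> bytes(84, 0);
    const std::size_t n = std::min<std::size_t>(headerText.size(), 80);
    std::memcpy(bytes.data(), headerText.data(), n);
    return bytes;
}
```

Calling `startsWithSolid` on the result of `makeEmptyBinaryStl("solid written by some exporter")` returns true even though the buffer is exactly 84 + 0*50 bytes of binary data, which is why the prefix has to be combined with other checks.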

This is what I have so far:

STL_STATUS getStlFileFormat(const QString &path)
{
    // Each facet contains:
    //  - Normal: 3 floats (4 bytes each, 12 bytes total)
    //  - Vertices: 3 x 3 floats (4 bytes each, 36 bytes total)
    //  - Attribute byte count: 1 unsigned short (2 bytes)
    // Total: 50 bytes per facet
    // Use float rather than float_t: float_t may be wider than 4 bytes on some platforms.
    const size_t facetSize = 3*sizeof(float) + 3*3*sizeof(float) + sizeof(uint16_t);

    QFile file(path);
    if (!file.open(QIODevice::ReadOnly))
    {
        qDebug("\n\tUnable to open \"%s\"", qPrintable(path));
        return STL_INVALID;
    }

    QFileInfo fileInfo(path);
    size_t fileSize = fileInfo.size();

    if (fileSize < 84)
    {
        // 80-byte header + 4-byte "number of triangles" marker
        qDebug("\n\tThe STL file is not long enough (%u bytes).", uint(fileSize));
        return STL_INVALID;
    }

    // Look for text "solid" in first 5 bytes, indicating the possibility that this is an ASCII STL format.
    QByteArray fiveBytes = file.read(5);

    // Header is from bytes 0-79; numTriangleBytes starts at byte offset 80.
    if (!file.seek(80))
    {
        qDebug("\n\tCannot seek to the 80th byte (after the header)");
        return STL_INVALID;
    }

    // Read the number of triangles, uint32_t (4 bytes), little-endian
    QByteArray nTrianglesBytes = file.read(4);
    file.close();

    uint32_t nTriangles = *((uint32_t*)nTrianglesBytes.data());

    // Verify that file size equals the sum of header + nTriangles value + all triangles
    size_t targetSize = 84 + nTriangles * facetSize;
    if (fileSize == targetSize)
    {
        return STL_BINARY;
    }
    else if (fiveBytes.contains("solid"))
    {
        return STL_ASCII;
    }
    else
    {
        return STL_INVALID;
    }
}

So far, this has worked for me, but I'm worried that a plain ASCII file's 80th byte could contain some ASCII characters that, when translated to a uint32_t, could actually equal the length of the file (very unlikely, but not impossible).
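
To put a rough number on "very unlikely": assuming an ASCII STL contains only bytes with value 0x09 (tab) or higher, the smallest triangle count that four text bytes at offset 80 can encode is 0x09090909. The sketch below (hypothetical helper names) computes the file size such a count would imply.

```cpp
#include <cstdint>

// Value of four identical bytes `b` read as a 32-bit unsigned integer.
constexpr std::uint32_t repeatByte(std::uint8_t b)
{
    return 0x01010101u * b;
}

// File size implied by a binary STL triangle count: 84 + 50 * n, in 64 bits.
constexpr std::uint64_t sizeImpliedByCount(std::uint32_t n)
{
    return 84u + std::uint64_t(n) * 50u;
}

// Four tab characters (0x09) at offset 80 imply a file of about 7.6 GB.
static_assert(sizeImpliedByCount(repeatByte(0x09)) == 7579354134ULL, "four tabs");
// Four spaces (0x20) at offset 80 imply a file of about 26.9 GB.
static_assert(sizeImpliedByCount(repeatByte(0x20)) == 26948814484ULL, "four spaces");
```

Under that assumption, an ASCII file could only pass the binary size test by accident if it were at least about 7.6 GB long and matched the implied size to the byte.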

Are there additional steps that would prove useful in validating whether I can be "absolutely sure" that a file is either ASCII or binary?

UPDATE:

Following the advice of @Powerswitch and @RemyLebeau, I'm doing further tests for keywords. This is what I've got now:

STL_STATUS getStlFileFormat(const QString &path)
{
    // Each facet contains:
    //  - Normal: 3 floats (4 bytes each, 12 bytes total)
    //  - Vertices: 3 x 3 floats (4 bytes each, 36 bytes total)
    //  - Attribute byte count: 1 unsigned short (2 bytes)
    // Total: 50 bytes per facet
    // Use float rather than float_t: float_t may be wider than 4 bytes on some platforms.
    const size_t facetSize = 3*sizeof(float) + 3*3*sizeof(float) + sizeof(uint16_t);

    QFile file(path);
    bool canFileBeOpened = file.open(QIODevice::ReadOnly);
    if (!canFileBeOpened)
    {
        qDebug("\n\tUnable to open \"%s\"", qPrintable(path));
        return STL_INVALID;
    }

    QFileInfo fileInfo(path);
    size_t fileSize = fileInfo.size();

    // The minimum size of an empty ASCII file is 15 bytes.
    if (fileSize < 15)
    {
        // "solid " and "endsolid " markers for an ASCII file
        qDebug("\n\tThe STL file is not long enough (%u bytes).", uint(fileSize));
        file.close();
        return STL_INVALID;
    }

    // Binary files should never start with "solid ", but just in case, check for ASCII, and if not valid
    // then check for binary...

    // Look for text "solid " in first 6 bytes, indicating the possibility that this is an ASCII STL format.
    QByteArray sixBytes = file.read(6);
    if (sixBytes.startsWith("solid "))
    {
        QString line;
        QTextStream in(&file);
        while (!in.atEnd())
        {
            line = in.readLine();
            if (line.contains("endsolid"))
            {
                file.close();
                return STL_ASCII;
            }
        }
    }

    // Wasn't an ASCII file. Reset and check for binary.
    if (!file.reset())
    {
        qDebug("\n\tCannot seek to the 0th byte (before the header)");
        file.close();
        return STL_INVALID;
    }

    // 80-byte header + 4-byte "number of triangles" for a binary file
    if (fileSize < 84)
    {
        qDebug("\n\tThe STL file is not long enough (%u bytes).", uint(fileSize));
        file.close();
        return STL_INVALID;
    }

    // Header is from bytes 0-79; numTriangleBytes starts at byte offset 80.
    if (!file.seek(80))
    {
        qDebug("\n\tCannot seek to the 80th byte (after the header)");
        file.close();
        return STL_INVALID;
    }

    // Read the number of triangles, uint32_t (4 bytes), little-endian
    QByteArray nTrianglesBytes = file.read(4);
    if (nTrianglesBytes.size() != 4)
    {
        qDebug("\n\tCannot read the number of triangles (after the header)");
        file.close();
        return STL_INVALID;
    }

    uint32_t nTriangles = *((uint32_t*)nTrianglesBytes.data());

    // Verify that file size equals the sum of header + nTriangles value + all triangles
    if (fileSize == (84 + (nTriangles * facetSize)))
    {
        file.close();
        return STL_BINARY;
    }

    return STL_INVALID;
}

It appears to handle more edge cases, and I've attempted to write it in a way that handles extremely large (a few gigabyte) STL files gracefully without requiring the ENTIRE file to be loaded into memory at once for it to scan for the "endsolid" text.

Feel free to provide any feedback and suggestions (especially for people in the future looking for solutions).

Recommended answer

If the file does not begin with "solid ", and if the file size is exactly 84 + (numTriangles * 50) bytes, where numTriangles is read from offset 80, then the file is binary.

If the file size is at least 15 bytes (absolute minimum for an ASCII file with no triangles) and begins with "solid ", read the name that follows it until a line break is reached. Check if the next line either begins with "facet " or is "endsolid [name]" (no other value is allowed). If "facet ", seek to the end of the file and make sure it ends with a line that says "endsolid [name]". If all of these are true, the file is ASCII.

Treat any other combination as invalid.

Something like this:

STL_STATUS getStlFileFormat(const QString &path)
{
    QFile file(path);
    if (!file.open(QIODevice::ReadOnly))
    {
        qDebug("\n\tUnable to open \"%s\"", qPrintable(path));
        return STL_INVALID;
    }

    QFileInfo fileInfo(path);
    size_t fileSize = fileInfo.size();

    if (fileSize < 15)
    {
        // "solid " and "endsolid " markers for an ASCII file
        qDebug("\n\tThe STL file is not long enough (%u bytes).", uint(fileSize));
        return STL_INVALID;
    }

    // binary files should never start with "solid ", but
    // just in case, check for ASCII, and if not valid then
    // check for binary...

    // Look for text "solid " in first 6 bytes, indicating the possibility that this is an ASCII STL format.
    QByteArray sixBytes = file.read(6);
    if (sixBytes.startsWith("solid "))
    {
        // readLine() keeps the trailing line break, so trim it before
        // building the expected "endsolid [name]" line.
        QByteArray name = file.readLine().trimmed();
        QByteArray endLine = (QByteArray("endsolid ") + name).trimmed();

        // Trim the next line too, so indented "facet" lines are recognized.
        QByteArray nextLine = file.readLine().trimmed();
        if (nextLine.startsWith("facet "))
        {
            // TODO: seek to the end of the file, read the last line,
            // and make sure it is "endsolid [name]"...
            /*
            nextLine = ...;
            if (!nextLine.startsWith(endLine))
                return STL_INVALID;
            */
            return STL_ASCII;
        }
        if (nextLine.startsWith(endLine))
            return STL_ASCII;

        // reset and check for binary...
        if (!file.reset())
        {
            qDebug("\n\tCannot seek to the 0th byte (before the header)");
            return STL_INVALID;
        }
    }

    if (fileSize < 84)
    {
        // 80-byte header + 4-byte "number of triangles" for a binary file
        qDebug("\n\tThe STL file is not long enough (%u bytes).", uint(fileSize));
        return STL_INVALID;
    }

    // Header is from bytes 0-79; numTriangleBytes starts at byte offset 80.
    if (!file.seek(80))
    {
        qDebug("\n\tCannot seek to the 80th byte (after the header)");
        return STL_INVALID;
    }

    // Read the number of triangles, uint32_t (4 bytes), little-endian
    QByteArray nTrianglesBytes = file.read(4);
    if (nTrianglesBytes.size() != 4)
    {
        qDebug("\n\tCannot read the number of triangles (after the header)");
        return STL_INVALID;
    }            

    uint32_t nTriangles = *((uint32_t*)nTrianglesBytes.data());

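    // Verify that file size equals the sum of header + nTriangles value + all triangles.
    // Widen to 64 bits first: nTriangles * 50 would wrap around in 32-bit arithmetic for large files.
    if (quint64(fileSize) == (84 + (quint64(nTriangles) * 50)))
        return STL_BINARY;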

    return STL_INVALID;
}
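
The TODO in the ASCII branch above could be filled in along the following lines. This is only a sketch in standard C++ rather than Qt, with hypothetical function names; it assumes the closing line fits within the last 4096 bytes of the file and that the last line must match "endsolid [name]" exactly, which may be stricter than some exporters honor. Only the tail is read, so multi-gigabyte files are not loaded into memory.

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <string>

// Return the last non-empty line of the file at `path`, reading at most
// `tailBytes` bytes from the end. Returns an empty string on failure.
inline std::string lastNonEmptyLine(const std::string &path, std::streamoff tailBytes = 4096)
{
    std::ifstream in(path, std::ios::binary);
    if (!in)
        return std::string();

    in.seekg(0, std::ios::end);
    const std::streamoff size = in.tellg();
    if (size <= 0)
        return std::string();

    // Read only the tail of the file.
    const std::streamoff start = std::max<std::streamoff>(0, size - tailBytes);
    in.seekg(start, std::ios::beg);
    std::string tail(static_cast<std::size_t>(size - start), '\0');
    in.read(&tail[0], static_cast<std::streamsize>(tail.size()));
    tail.resize(static_cast<std::size_t>(in.gcount()));

    // Drop trailing line breaks and whitespace.
    const std::size_t end = tail.find_last_not_of(" \t\r\n");
    if (end == std::string::npos)
        return std::string();
    tail.erase(end + 1);

    // Keep only what follows the last remaining line break.
    const std::size_t lineBreak = tail.find_last_of("\r\n");
    return lineBreak == std::string::npos ? tail : tail.substr(lineBreak + 1);
}

// True when the last non-empty line is "endsolid [name]"
// (or just "endsolid" when the name is empty).
inline bool endsWithEndsolid(const std::string &path, const std::string &name)
{
    std::string line = lastNonEmptyLine(path);
    line.erase(0, line.find_first_not_of(" \t")); // strip leading indentation
    const std::string expected =
        name.empty() ? std::string("endsolid") : std::string("endsolid ") + name;
    return line == expected;
}
```

If the final line is longer than `tailBytes`, the returned text is truncated and the comparison fails, so a larger window may be needed for files with very long solid names.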
