通过OpenXml SDK生成的XLSX文件有效和无效 [英] XLSX file via OpenXml SDK Both Valid and Invalid

查看:114
本文介绍了通过OpenXml SDK生成的XLSX文件有效和无效的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个将System.Data.DataTable导出到XLSX/OpenXml Spreadsheet的程序.最后,它大部分都能正常工作了.但是,在Excel中打开电子表格时,Excel抱怨文件无效,需要修复,并显示此消息...

I have a program which exports a System.Data.DataTable to an XLSX / OpenXml Spreadsheet. Finally have it mostly working. However when opening the Spreadsheet in Excel, Excel complains about the file being invalid, and needing repair, giving this message...

我们发现中的某些内容存在问题.你要我们 尽我们所能地恢复?如果您相信 工作簿,老兄,是的.

We found a problem with some content in . Do you want us to try to recover as much as we can? If you trust the source of the workbook, clik Yes.

如果我单击是",它将返回此消息...

If I click Yes, it comes back with this message...

单击日志文件并将其打开,仅显示此内容...

Clicking the log file and opening that, just shows this...

    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <recoveryLog xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
        <logFileName>error268360_01.xml</logFileName>
        <summary>Errors were detected in file 'C:\Users\aabdi\AppData\Local\Temp\data.20190814.152538.xlsx'</summary>
        <repairedRecords>
            <repairedRecord>Repaired Records: Cell information from /xl/worksheets/sheet1.xml part</repairedRecord>
        </repairedRecords>
    </recoveryLog> 

显然,我们不想将其部署到这样的生产环境中.因此,我一直在尝试找出解决方法.我根据

Obviously, we don't want to deploy this into a production environment like this. So I've been trying to figure out how to fix this. I threw together a quick little sample to validate the XML and show the errors, based on this link from MSDN. But when I run the program and load the exact same XLSX document that Excel complains about, the Validator comes back saying that the file is perfectly Valid. So I'm not sure where else to go from there.

有没有更好的工具来尝试验证我的XLSX XML?以下是我用来生成XLSX文件的完整代码. (是的,它在VB.NET中,它是一个旧版应用程序.)

Any better tools for trying to validate my XLSX XML? Following is the complete code I'm using to generate the XLSX file. (Yes, it's in VB.NET, it's a legacy app.)

如果我注释掉For Each dr As DataRow循环中的这一行,则XLSX文件可以在Excel中很好地打开(只是没有任何数据).因此,各个单元都有关系,但是我对它们的工作并不多.设置一个值和数据类型就可以了.

If I comment out the line in the For Each dr As DataRow loop, then the XLSX file opens fine in Excel, (just without any data). So it's something with the individual cells, but I'm not really DOING much with them. Setting a value and data type, and that's it.

我还尝试用以下内容替换ConstructDataRow中的For Each循环,但是它仍然输出相同的错误" XML ...

I also tried replacing the For Each loop in ConstructDataRow with the following, but it still outputs the same "bad" XML...

        rv.Append(
            (From dc In dr.Table.Columns
             Select ConstructCell(
                 NVL(dr(dc.Ordinal), String.Empty),
                 MapSystemTypeToCellType(dc.DataType)
             )
            ).ToArray()
        )

还尝试将每个单元格对Append的调用也替换为AppendChild,但这也无济于事.

Also tried replacing the call to Append with AppendChild for each cell too, but that didn't help either.

此处提供压缩的XLSX文件(错误,带有伪数据):
https://drive.google.com/open?id=1KVVWEqH7VHMxwbRA-Pn807SXHZ32oJWR

The zipped up XLSX file (erroring, with dummy data) is available here:
https://drive.google.com/open?id=1KVVWEqH7VHMxwbRA-Pn807SXHZ32oJWR

完整的数据表到Excel XLSX代码

    #Region " ToExcel "
    <Extension>
    Public Function ToExcel(ByVal target As DataTable) As Attachment
        Dim filename = Path.GetTempFileName()
        Using doc As SpreadsheetDocument = SpreadsheetDocument.Create(filename, DocumentFormat.OpenXml.SpreadsheetDocumentType.Workbook)
            Dim data = New SheetData()

            Dim wbp = doc.AddWorkbookPart()
            wbp.Workbook = New Workbook()
            Dim wsp = wbp.AddNewPart(Of WorksheetPart)()
            wsp.Worksheet = New Worksheet(data)

            Dim sheets = wbp.Workbook.AppendChild(New Sheets())
            Dim sheet = New Sheet() With {.Id = wbp.GetIdOfPart(wsp), .SheetId = 1, .Name = "Data"}
            sheets.Append(sheet)

            data.AppendChild(ConstructHeaderRow(target))
            For Each dr As DataRow In target.Rows
                data.AppendChild(ConstructDataRow(dr)) '// THIS LINE YIELDS THE BAD PARTS
            Next

            wbp.Workbook.Save()
        End Using

        Dim attachmentname As String = Path.Combine(Path.GetDirectoryName(filename), $"data.{Now.ToString("yyyyMMdd.HHmmss")}.xlsx")
        File.Move(filename, attachmentname)
        Return New Attachment(attachmentname, "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet")
    End Function

    Private Function ConstructHeaderRow(dt As DataTable) As Row
        Dim rv = New Row()
        For Each dc As DataColumn In dt.Columns
            rv.Append(ConstructCell(dc.ColumnName, CellValues.String))
        Next
        Return rv
    End Function

    Private Function ConstructDataRow(dr As DataRow) As Row
        Dim rv = New Row()
        For Each dc As DataColumn In dr.Table.Columns
            rv.Append(ConstructCell(NVL(dr(dc.Ordinal), String.Empty), MapSystemTypeToCellType(dc.DataType)))
        Next
        Return rv
    End Function

    Private Function ConstructCell(value As String, datatype As CellValues) As Cell
        Return New Cell() With {
        .CellValue = New CellValue(value),
        .DataType = datatype
        }
    End Function

    Private Function MapSystemTypeToCellType(t As System.Type) As CellValues
        Dim rv As CellValues
        Select Case True
            Case t Is GetType(String)
                rv = CellValues.String
            Case t Is GetType(Date)
                rv = CellValues.Date
            Case t Is GetType(Boolean)
                rv = CellValues.Boolean
            Case IsNumericType(t)
                rv = CellValues.Number
            Case Else
                rv = CellValues.String
        End Select

        Return rv
    End Function
    #End Region

推荐答案

对于其他进入并找到它的人,我最终将其追溯到Cell.DataType

For anyone else coming in and finding this, I finally tracked this down to the Cell.DataType

将值设置为CellValues.Date将导致Excel要修复"文档. (显然,对于日期,数据类型应为NULL ,并且Date仅在Office 2010中使用.

Setting a value of CellValues.Date will cause Excel to want to "fix" the document. (apparently for dates, the DataType should be NULL, and Date was only used in Office 2010).

此外,如果您将数据类型指定为CellValues.Boolean,则CellValue必须为0或1."true"/"false"也将导致Excel希望修复"电子表格.

Also, if you specify a DataType of CellValues.Boolean, then the CellValue needs to be either 0 or 1. "true" / "false" will also cause Excel to want to "fix" your spreadsheet.

此外,Microsoft已经构建了更好的验证器工具,可以在此处下载:
https://www.microsoft.com/zh-cn/download/details.aspx?id = 30425

Also, Microsoft has a better validator tool already built for download here:
https://www.microsoft.com/en-us/download/details.aspx?id=30425

这篇关于通过OpenXml SDK生成的XLSX文件有效和无效的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆