使用OpenXML sdk无法读取excel文件 [英] Having trouble reading excel file with the OpenXML sdk

查看:158
本文介绍了使用OpenXML sdk无法读取excel文件的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个从excel文件读取并将结果存储在DataSet中的函数。我有另一个写入excel文件的函数。当我尝试从一个常规的人工生成的excel文件中读取时,excel读取函数返回一个空白DataSet,但是当我从写入函数生成的excel文件中读取时,它工作得很好。该函数然后不会在常规生成的excel文件上工作,即使我只是复制和粘贴函数生成的excel文件的内容。我终于找到了这个,但我不知道从这里去哪里。我的代码有什么问题吗?任何帮助是极大的赞赏。感谢提前!



这里是excel生成函数。

  public static Boolean writeToExcel(string fileName,DataSet data)
{
Boolean answer = false;
using(SpreadsheetDocument excelDoc = SpreadsheetDocument.Create(tempPath + fileName,SpreadsheetDocumentType.Workbook))

{

WorkbookPart workbookPart = excelDoc.AddWorkbookPart();

workbookPart.Workbook = new Workbook();

WorksheetPart worksheetPart = workbookPart.AddNewPart< WorksheetPart>();

表单= excelDoc.WorkbookPart.Workbook.AppendChild< Sheets>(new Sheets());
Sheet sheet = new Sheet()
{
Id = excelDoc.WorkbookPart.GetIdOfPart(worksheetPart),
SheetId = 1,
Name =Page1
};

sheets.Append(sheet);

CreateWorkSheet(worksheetPart,data);
answer = true;
}
返回答案;

}
private static void CreateWorkSheet(WorksheetPart worksheetPart,DataSet data)
{
工作表工作表=新的Worksheet();

SheetData sheetData = new SheetData();

UInt32Value currRowIndex = 1U;
int colIndex = 0;
行excelRow;
DataTable table = data.Tables [0];

for(int rowIndex = -1; rowIndex< table.Rows.Count; rowIndex ++)
{
excelRow = new Row();
excelRow.RowIndex = currRowIndex ++;
(colIndex = 0; colIndex< table.Columns.Count; colIndex ++)
{
单元格=新单元格()
{
CellReference = Convert。 ToString(Convert.ToChar(65 + colIndex)),
DataType = CellValues.String
};

CellValue cellValue = new CellValue();
if(rowIndex == -1)
{
cellValue.Text = table.Columns [colIndex] .ColumnName.ToString();
}
else
{
cellValue.Text =(table.Rows [rowIndex] .ItemArray [colIndex] .ToString()!=)? table.Rows [rowIndex] .ItemArray [colIndex] .ToString():*;
}

cell.Append(cellValue);
excelRow.Append(cell);
}
sheetData.Append(excelRow);
}
SheetFormatProperties formattingProps = new SheetFormatProperties()
{
DefaultColumnWidth = 20D,
DefaultRowHeight = 20D
};

worksheet.Append(formattingProps);
worksheet.Append(sheetData);
worksheetPart.Worksheet =工作表;

}

而阅读功能如下:

  public static void readInventoryExcel(string fileName,ref DataSet set)
{
using(SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(fileName ,false))
{
WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart;
WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
SheetData sheetData = worksheetPart.Worksheet.Elements< SheetData>()。First();
int count = -1;
foreach(sheetData.Elements< Row>())中的行r
{

if(count> = 0)
{
DataRow行= set.Tables [0] .NewRow();
row [SerialNumber] = r.ChildElements [1] .InnerXml;
row [PartNumber] = r.ChildElements [2] .InnerXml;
row [EntryDate] = r.ChildElements [3] .InnerXml;
row [RetirementDate] = r.ChildElements [4] .InnerXml;
row [ReasonForReplacement] = r.ChildElements [5] .InnerXml;
row [RetirementTech] = r.ChildElements [6] .InnerXml;
row [IncludeInMaintenance] = r.ChildElements [7] .InnerXml;
row [MaintenanceTech] = r.ChildElements [8] .InnerXml;
row [Comment] = r.ChildElements [9] .I​​nnerXml;
row [Station] = r.ChildElements [10] .InnerXml;
row [LocationStatus] = r.ChildElements [11] .InnerXml;
row [AssetName] = r.ChildElements [12] .InnerXml;
row [InventoryType] = r.ChildElements [13] .InnerXml;
row [Description] = r.ChildElements [14] .InnerXml;
set.Tables [0] .Rows.Add(row);
}
count ++;
}
}


解决方案

认为这是由于您只有一个工作表,而Excel有三个。我不确定,但我认为这些表格是以相反的顺序返回,所以你应该更改行:

  WorksheetPart worksheetPart = workbookPart .WorksheetParts.First(); 

  WorksheetPart worksheetPart = workbookPart.WorksheetParts.Last(); 

搜索 WorksheetPart 如果您可以通过工作表名称来识别它。您首先需要找到 Sheet ,然后使用该ID找到

  private WorksheetPart GetWorksheetPartBySheetName(WorkbookPart workbookPart,string sheetName)
{
//首先查找工作表。
IEnumerable< Sheet> sheet = workbookPart.Workbook.GetFirstChild< Sheets>()。Elements< Sheet>()。其中​​(s => s.Name == sheetName);

if(sheets.Count()> 0)
{
string relationId = sheets.First()。Id.Value;
WorksheetPart worksheetPart =(WorksheetPart)workbookPart.GetPartById(relationshipId);
return worksheetPart;
}

返回null;
}

然后您可以使用:

  WorksheetPart worksheetPart = GetWorksheetPartBySheetName(workbookPart,Sheet1); 

还有一些其他的事情,我注意到看你的代码,你可以(或可能没有!)感兴趣的是:



在你的代码中,你只读取 InnerXml ,所以可能不重要对Excel而言,Excel存储字符串的方式与您正在编写它们的方式不同,因此读取Excel生成的文件可能不会给您您期望的值。在您的示例中,您将直接将字符串写入单元格:





但是Excel使用SharedStrings概念,其中所有字符串都写入一个名为sharedStrings.xml的单独的XML文件。该文件包含Excel文件中带有引用的字符串,它的存储在表格XML中的单元格值中。



sharedString.xml如下所示:





然后单元格如下所示:





47 < v> 元素中是对第47个共享字符串的引用。请注意,您生成的XML中的类型( t 属性)为 str ,但Excel生成的文件中的类型为取值。这表示你的是一个内联字符串,它们是一个共享字符串。



您可以像其他任何部分一样阅读SharedStrings:

  var stringTable = workbookPart.GetPartsOfType< SharedStringTablePart>()。FirstOrDefault(); 

if(stringTable!= null)
{
sharedString = stringTable.SharedStringTable.ElementAt(int.Parse(value))。InnerText;
}

其次,如果您查看代码生成的单元格引用和单元格参考Excel生成你可以看到你只输出列而不是输出行(例如,输出 A 而不是 A1 )。要解决这个问题,你应该改变一下:

  CellReference = Convert.ToString(Convert.ToChar(65 + colIndex)),

  CellReference = Convert.ToString(Convert.ToChar(65 + colIndex)+ rowIndex.ToString()),

我希望有帮助。


I have a function that reads from an excel file and stores the results in a DataSet. I have another function that writes to an excel file. When I try to read from a regular human generated excel file, the excel reading function returns a blank DataSet, but when I read from the excel file generated by the writing function, it works perfectly fine. The function then will not work on a regular generated excel file, even when I just copy and paste the contents of the function generated excel file. I finally tracked it down to this, but I have no idea where to go from here. Is there something wrong with my code? Any help is greatly appreciated. Thanks in advance!

Here is the excel generating function.

public static Boolean writeToExcel(string fileName, DataSet data)
    {
        Boolean answer = false;
        using (SpreadsheetDocument excelDoc = SpreadsheetDocument.Create(tempPath + fileName, SpreadsheetDocumentType.Workbook))

        {

            WorkbookPart workbookPart = excelDoc.AddWorkbookPart();

            workbookPart.Workbook = new Workbook();

            WorksheetPart worksheetPart = workbookPart.AddNewPart<WorksheetPart>();

            Sheets sheets = excelDoc.WorkbookPart.Workbook.AppendChild<Sheets>(new Sheets());
            Sheet sheet = new Sheet()
            {
                Id = excelDoc.WorkbookPart.GetIdOfPart(worksheetPart),
                SheetId = 1,
                Name = "Page1"
            };

            sheets.Append(sheet);

            CreateWorkSheet(worksheetPart, data);
            answer = true;
        }
        return answer;

    }
private static void CreateWorkSheet(WorksheetPart worksheetPart, DataSet data)
    {
        Worksheet worksheet = new Worksheet();

        SheetData sheetData = new SheetData();

        UInt32Value currRowIndex = 1U;
        int colIndex = 0;
        Row excelRow;
        DataTable table = data.Tables[0];

        for (int rowIndex = -1; rowIndex < table.Rows.Count; rowIndex++)
        {
            excelRow = new Row();
            excelRow.RowIndex = currRowIndex++;
            for (colIndex = 0; colIndex < table.Columns.Count; colIndex++)
            {
                Cell cell = new Cell()
                {
                    CellReference = Convert.ToString(Convert.ToChar(65 + colIndex)),
                    DataType = CellValues.String
                };

                CellValue cellValue = new CellValue();
                if (rowIndex == -1)
                {
                    cellValue.Text = table.Columns[colIndex].ColumnName.ToString();
                }
                else
                {
                    cellValue.Text = (table.Rows[rowIndex].ItemArray[colIndex].ToString() != "") ? table.Rows[rowIndex].ItemArray[colIndex].ToString() : "*";
                }

                cell.Append(cellValue);
                excelRow.Append(cell);
            }
            sheetData.Append(excelRow);
        }
        SheetFormatProperties formattingProps = new SheetFormatProperties()
        {
            DefaultColumnWidth = 20D,
            DefaultRowHeight = 20D
        };

        worksheet.Append(formattingProps);
        worksheet.Append(sheetData);
        worksheetPart.Worksheet = worksheet;

    }

while the reading function is as following

public static void readInventoryExcel(string fileName, ref DataSet set)
    {
        using (SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(fileName, false))
        {
            WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart;
            WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
            SheetData sheetData = worksheetPart.Worksheet.Elements<SheetData>().First();
            int count = -1;
            foreach (Row r in sheetData.Elements<Row>())
            {

                if (count >= 0)
                {
                    DataRow row = set.Tables[0].NewRow();
                    row["SerialNumber"] = r.ChildElements[1].InnerXml;
                    row["PartNumber"] = r.ChildElements[2].InnerXml;
                    row["EntryDate"] = r.ChildElements[3].InnerXml;
                    row["RetirementDate"] = r.ChildElements[4].InnerXml;
                    row["ReasonForReplacement"] = r.ChildElements[5].InnerXml;
                    row["RetirementTech"] = r.ChildElements[6].InnerXml;
                    row["IncludeInMaintenance"] = r.ChildElements[7].InnerXml;
                    row["MaintenanceTech"] = r.ChildElements[8].InnerXml;
                    row["Comment"] = r.ChildElements[9].InnerXml;
                    row["Station"] = r.ChildElements[10].InnerXml;
                    row["LocationStatus"] = r.ChildElements[11].InnerXml;
                    row["AssetName"] = r.ChildElements[12].InnerXml;
                    row["InventoryType"] = r.ChildElements[13].InnerXml;
                    row["Description"] = r.ChildElements[14].InnerXml;
                    set.Tables[0].Rows.Add(row);
                }
                count++;
            }
        }

解决方案

I think this is caused by the fact that you have only one sheet whereas Excel has three. I'm not certain but I think the sheets are returned in reverse order so you should change the line:

WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();

to

WorksheetPart worksheetPart = workbookPart.WorksheetParts.Last();

It might be safer to search for the WorksheetPart if you can identify it by the sheet name. You need to find the Sheet first then use the Id of that to find the SheetPart:

private WorksheetPart GetWorksheetPartBySheetName(WorkbookPart workbookPart, string sheetName)
{
    //find the sheet first.
    IEnumerable<Sheet> sheets = workbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>().Where(s => s.Name == sheetName);

    if (sheets.Count() > 0)
    {
        string relationshipId = sheets.First().Id.Value;
        WorksheetPart worksheetPart = (WorksheetPart)workbookPart.GetPartById(relationshipId);
        return worksheetPart;
    }

    return null;
}

You can then use:

WorksheetPart worksheetPart = GetWorksheetPartBySheetName(workbookPart, "Sheet1");

There are a couple of other things I've noticed whilst looking at your code which you may (or may not!) be interested in:

In your code you are only reading the InnerXml so it might not matter to you but the way Excel stores strings is different to the way you are writing them so reading an Excel generated file may not give you the values you expect. In your example you are writing the string directly to the cell like this:

But Excel uses a SharedStrings concept where all strings are written to a separate XML file called sharedStrings.xml. That file contains the strings used in the Excel file with a reference and it's that value that is stored in the cell value in the sheet XML.

The sharedString.xml looks like this:

And the Cell then looks like this:

The 47 in the <v> element is a reference to the 47th shared string. Note that the type (the t attribute) in your generated XML is str but the type in the Excel generated file is s. This denotes yours is an inline string and theirs is a shared string.

You can read the SharedStrings just as you would any other part:

var stringTable = workbookPart.GetPartsOfType<SharedStringTablePart>().FirstOrDefault();

if (stringTable != null)
{
    sharedString = stringTable.SharedStringTable.ElementAt(int.Parse(value)).InnerText;
}

Secondly, if you look at the cell reference that your code generates and the cell reference that Excel generates you can see you are only outputting the column and not the row (e.g. you output A instead of A1). To fix this you should change the line:

CellReference = Convert.ToString(Convert.ToChar(65 + colIndex)),

to

CellReference = Convert.ToString(Convert.ToChar(65 + colIndex) + rowIndex.ToString()),

I hope that helps.

这篇关于使用OpenXML sdk无法读取excel文件的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆