c#使用OpenXML过滤Excel [英] c# Filter Excel using OpenXML

查看:86
本文介绍了c#使用OpenXML过滤Excel的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

您好,

我正在编写一个c#WPF应用程序,该应用程序读取每天添加900,000 +行以及更多行的Excel工作表。我只对与今天的日期相对应的底部几百行感兴趣(日期存储在A栏中)。
$


我知道这张大小的数据表是数据库会更好,但是客户拒绝这一点,他们也不想提供只有今天价值的工作簿,所以我必须使用我得到的东西。

I am writing a c# WPF application that reads an Excel sheet with 900,000 + rows and more added daily. I am only interested in the bottom few hundred rows which correspond to today's date (Date is stored in column A).

I know with a sheet this size that a database would be preferable, but the Client is resisting this, nor do they want to provide a workbook with only today's values in it, so I have to work with what I get.

目前我是将整个工作表读入DataTable,不仅需要很长时间,而且资源很多。有没有办法使用OpenXML只读取今天的数据?

Currently I am reading the entire sheet in to a DataTable, which not only takes a long time, but is heavy on resources.is there a way, using OpenXML, to only read the data for today?

提前致谢

Arthur

推荐答案

Hello Arthur,

Hello Arthur,

抱歉延迟。

要解析大型Excel文件,您可以访问 

To parse large Excel file, you could visit 

如何:解析和阅读大型电子表格文档(Open XML SDK)

使用Open XML SDK解析和读取大型Excel文件

根据你的说明,你只想获得底部几百行的数据。如果行计数是固定的或行索引是每天的常规,您可以使用类似的代码,例如 

According to your description, you only want to get data at the bottom few hundred rows. If the row count is fixed or the row index is regular for every day, you could use similar code like 

   static void readExcelDataDOM(string filename)
        {
            using (SpreadsheetDocument myDoc = SpreadsheetDocument.Open(filename, false))
            {
                WorkbookPart workbookPart = myDoc.WorkbookPart;
                WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
                SheetData sheetData =
                worksheetPart.Worksheet.Elements<SheetData>().First();
                Row r = sheetData.Elements<Row>().ElementAt(1);
                //foreach (Row r in sheetData.Elements<Row>())
                //{
                    foreach (Cell c in r.Elements<Cell>())
                    {
                        string text = c.CellValue.Text;
                        Console.WriteLine(text);
                    }
                //}
            }
        }


请注意,不同的单元格数据类型需要不同的方法。您可以访问 

如何:检索值电子表格文档中的单元格(Open XML SDK)

// If the cell represents an integer number, you are done. 
            // For dates, this code returns the serialized value that 
            // represents the date. The code handles strings and 
            // Booleans individually. For shared strings, the code 
            // looks up the corresponding value in the shared string 
            // table. For Booleans, the code converts the value into 
            // the words TRUE or FALSE.
            if (theCell.DataType != null)
            {
                switch (theCell.DataType.Value)
                {
                    case CellValues.SharedString:
                        
                        // For shared strings, look up the value in the
                        // shared strings table.
                        var stringTable = 
                            wbPart.GetPartsOfType<SharedStringTablePart>()
                            .FirstOrDefault();
                        
                        // If the shared string table is missing, something 
                        // is wrong. Return the index that is in
                        // the cell. Otherwise, look up the correct text in 
                        // the table.
                        if (stringTable != null)
                        {
                            value = 
                                stringTable.SharedStringTable
                                .ElementAt(int.Parse(value)).InnerText;
                        }
                        break;

                    case CellValues.Boolean:
                        switch (value)
                        {
                            case "0":
                                value = "FALSE";
                                break;
                            default:
                                value = "TRUE";
                                break;
                        }
                        break;
                }

对于2017/7/18这样的日期,单元格值将为42934,作为存储在单元格中的整数。 

For a date like 2017/7/18, the cell value would 42934 as an integer number stored in cell. 

对于某些字符串,例如"test",单元格值将是一个整数,表示SharedStringTable的索引。因此,要获取单元格值为sharedstring的单元格值,您需要获取索引并在SharedStringTable中找到它。

For some strings like "test", the cell value would be an integer number which represents the index of SharedStringTable. So to get cell value which datatype is sharedstring, you need to get the index and find it in the SharedStringTable.

如果您的工作表很大,我认为它可能需要很长时间才能成为DataTable。

If your sheet is large, i think it may take long time as DataTable.

问候,

Celeste


这篇关于c#使用OpenXML过滤Excel的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆