EPPlus: save two million rows of a 200+ column datatable to multiple Excel files


Problem description


I have a function that saves all records from a SQL table to an Excel worksheet using EPPlus. If I export a small amount of data everything works fine, but with 200+ columns and 500,000+ rows I get an OutOfMemoryException.


I'd like to modify my code so that it saves 50,000 records per file.


Here is my code, which works for small data sets:

private Task SaveAsync(string tableName)
{

    return Task.Run(() =>
    {
        try
        {
            using (var conn = new SqlConnection(_connectionString))
            {
                using (var cmd = new SqlCommand(string.Format(DataQuery, tableName), conn))
                {
                    cmd.CommandType = CommandType.Text;
                    cmd.CommandTimeout = 360;
                    conn.Open();
                    using (SqlDataReader sdr = cmd.ExecuteReader())
                    {
                        var fileName = string.Format(TargetFile, tableName);
                        if (File.Exists(fileName))
                        {
                            File.Delete(fileName);
                        }

                        // The query returns two result sets: first the total
                        // row count, then the data rows themselves.
                        sdr.Read();
                        var numberOfRecordsInTable = sdr.GetInt32(0);

                        sdr.NextResult();

                        using (ExcelPackage pck = new ExcelPackage(new FileInfo(fileName)))
                        {
                            ExcelWorksheet ws = pck.Workbook.Worksheets.Add("Results");

                            int count = sdr.FieldCount;
                            int col = 1, row = 1;

                            for (int i = 0; i < count; i++)
                            {
                                ws.SetValue(row, col++, sdr.GetName(i));
                            }
                            row++;
                            col = 1;
                            while (sdr.Read())
                            {
                                for (int i = 0; i < count; i++)
                                {
                                    var val = sdr.GetValue(i);
                                    ws.SetValue(row, col++, val);
                                }
                                row++;
                                col = 1;
                            }
                            //autosize
                            ws.Cells[ws.Dimension.Address].AutoFitColumns();
                            //autofiltr
                            ws.Cells[1, 1, 1, count].AutoFilter = true;
                        }
                    }
                    conn.Close();
                }
            }
        }
        catch (Exception e)
        {
            Debug.WriteLine("Error at: " + Thread.CurrentThread.ManagedThreadId);
            Debug.WriteLine(e);
        }
    });
}


and here is my modified code that splits the records into 50,000 per file:

private Task SaveAsync2(string tableName)
{
    return Task.Run(() =>
    {
        try
        {
            using (var conn = new SqlConnection(_connectionString))
            {
                using (var cmd = new SqlCommand(string.Format(DataQuery, tableName), conn))
                {
                    cmd.CommandType = CommandType.Text;
                    cmd.CommandTimeout = 360;
                    conn.Open();
                    using (SqlDataReader sdr = cmd.ExecuteReader())
                    {

                        var fileName = string.Format(TargetFile, tableName,"");
                        if (File.Exists(fileName))
                        {
                            File.Delete(fileName);
                        }

                        sdr.Read();
                        var max = sdr.GetInt32(0);
                        int filesCount = 1;
                        if (max > 50000)
                        {
                            fileName = string.Format(TargetFile, tableName, filesCount);
                        }

                        sdr.NextResult();

                        ExcelPackage pck = new ExcelPackage(new FileInfo(fileName));
                        ExcelWorksheet ws = pck.Workbook.Worksheets.Add("RESULTS");

                        int count = sdr.FieldCount;

                        int col = 1, row = 1;

                        for (int i = 0; i < count; i++)
                        {
                            ws.SetValue(row, col++, sdr.GetName(i));
                        }
                        row++;
                        col = 1;
                        while (sdr.Read())
                        {
                            for (int i = 0; i < count; i++)
                            {
                                var val = sdr.GetValue(i);
                                ws.SetValue(row, col++, val);
                            }
                            row++;
                            col = 1;

                            if (row > 50000)
                            {
                                // Chunk full: save the current package and start a new file.
                                pck.Save();
                                filesCount++;
                                fileName = string.Format(TargetFile, tableName, filesCount);

                                pck = new ExcelPackage(new FileInfo(fileName));
                                ws = pck.Workbook.Worksheets.Add("RESULTS");

                                count = sdr.FieldCount;

                                col = 1;
                                row = 1;

                                for (int i = 0; i < count; i++)
                                {
                                    ws.SetValue(row, col++, sdr.GetName(i));
                                }
                                row++;
                                col = 1;
                            }
                        }

                        //autosize
                        ws.Cells[ws.Dimension.Address].AutoFitColumns();
                        //autofiltr
                        ws.Cells[1, 1, 1, count].AutoFilter = true;

                        pck.Save();
                    }
                }
                conn.Close();

            }
        }
        catch (Exception e)
        {
            Debug.WriteLine("Error at: " + Thread.CurrentThread.ManagedThreadId);
            Debug.WriteLine(e);
        }
    });
}


Basically this works fine, but in the first version of my code everything was inside using statements, while in the second version I'm repeating the same code twice.



  1. How can I fix my code to remove the duplicated code and put everything back inside using statements?
  2. Can I add the next set (50,000 records) as a new worksheet instead of creating a new file?
  3. What is the EPPlus limit when saving data to a file? Rows x columns? I found information that EPPlus should handle more than a million rows, but not as many columns as I have. I think I could export a million rows with a single column, but with 200+ columns, 50,000 rows is my limit. I'm wondering if there is a number (rows x columns) up to which my export will work fine. I want the export function to be universal, so that when I pass a datatable with 50 columns it exports, for example, 100,000 rows per file, and for 2 columns it exports half a million rows per file.


Answer


Unfortunately, there is no easy way to merge that much data into a single file with EPPlus. Basically, the entire file is loaded into memory when opened: it's all or nothing. In theory, you could generate the XML files that an XLSX contains (they are renamed zip files) and insert the rows manually, since that would have a smaller memory footprint, but that is no small feat.
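As a quick illustration of the "renamed zip files" point: you can open any .xlsx with an ordinary zip library and see the XML parts it is made of. A minimal sketch (the file path is a placeholder):

```csharp
using System;
using System.IO.Compression; // reference System.IO.Compression.FileSystem

class InspectXlsx
{
    static void Main()
    {
        // An .xlsx file is a plain zip archive; the sheet data
        // lives in XML parts under xl/worksheets/.
        using (var zip = ZipFile.OpenRead(@"c:\temp\Results.xlsx"))
        {
            foreach (var entry in zip.Entries)
                Console.WriteLine(entry.FullName); // e.g. xl/worksheets/sheet1.xml
        }
    }
}
```

Writing those parts by hand (streaming row XML directly into the archive) is what keeps the memory footprint small, but it means reimplementing a good part of the XLSX format yourself.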


For your current code, you could always call .Dispose() manually if you want to avoid the using statement. But I understand you want to avoid duplicate code. What about something like this (but watch the memory usage when copying all the object data):

// Chunk size: kept small here for demonstration - use e.g. 50000 in practice.
const int max = 10;
var loop = 0;

using (var sdr = cmd.ExecuteReader())
{
    var fieldcount = sdr.FieldCount;

    var getfi = new Func<int, FileInfo>(i =>
    {
        var fi = new FileInfo(String.Format(@"c:\temp\Multi_Files{0}.xlsx", i));
        if (fi.Exists) fi.Delete();
        return fi;
    });

    // Writes one buffered chunk to its own file.
    // Note: only data rows are written - there is no header row with column names.
    var savefile = new Action<FileInfo, List<Object[]>>((info, rows) =>
    {
        using (var pck = new ExcelPackage(info))
        {
            var wb = pck.Workbook;
            var ws = wb.Worksheets.Add("RESULTS");
            for (var row = 0; row < rows.Count; row++)
                for (var col = 0; col < fieldcount; col++)
                    ws.SetValue(row + 1, col + 1, rows[row][col]);
            pck.Save();
        }
    });

    var rowlist = new List<Object[]>();

    while (sdr.Read())
    {
        var rowdata = new Object[sdr.FieldCount];
        sdr.GetValues(rowdata);
        rowlist.Add(rowdata);

        if (rowlist.Count == max)
        {
            savefile(getfi(++loop), rowlist);
            rowlist.Clear();
        }
    }
    if (rowlist.Count > 0)
        savefile(getfi(++loop), rowlist);
}
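Regarding question 2: the same chunking loop could add a new worksheet per chunk instead of a new file, but note that this does not solve the memory problem, because EPPlus keeps the entire package in memory, so every chunk still accumulates until Save(). A minimal sketch of that variant (the file name and sheet names are placeholders; sdr is an open SqlDataReader as above):

```csharp
using (var pck = new ExcelPackage(new FileInfo(@"c:\temp\Multi_Sheets.xlsx")))
{
    var sheetNo = 0;
    ExcelWorksheet ws = null;
    var row = 0;

    while (sdr.Read())
    {
        // Start a new worksheet roughly every 50,000 data rows.
        if (ws == null || row > 50000)
        {
            ws = pck.Workbook.Worksheets.Add("RESULTS_" + (++sheetNo));
            row = 1;
            for (var i = 0; i < sdr.FieldCount; i++)
                ws.SetValue(row, i + 1, sdr.GetName(i)); // header row
            row++;
        }
        for (var i = 0; i < sdr.FieldCount; i++)
            ws.SetValue(row, i + 1, sdr.GetValue(i));
        row++;
    }
    pck.Save(); // the whole package is held in memory until this point
}
```

With 200+ columns you would likely hit the same OutOfMemoryException as before, since the total cell count is unchanged; separate files are the safer option here.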

