在DataTable上调用Fill()时出现内存不足异常 - 结果查询只有3列和160行 [英] Out of memory exception when calling Fill() on DataTable - resulting query is only 3 columns and 160 rows

查看:75
本文介绍了在DataTable上调用Fill()时出现内存不足异常 - 结果查询只有3列和160行的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个SQLite数据库,它有一个包含1800万行和24列的表。我在C#中编写了一个SQL查询函数,我用Excel将其暴露给Excel。

I have a SQLite database that has a single table with 18 million rows and 24 columns. I have written a SQL query function in C#, which I expose to Excel with ExcelDNA.

这是完整的代码,



        string constr = constr =" Data Source =" + FilePath +" ;; Version = 3; Synchronous = OFF; temp_store = memory; cache_size = 700000; count_changes = off;" ;;

        DataTable dt = new DataTable();

       试试
        {                               

            SQLiteConnection conn = new SQLiteConnection(constr);

            SQLiteCommand command = new SQLiteCommand(SQLStatement,conn);                

            conn.Open();                

            SQLiteDataAdapter sda = new SQLiteDataAdapter(command);                                

            sda.Fill(dt);

            sda.Dispose();

            command.Dispose();

            conn.Dispose();

            int numRows =(IncludeHeaders?dt.Rows.Count + 1:dt.Rows.Count);

            object [,] ret = new object [numRows,dt.Columns.Count];

            int rowCount = 0;

            if(IncludeHeaders)

            {

                int colCount = 0;

                foreach(DataColumn col in dt.Columns)

                {

                    ret [rowCount,colCount] = col.ColumnName;

                    colCount ++;

                }¥b $ b            }¥b $ b            rowCount =(IncludeHeaders?1:0);

            foreach(dt.Rows中的DataRow行)

            {

                int colCount = 0;

                foreach(DataColumn col in dt.Columns)

                {

                    if(row [col]!= DBNull.Value)

                        ret [rowCount,colCount] = row [col];

                   否则

                        ret [rowCount,colCount] ="" ;;

                    colCount ++;

                }¥b $ b                rowCount ++;

            }¥b $ b           返回ret;

        }¥b $ b        catch(例外情况)

        {

            object [,] err = new object [1,1];

            err [0,0] = ex.ToString();

           返回错误;

        }¥b $ b       终于

        {

            dt.Clear();

            dt.Dispose();

            dt = null;  

        }


如果我运行两次查询(两次点击Shift + F9),我会得到一个OutOfMemoryException。在任务管理器中,我可以看到EXCEL.EXE映像的工作集(内存)在抛出异常之前从200MB变为1500MB。



然而,这种行为并不完全一致。我返回5列和1100行的其他查询工作得很好。我看到任务管理器中的内存使用情况有问题,一旦结果返回到Excel,我看到内存恢复了。
$


调试上面的应用程序显示它在`sda.Fill(dt)'线上绊倒。



会不会有任何想法?我会更好地使用`SQLiteDataReader`吗?或者我可以使用任何其他提示或技巧?谢谢。



顺便提一下,如果我通过Python运行确切的查询我没有遇到这个问题,所以我认为它与垃圾收集有关C#。



以下是数据库和查询的一些细节。架构如下:



   日期(VARCHAR)

    CompanyName(VARCHAR)

   金额(REAL)

    AggCode(VARCHAR)

    Level1 ... Level20(VARCHAR)



查询通常组合字段`Level9`,`Level5`,`AggCode`,`Date`,`CompanyName `在`WHERE`子句中。因此,除了原始表格外,我还配置了以下四个指数,即


   在我的(Level09,AggCode)上创建INDEX idx1;

   在我的(Level05,AggCode)上创建INDEX idx2;

   在我的(CompanyName,AggCode)上创建INDEX idx3;

   在我的(日期,AggCode)上创建INDEX idx4;
$


成功返回1100行和2列的查询,



    SELECT CompanyName,SUM(Amount)FROM my where Level09 =" T_EU"和AggCode ="R_S_V" GROUP BY CompanyName ORDER BY SUM(金额)DESC



抛出内存异常的查询,



    SELECT Date,CompanyName,Sum(Amount)FROM my WHERE Level05 =" M_TO" AND AggCode =" C_DTA" GROUP BY日期,公司名称



第二个查询在Python中返回163行和3列。

This is the complete code,

        string constr = constr = "Data Source=" + FilePath + ";Version=3;Synchronous=OFF;temp_store=memory;cache_size=700000;count_changes=off;";
        DataTable dt = new DataTable();
        try
        {                                
            SQLiteConnection conn = new SQLiteConnection(constr);
            SQLiteCommand command = new SQLiteCommand(SQLStatement,conn);                
            conn.Open();                
            SQLiteDataAdapter sda = new SQLiteDataAdapter(command);                                
            sda.Fill(dt);
            sda.Dispose();
            command.Dispose();
            conn.Dispose();
            int numRows = (IncludeHeaders ? dt.Rows.Count + 1 : dt.Rows.Count);
            object[,] ret = new object[numRows, dt.Columns.Count];
            int rowCount = 0;
            if (IncludeHeaders)
            {
                int colCount = 0;
                foreach (DataColumn col in dt.Columns)
                {
                    ret[rowCount, colCount] = col.ColumnName;
                    colCount++;
                }
            }
            rowCount = (IncludeHeaders ? 1 : 0);
            foreach (DataRow row in dt.Rows)
            {
                int colCount = 0;
                foreach (DataColumn col in dt.Columns)
                {
                    if (row[col] != DBNull.Value)
                        ret[rowCount, colCount] = row[col];
                    else
                        ret[rowCount, colCount] = "";
                    colCount++;
                }
                rowCount++;
            }
            return ret;
        }
        catch (Exception ex)
        {
            object[,] err = new object[1, 1];
            err[0, 0] = ex.ToString();
            return err;
        }
        finally
        {
            dt.Clear();
            dt.Dispose();
            dt = null;  
        }

If I run the query twice (two hits of Shift+F9), I get an OutOfMemoryException. In the Task Manager I can see the Working Set(Memory) of the EXCEL.EXE image go from 200MB to 1500MB before the exception is thrown.

However, this behavior isn't entirely consistent. Other queries where I return upto 5 columns and 1100 rows work just fine. I see the memory usage tick up in the Task Manager and once the results are returned to Excel I see the memory come back down.

Debugging the application above shows that it trips up at the `sda.Fill(dt)` line.

Would appreciate any thoughts? Would I be better off using `SQLiteDataReader` instead? Or are there any other tips or tricks I can use? Thank you.

Incidentally if I run the exact query via Python I don't get this problem, so I'd assume it's something to do with the garbage collection in C#.

Here are some details on the database and query. The schema is along the lines of,

    Date (VARCHAR)
    CompanyName (VARCHAR)
    Amount (REAL)
    AggCode (VARCHAR)
    Level1 ... Level20 (VARCHAR)

The queries are usually run combining the fields `Level9`, `Level5`, `AggCode`, `Date`, `CompanyName` in the `WHERE` clause. So apart from the raw table, I have also configured the following four indices,

    CREATE INDEX idx1 on my(Level09, AggCode);
    CREATE INDEX idx2 on my(Level05, AggCode);
    CREATE INDEX idx3 on my(CompanyName, AggCode);
    CREATE INDEX idx4 on my(Date, AggCode);

The query that returns 1100 rows and 2 columns successfully,

    SELECT CompanyName, SUM(Amount) FROM my where Level09="T_EU" and AggCode = "R_S_V" GROUP BY CompanyName ORDER BY SUM(Amount) DESC

The query that throws the memory exception,

    SELECT Date, CompanyName, Sum(Amount) FROM my WHERE Level05 ="M_TO" AND AggCode = "C_DTA" GROUP BY Date, CompanyName

The second query returns 163 rows and 3 columns in Python.

推荐答案

http://lauteikkehn.blogspot.com/2012/03/datatable-vs-list.html

也许,您需要了解如何使用自定义对象和列表< T>内存使用量比数据表少得多。

Maybe, you need to find out how to use a custom object and a List<T> that takes far less memory usage than a datatable.


这篇关于在DataTable上调用Fill()时出现内存不足异常 - 结果查询只有3列和160行的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆