从 Sqlite 到 SQL Server 读取大量数据在预执行时失败 [英] Reading Huge volume of data from Sqlite to SQL Server fails at pre-execute

查看:31
本文介绍了从 Sqlite 到 SQL Server 读取大量数据在预执行时失败的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个巨大的 (26GB) sqlite 数据库,我想用 SSIS 将它导入到 SQL Server.

我已经正确设置了所有内容.部分数据流工作正常并正在导入数据.

数据流很简单.它们仅由源和目标组成.

但是当涉及到一个有 8000 万行的表时,数据流会失败并显示以下无用消息:

<块引用>

代码:0xC0047062
来源:数据流任务源 9 - nibrs_bias_motivation [55]
说明:System.Data.Odbc.OdbcException (0x80131937): ERROR [HY000] 未知错误 (7)

在 System.Data.Odbc.OdbcConnection.HandleError(OdbcHandle hrHandle, RetCode retcode)
在 System.Data.Odbc.OdbcCommand.ExecuteReaderObject(CommandBehavior 行为,String 方法,Boolean needReader,Object[] methodArguments,SQL_API odbcApiMethod)
在 System.Data.Odbc.OdbcCommand.ExecuteReaderObject(CommandBehavior 行为,String 方法,Boolean needReader)
在 System.Data.Odbc.OdbcCommand.ExecuteReader(CommandBehavior 行为)
在 System.Data.Odbc.OdbcCommand.ExecuteDbDataReader(CommandBehavior 行为)
在 System.Data.Common.DbCommand.System.Data.IDbCommand.ExecuteReader(CommandBehavior 行为)
在 Microsoft.SqlServer.Dts.Pipeline.DataReaderSourceAdapter.PreExecute()
在 Microsoft.SqlServer.Dts.Pipeline.ManagedComponentHost.HostPreExecute(IDTSManagedComponentWrapper100 包装器)

而在此任务失败之前,内存使用率上升到 99%,然后任务失败.这让我觉得这是一个内存问题.但我不知道我该如何解决这个问题.

我尝试在所有数据流任务上将 DelayValidation 设置为 true.没有改变.我玩过缓冲区大小.没什么.

我能做什么?

解决方案

分步指南

由于从大数据集读取时抛出错误,尝试分块读取数据,可以按照以下步骤实现:

  1. 声明 2 个 Int32 类型的变量(@[User::RowCount]@[User::IncrementValue])
  2. 添加一个 Execute SQL Task 来执行一个 select Count(*) 命令并将结果集存储到变量 @[User::RowCount]

  1. 添加具有以下首选项的 For 循环:

  1. 在for循环容器内添加一个数据流任务
  2. 在数据流任务中添加ODBC SourceOLEDB Destination
  3. 在 ODBC Source 中选择 SQL Command 选项并编写一个 SELECT * FROM TABLE 查询 *(仅检索元数据`
  4. 映射源和目标之间的列
  5. 返回Control flow并点击Data flow task并点击F4查看属性窗口
  6. 在属性窗口中,转到表达式并将以下表达式分配给 [ODBC Source].[SQLCommand] 属性:(有关更多信息,请参阅

    参考文献

    I have a huge (26GB) sqlite database that I want to import to SQL Server with SSIS.

    I have everything setup correctly. Some of the data flows are working correctly and importing the data.

    Data flows are simple. They just consist of source and destination.

    But when it comes to a table that has 80 million rows, data flow fails with this unhelpful message:

    Code: 0xC0047062
    Source: Data Flow Task Source 9 - nibrs_bias_motivation [55]
    Description: System.Data.Odbc.OdbcException (0x80131937): ERROR [HY000] unknown error (7)

    at System.Data.Odbc.OdbcConnection.HandleError(OdbcHandle hrHandle, RetCode retcode)
    at System.Data.Odbc.OdbcCommand.ExecuteReaderObject(CommandBehavior behavior, String method, Boolean needReader, Object[] methodArguments, SQL_API odbcApiMethod)
    at System.Data.Odbc.OdbcCommand.ExecuteReaderObject(CommandBehavior behavior, String method, Boolean needReader)
    at System.Data.Odbc.OdbcCommand.ExecuteReader(CommandBehavior behavior)
    at System.Data.Odbc.OdbcCommand.ExecuteDbDataReader(CommandBehavior behavior)
    at System.Data.Common.DbCommand.System.Data.IDbCommand.ExecuteReader(CommandBehavior behavior)
    at Microsoft.SqlServer.Dts.Pipeline.DataReaderSourceAdapter.PreExecute()
    at Microsoft.SqlServer.Dts.Pipeline.ManagedComponentHost.HostPreExecute(IDTSManagedComponentWrapper100 wrapper)

    And before this task fails, memory usage goes up to 99%, then the task fails. This made me think its a memory issue. But I don't know how can I solve this.

    I tried setting DelayValidation to true on all data flow tasks. Nothing changed. I played with the buffer sizes. Nothing.

    What can I do?

    解决方案

    Step by Step guide

    Since the error is thrown when reading from a large dataset, try reading data by chunks, to achieve that you can follow these steps:

    1. Declare 2 Variables of type Int32 (@[User::RowCount] and @[User::IncrementValue])
    2. Add an Execute SQL Task that execute a select Count(*) command and store the Result Set into the variable @[User::RowCount]

    1. Add a For Loop with the following preferences:

    1. Inside the for loop container add a Data flow task
    2. Inside the dataflow task add an ODBC Source and OLEDB Destination
    3. In the ODBC Source select SQL Command option and write a SELECT * FROM TABLE query *(to retrieve metadata only`
    4. Map the columns between source and destination
    5. Go back to the Control flow and click on the Data flow task and hit F4 to view the properties window
    6. In the properties window go to expression and Assign the following expression to [ODBC Source].[SQLCommand] property: (for more info refer to How to pass SSIS variables in ODBC SQLCommand expression?)

      "SELECT * FROM MYTABLE ORDER BY ID_COLUMN
      LIMIT 500000
      OFFSET " + (DT_WSTR,50)@[User::IncrementValue]"
      

    Where MYTABLE is the source table name, and IDCOLUMN is your primary key or identity column.

    Control Flow Screenshot

    References

    这篇关于从 Sqlite 到 SQL Server 读取大量数据在预执行时失败的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆