Sqoop creating insert statements containing multiple records


Problem description

We are trying to export data to Netezza with Sqoop, and we are running into the following error:

java.io.IOException: org.netezza.error.NzSQLException: ERROR:

An example input dataset looks like this:

1,2,3

1,3,4

The sqoop command is:

sqoop export --table <tablename> --export-dir <path> \
  --input-fields-terminated-by '\t' --input-lines-terminated-by '\n' \
  --connect 'jdbc:netezza://<host>/<db>' --driver org.netezza.Driver \
  --username <username> --password <passwrd>

Sqoop generates an insert statement of the following form:

insert into (c1,c2,c3) values (1,2,3),(1,3,4).

We can load a single record, but when we try to load multiple records the export fails with the error shown above.

Your help is highly appreciated.

Solution

Setting sqoop.export.records.per.statement=1 will definitely help, because each insert statement then carries a single record. However, it makes the export extremely slow when the number of records is very large, say 5 million.
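To make the effect of that setting concrete, here is a rough illustration (not Sqoop's actual code) of how rows are grouped into insert statements depending on the records-per-statement value. The function name build_inserts and the table name my_table are made up for this sketch; the columns and sample rows come from the question.

```shell
#!/bin/sh
# Illustration only: group comma-separated rows from stdin into
# INSERT statements holding at most N rows each.
# build_inserts and my_table are hypothetical names, not part of Sqoop.
build_inserts() {
  awk -v n="$1" '
    # Emit one statement containing all rows collected so far.
    function flush(   i, s) {
      s = rows[1]
      for (i = 2; i <= count; i++) s = s "," rows[i]
      print "insert into my_table (c1,c2,c3) values " s ";"
      count = 0
    }
    { rows[++count] = "(" $0 ")" }   # wrap each input row in parentheses
    count == n + 0 { flush() }       # statement is full: emit it
    END { if (count > 0) flush() }   # emit any remaining rows
  '
}

# Two records per statement: a single multi-row insert, as in the question.
printf '1,2,3\n1,3,4\n' | build_inserts 2

# One record per statement: two single-row inserts.
printf '1,2,3\n1,3,4\n' | build_inserts 1
```

With a value of 1 every row becomes its own statement, which matches the observation that single-record loads succeed, at the cost of many more statements for a large export.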

To solve this, you need to add the following:

1.) A properties file, sqoop.properties, containing the property jdbc.transaction.isolation=TRANSACTION_READ_UNCOMMITTED (this avoids deadlocks during exports).

In the export command, point to that file with:

--connection-param-file /path/to/sqoop.properties

2.) Set sqoop.export.records.per.statement=100, which speeds up the export.

3.) Add --batch, which uses batch mode for the underlying statement execution.
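As a minimal sketch of step 1, the properties file could be created like this. The PROPS_FILE variable and its default location are assumptions for the example; the answer only gives /path/to/sqoop.properties as a placeholder.

```shell
#!/bin/sh
# Hypothetical location for the file; adjust to wherever you keep it.
PROPS_FILE="${PROPS_FILE:-./sqoop.properties}"

# Write the single property named in the answer.
cat > "$PROPS_FILE" <<'EOF'
# Supplied to Sqoop with --connection-param-file
jdbc.transaction.isolation=TRANSACTION_READ_UNCOMMITTED
EOF

# Show what was written so it can be checked before running the export.
cat "$PROPS_FILE"
```

The path written here is the one to pass to --connection-param-file in the export command.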

So your final export will look like this:

sqoop export -D sqoop.export.records.per.statement=100 --table <tablename> --export-dir <path> \
  --input-fields-terminated-by '\t' --input-lines-terminated-by '\n' \
  --connect 'jdbc:netezza://<host>/<db>' --driver org.netezza.Driver \
  --username <username> --password <passwrd> \
  --connection-param-file /path/to/sqoop.properties \
  --batch

Hope this helps.
