Hadoop DBWritable: Unable to insert record to MySQL from Hadoop reducer


Problem description





I am facing a duplicate-entry problem while inserting into the table.

I am using a Hadoop mapper to read records from a file, and it reads them successfully. But while the Hadoop reducer is writing the records to the MySQL database, the following error occurs:

java.io.IOException: Duplicate entry '505975648' for key 'PRIMARY'

But the MySQL table remains empty; the records cannot be written to the MySQL table from the Hadoop DBWritable reducer.

Following is the error log:

WARNING: com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Connection.close() has already been called. Invalid operation in this state.
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at com.mysql.jdbc.Util.handleNewInstance(Util.java:406)
    at com.mysql.jdbc.Util.getInstance(Util.java:381)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:984)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:956)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:926)
    at com.mysql.jdbc.ConnectionImpl.getMutex(ConnectionImpl.java:3018)
    at com.mysql.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:4564)
    at org.apache.hadoop.mapred.lib.db.DBOutputFormat$DBRecordWriter.close(DBOutputFormat.java:72)
    at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.close(ReduceTask.java:467)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:539)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:421)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:262)

Jun 04, 2014 1:23:36 PM org.apache.hadoop.mapred.LocalJobRunner$Job run
WARNING: job_local_0001
java.io.IOException: Duplicate entry '505975648' for key 'PRIMARY'
    at org.apache.hadoop.mapred.lib.db.DBOutputFormat$DBRecordWriter.close(DBOutputFormat.java:77)
    at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.close(ReduceTask.java:467)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:531)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:421)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:262)

Solution

The DBOutputFormat / DBRecordWriter does everything in a single database transaction. So even though the table is empty right now, if you try to do two inserts with the same primary key in the same transaction, you will get this error, which is what is happening here. To trace this better, you can add logging: take the code for DBOutputFormat and make a new, similarly named class (I called mine LoggingDBOutputFormat), then update your job code to use this new output format instead. In the new output format, change the close method to log the statements before they are executed:

    /** {@inheritDoc} */
    public void close(TaskAttemptContext context) throws IOException {
      try {
        LOG.warn("Executing statement: " + statement);

        statement.executeBatch();
        connection.commit();
      } catch (SQLException e) {
        try {
          connection.rollback();
        } catch (SQLException ex) {
          LOG.warn(StringUtils.stringifyException(ex));
        }
        throw new IOException(e.getMessage());
      } finally {
        try {
          statement.close();
          connection.close();
        } catch (SQLException ex) {
          throw new IOException(ex.getMessage());
        }
      }
    }
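For reference, wiring the replacement output format into the job with the old `mapred` API would look roughly like this. This is a sketch, not the asker's actual driver code: the driver class, table name, column names, and connection details are assumptions for illustration, and note that `DBOutputFormat.setOutput` registers the stock `DBOutputFormat`, so the logging variant has to be set afterwards.

```java
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.db.DBConfiguration;
import org.apache.hadoop.mapred.lib.db.DBOutputFormat;

public class MyDriver {
  public static JobConf configure() {
    JobConf job = new JobConf(MyDriver.class);

    // Configure the JDBC connection for the job (hypothetical credentials).
    DBConfiguration.configureDB(job, "com.mysql.jdbc.Driver",
        "jdbc:mysql://localhost:3306/mydb", "user", "password");

    // setOutput sets the target table/columns and the stock DBOutputFormat...
    DBOutputFormat.setOutput(job, "mytable", "id", "name", "count");

    // ...so override the output format afterwards with the logging variant.
    job.setOutputFormat(LoggingDBOutputFormat.class);
    return job;
  }
}
```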

You can then check the general log on the MySQL side to see whether anything was executed. Odds are you will see that your transaction was rolled back because of the error. To work around this, make sure the primary keys are unique. If updating/upserting is what you want instead, you can write an output/record writer that does that, but that is a different undertaking.
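As a sketch of that upsert direction: instead of the plain `INSERT` statement that `DBOutputFormat` constructs, a custom record writer could prepare a MySQL `INSERT ... ON DUPLICATE KEY UPDATE` statement, so a key collision overwrites the row rather than aborting the transaction. The helper below (with hypothetical table and column names) shows one way to build such a statement:

```java
import java.util.ArrayList;
import java.util.List;

public class UpsertQuery {

    /**
     * Builds a MySQL upsert statement: every column is inserted, and every
     * non-key column is overwritten with the new value on a key collision.
     */
    public static String build(String table, String[] fields, String keyField) {
        List<String> placeholders = new ArrayList<>();
        List<String> updates = new ArrayList<>();
        for (String field : fields) {
            placeholders.add("?");
            if (!field.equals(keyField)) {
                updates.add(field + " = VALUES(" + field + ")");
            }
        }
        return "INSERT INTO " + table
                + " (" + String.join(", ", fields) + ")"
                + " VALUES (" + String.join(", ", placeholders) + ")"
                + " ON DUPLICATE KEY UPDATE " + String.join(", ", updates);
    }

    public static void main(String[] args) {
        System.out.println(build("mytable",
                new String[] {"id", "name", "count"}, "id"));
        // INSERT INTO mytable (id, name, count) VALUES (?, ?, ?)
        //   ON DUPLICATE KEY UPDATE name = VALUES(name), count = VALUES(count)
    }
}
```

The returned string is used exactly like the stock statement: pass it to `connection.prepareStatement(...)` in the record writer's constructor and bind the field values as usual.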
