Loading Sequence File data into a Hive table created using STORED AS SEQUENCEFILE fails

Problem description

I am importing content from MySQL into HDFS as sequence files using the sqoop import command below:

sqoop import --connect "jdbc:mysql://quickstart.cloudera:3306/retail_db" \
    --username retail_dba --password cloudera \
    --table orders \
    --target-dir /user/cloudera/sqoop_import_seq/orders \
    --as-sequencefile \
    --lines-terminated-by '\n' --fields-terminated-by ','

Then I create the Hive table using the command below:

create table orders_seq(order_id int,order_date string,order_customer_id int,order_status string) 
ROW FORMAT DELIMITED 
FIELDS TERMINATED BY '|' 
STORED AS SEQUENCEFILE

But when I try to load the sequence data produced by the first command into the Hive table using the command below:

LOAD DATA INPATH '/user/cloudera/sqoop_import_seq/orders' INTO TABLE orders_seq;

it gives the error below.

Loading data to table practice.orders_seq
Failed with exception java.lang.RuntimeException: java.io.IOException: WritableName can't load class: orders
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask

Where am I going wrong?

Solution

First of all, is it necessary to have the data in that format?

Suppose you do need the data in that format. The LOAD DATA command is not necessary: once Sqoop finishes importing the data, you just have to create a Hive table pointing to the same directory where Sqoop wrote the data.
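A minimal sketch of that suggestion, assuming the hive CLI is available: the DDL is written to a file (the name create_orders_seq.hql is hypothetical) and declares an external table whose LOCATION is the Sqoop target directory from the question. The ',' delimiter is chosen to match the Sqoop import command. Note that Sqoop sequence files store each record as an instance of a generated Java class, so whether Hive can actually read the rows depends on that class being loadable by Hive; this sketch does not address that.

```shell
#!/bin/sh
# Sketch only: define an external Hive table over the directory Sqoop wrote to,
# instead of moving the files with LOAD DATA.
# The DDL file name is a hypothetical choice for this example.
ddl_file="create_orders_seq.hql"

# Quoted heredoc delimiter: the DDL is written literally, with no shell expansion.
cat > "$ddl_file" <<'EOF'
-- Field delimiter matches the Sqoop import (--fields-terminated-by ',').
CREATE EXTERNAL TABLE orders_seq (
  order_id INT,
  order_date STRING,
  order_customer_id INT,
  order_status STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS SEQUENCEFILE
LOCATION '/user/cloudera/sqoop_import_seq/orders';
EOF

# Run the DDL only where the hive CLI is installed.
if command -v hive >/dev/null 2>&1; then
  hive -f "$ddl_file"
else
  echo "hive CLI not found; DDL written to $ddl_file"
fi
```

Because the table is EXTERNAL, dropping it later removes only the table definition and leaves the imported files in place.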

One side note on your scripts:

create table orders_seq(order_id int,order_date string,order_customer_id int,order_status string)  
ROW FORMAT DELIMITED  
FIELDS TERMINATED BY '|'  
STORED AS SEQUENCEFILE

Your sqoop command says --fields-terminated-by ',' but when you create the table you use FIELDS TERMINATED BY '|'. The two delimiters should match.

In my experience, the best approach, I think, is to sqoop the data as Avro, which automatically creates an Avro schema. Then you just have to create a Hive table using that schema (AvroSerDe) and the location where you stored the data from the Sqoop import.
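The Avro route could look roughly like the sketch below. Several things here are assumptions rather than part of the original answer: the target directory /user/cloudera/sqoop_import_avro/orders, the schema directory /user/cloudera/schemas, the table name orders_avro, the DDL file name, and the expectation that Sqoop leaves the generated schema as orders.avsc in the local working directory. It also assumes a Hive version that supports STORED AS AVRO.

```shell
#!/bin/sh
set -e

# Hypothetical HDFS paths and file name; adjust to your cluster.
target_dir="/user/cloudera/sqoop_import_avro/orders"
schema_dir="/user/cloudera/schemas"
ddl_file="create_orders_avro.hql"

# Hive DDL: the columns are taken from the Avro schema, so none are listed.
cat > "$ddl_file" <<EOF
CREATE EXTERNAL TABLE orders_avro
STORED AS AVRO
LOCATION '$target_dir'
TBLPROPERTIES ('avro.schema.url'='hdfs://$schema_dir/orders.avsc');
EOF

if command -v sqoop >/dev/null 2>&1 && command -v hive >/dev/null 2>&1; then
  # Same import as in the question, but written as Avro data files.
  sqoop import --connect "jdbc:mysql://quickstart.cloudera:3306/retail_db" \
    --username retail_dba --password cloudera \
    --table orders \
    --target-dir "$target_dir" \
    --as-avrodatafile

  # Assumption: the generated schema is orders.avsc in the current directory.
  hdfs dfs -mkdir -p "$schema_dir"
  hdfs dfs -put -f orders.avsc "$schema_dir/"

  hive -f "$ddl_file"
else
  echo "sqoop or hive CLI not found; DDL written to $ddl_file"
fi
```

Keeping the schema in HDFS and referencing it through avro.schema.url means the table definition does not have to repeat the column list, which is the convenience the answer is pointing at.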
