Issue with sqoop export with hive table partitioned by timestamp


Problem Description

I'm unable to sqoop export a Hive table that's partitioned by timestamp.
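The post doesn't show the exact command, but the failing export was presumably along these lines (a minimal sketch; the JDBC URL, credentials, and field delimiter are illustrative assumptions, not from the original report):

sqoop export \
  --connect jdbc:mysql://db.example.com/mydb \
  --username brandon -P \
  --table my_table \
  --export-dir '/user/hive/warehouse/my_table/day=2013-01-28 00%3A00%3A00' \
  --input-fields-terminated-by '\001'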



I have a Hive table that's partitioned by timestamp. The HDFS path it creates contains spaces, which I think is causing issues with sqoop.



fs -ls
2013-01-28 16:31 /user/hive/warehouse/my_table/day=2013-01-28 00%3A00%3A00



The error from sqoop export:

13/01/28 17:18:23 ERROR security.UserGroupInformation: PriviledgedActionException as:brandon (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/my_table/day=2012-10-29 00%3A00%3A00
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1239)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1192)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1165)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1147)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:383)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:170)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44064)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1687)

If you do:

fs -ls /user/hive/warehouse/my_table/day=2013-01-28 00%3A00%3A00
ls: `/user/hive/warehouse/my_table/day=2013-01-28': No such file or directory
ls: `00%3A00%3A00': No such file or directory

The shell splits the unquoted path at the space, so fs -ls receives two separate arguments, neither of which exists.



It works if you add quotes:

brandon@prod-namenode-new:~$ fs -ls /user/hive/warehouse/my_table/day="2013-01-28 00%3A00%3A00"
Found 114 items
-rw-r--r-- 2 brandon supergroup 4845 2013-01-28 16:30 /user/hive/warehouse/my_table/day=2013-01-28%2000%253A00%253A00/000000_0
...
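Note that the name in that listing is percent-encoded once more on top of Hive's own escaping. Decoding it step by step (a worked decoding, for illustration) recovers the original partition value:

# day=2013-01-28%2000%253A00%253A00
#   %20  -> ' '  (space)
#   %25  -> '%'  (so %253A -> %3A -> ':')
# decoded once:  day=2013-01-28 00%3A00%3A00   (the on-disk directory name)
# decoded twice: day=2013-01-28 00:00:00       (the partition value Hive stored)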

Solution

Filenames with a colon (:) are not supported in HDFS paths (there is a JIRA tracking this), but it can work by converting the colon to hex. However, when sqoop tries to read that path back, it converts the escape to a colon (:) again and therefore cannot find the path. I suggest removing the time part from your directory name and trying again. Hope this answers your question.
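A minimal sketch of that suggestion, with illustrative names throughout (my_table_by_date, the single column val, and the JDBC URL are assumptions, not from the original post): rewrite the data into a table partitioned by a date-only string, so the directory name contains neither a space nor a colon, then export that clean path.

hive -e "
CREATE TABLE my_table_by_date (val STRING) PARTITIONED BY (day STRING);
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
INSERT OVERWRITE TABLE my_table_by_date PARTITION (day)
SELECT val, to_date(day) AS day FROM my_table;
"

sqoop export \
  --connect jdbc:mysql://db.example.com/mydb \
  --username brandon -P \
  --table my_table \
  --export-dir /user/hive/warehouse/my_table_by_date/day=2013-01-28 \
  --input-fields-terminated-by '\001'

The partition directory is now /user/hive/warehouse/my_table_by_date/day=2013-01-28, which needs no quoting or escaping anywhere in the pipeline.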


