Thorn character delimiter is not recognized in Hive


Problem description

As discussed in the post "Using the Icelandic Thorn character as a delimiter in Hive", I am trying to use the thorn character (þ) as a field delimiter, but Hive does not recognize it.

Sample table

CREATE EXTERNAL TABLE IF NOT EXISTS zzzzz_raw (
    spot_id INT,
    activity_type_id INT,
    activity_type STRING,
    activity_id INT,
    activity_sub_type STRING,
    report_name STRING,
    tag_method_id INT
)
PARTITIONED BY (dt DATE)
ROW FORMAT DELIMITED
    FIELDS TERMINATED BY '\-2'
    LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/raw/data/networkmatchtablesactivity/activity_cat';

Output

select * from activity_cat_raw limit 1;

4552126þ805759þeaasv101þ2275868þbfeaac01þBF_EA Access_Info Pageþ2       NULL    NULL    NULL    NULL    NULL    NULL    2015-03-24

Am I missing something?

Solution

I found the answer. Instead of '\-2' (the thorn delimiter), I used '\-61' as the delimiter and then a substring to remove the leftover symbol, something like below:

CREATE EXTERNAL TABLE IF NOT EXISTS SSSSSS (
    spot_id STRING,
    activity_type_id STRING,
    activity_type STRING,
    activity_id STRING,
    activity_sub_type STRING,
    report_name STRING,
    tag_method_id STRING
)
PARTITIONED BY (dt STRING)
ROW FORMAT DELIMITED
    FIELDS TERMINATED BY '\-61'
    LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION 'SSSSSS';
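A likely reason '\-61' works where '\-2' does not is the file's encoding. The post does not state the encoding, so the sketch below assumes the data is UTF-8 and that the '\-N' delimiter values are signed byte values, as the numbers in the post suggest. The helper name signed_bytes is hypothetical and only serves the illustration.

```python
def signed_bytes(data: bytes) -> list:
    """Return each byte as a signed 8-bit value, the form of the -2 and -61 delimiter values."""
    return [b - 256 if b > 127 else b for b in data]

THORN = "\u00fe"  # the thorn character, þ

latin1 = THORN.encode("latin-1")  # a single byte: 0xFE
utf8 = THORN.encode("utf-8")      # two bytes: 0xC3 0xBE

print(signed_bytes(latin1))  # [-2]
print(signed_bytes(utf8))    # [-61, -66]
```

Under that assumption, the byte -2 never occurs in the file, while -61 is the first byte of every þ. Splitting on it leaves the second byte (-66) at the start of each following field, which is the extra symbol the substring removes.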

Then use substr to strip the extra leading symbol from each field after the first:

INSERT OVERWRITE TABLE vvvvvv PARTITION (dt)
SELECT spot_id,
       substr(activity_type_id, 2),
       dt
FROM SSSSSS;
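The effect of the two steps can be reproduced outside Hive on the sample row from the question. This is only a byte-level sketch, assuming the file is UTF-8 encoded; split_like_hive and strip_leading_byte are hypothetical helpers that imitate a one-byte field delimiter and substr(col, 2), not Hive functions.

```python
# Sample row from the question, encoded as the file is assumed to be stored (UTF-8).
row = "4552126þ805759þeaasv101þ2275868þbfeaac01þBF_EA Access_Info Pageþ2".encode("utf-8")

def split_like_hive(line: bytes, delimiter: int) -> list:
    """Split on one byte, given as a signed value such as -2 or -61."""
    return line.split(bytes([delimiter & 0xFF]))

def strip_leading_byte(fields: list) -> list:
    """Imitate substr(col, 2) on every field after the first."""
    return [fields[0]] + [f[1:] for f in fields[1:]]

# With -2 the delimiter byte never occurs, so the whole row stays in one field.
unsplit = split_like_hive(row, -2)

# With -61 the row splits into seven fields, each later one prefixed by a stray byte.
raw_fields = split_like_hive(row, -61)
clean_fields = [f.decode("utf-8") for f in strip_leading_byte(raw_fields)]

print(len(unsplit), len(raw_fields))
print(clean_fields)
```

The single unsplit field matches the output shown in the question, where the entire line lands in the first column and the remaining columns are NULL.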

Hope it helps.
