How to handle newline characters in Hive?
Question
I am exporting a table from Teradata to Hive. The Teradata table has an address field that contains newline characters (\n). I first export the table from Teradata to a mounted filesystem path and then load it into Hive. The record counts of the Teradata table and the Hive table do not match, because Hive treats each embedded newline as the end of a record.
NOTE: I don't want to solve this by bringing the data in through Sqoop; I want to handle the newline characters while loading into Hive from the local path.
Recommended answer
I got this to work by creating an external table with the following options:
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001'
ESCAPED BY '\\'
STORED AS TEXTFILE;
Then I added a partition that points to the directory containing the data files (my table is partitioned), i.e.
ALTER TABLE STG_HOLD_CR_LINE_FEED ADD PARTITION (part_key='part_week53') LOCATION '/ifs/test/schema.table/staging/';
NOTE: When you create the data file, make sure '\' is used as the escape character, so that it matches the ESCAPED BY clause of the table.
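The answer does not show how the data file itself was produced. As a rough sketch only, the Python below shows one way an export could be post-processed so that every record stays on a single physical line, using the same '\001' delimiter and '\' escape character as the table definition. The function names and the sample rows are hypothetical, and writing line breaks as the two characters backslash + n (or backslash + r) is an assumption: how Hive decodes that sequence on read depends on the SerDe and Hive version, so check it against a small test file first.

```python
FIELD_SEP = "\x01"  # same byte as FIELDS TERMINATED BY '\001'
ESCAPE = "\\"       # same character as ESCAPED BY '\\'


def escape_field(value):
    """Escape one field value so it cannot break the record across lines."""
    # Escape the escape character first, so later replacements are not doubled.
    value = value.replace(ESCAPE, ESCAPE + ESCAPE)
    # Protect any delimiter byte that appears inside the data itself.
    value = value.replace(FIELD_SEP, ESCAPE + FIELD_SEP)
    # Assumption: line breaks become backslash + r / backslash + n.
    value = value.replace("\r", ESCAPE + "r")
    value = value.replace("\n", ESCAPE + "n")
    return value


def format_row(fields):
    """Join already-stringified fields into one delimited record."""
    return FIELD_SEP.join(escape_field(f) for f in fields)


def write_rows(rows, path):
    """Write one record per physical line to the data file."""
    with open(path, "w", encoding="utf-8", newline="\n") as out:
        for fields in rows:
            out.write(format_row(fields) + "\n")


if __name__ == "__main__":
    # Hypothetical sample rows: (id, address)
    sample = [["1", "12 Main St\nApt 4"], ["2", "C:\\temp"]]
    for row in sample:
        print(repr(format_row(row)))
```

With the data written this way, the number of physical lines in the file equals the number of source records, which makes the count comparison against Teradata straightforward.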