Handling newline character in Hive

Views: 96 Published: 2021/12/15 19:28:19 hadoop hive

Problem description

I have created a table in Hive as follows:

Create table(id int, Description String)

My data looks something like this:

 
1|This will return corrupt data since there is a ',' in the first string.
     some text
     Change the data  
2|There is prob in reading data 
    sometext

After the data is loaded into Hive, the Description column cannot be read correctly because the default line terminator is the newline character (\n), so Hive displays a NULL value. Can anyone suggest how to handle the newlines before loading the data into Hive?

Solution

I know this question is old, but you have a couple of options. You can't control this with FIELDS TERMINATED BY, because that clause only controls what terminates fields, not records. Records in Hive text tables are hard-coded to be terminated by the newline character (a LINES TERMINATED BY clause exists, but delimiters other than the newline are not implemented).

1. Write a custom InputFormat that uses a RecordReader that understands non-newline delimited records. Look at the code for LineReader/LineRecordReader and TextInputFormat.
2. Use a format other than text/ASCII, like Parquet. I would recommend this regardless, as text is probably the worst format you can store data in anyway.
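
Since the question asks about handling the newlines before loading, a preprocessing pass is another way to get one record per line. The following is only a minimal sketch, not part of the original answer. It assumes that every real record starts with an integer id followed by '|', and that any other non-empty line is a continuation of the previous Description. The function name merge_continuation_lines and the RECORD_START pattern are hypothetical names chosen for illustration.

```python
import re

# Assumption: a real record always begins with an integer id and a '|'.
RECORD_START = re.compile(r"^\d+\|")


def merge_continuation_lines(lines):
    """Yield one logical record per item.

    Lines that do not look like the start of a record are folded into
    the previous record, with the embedded newline replaced by a space.
    Lines that appear before the first record start are dropped.
    """
    current = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if RECORD_START.match(line):
            if current is not None:
                yield current
            current = line.strip()
        elif current is not None and line.strip():
            # Continuation of the Description field.
            current += " " + line.strip()
    if current is not None:
        yield current


# Sample data taken from the question.
sample = [
    "1|This will return corrupt data since there is a ',' in the first string.\n",
    "     some text\n",
    "     Change the data  \n",
    "2|There is prob in reading data \n",
    "    sometext\n",
]

records = list(merge_continuation_lines(sample))
for record in records:
    print(record)
```

With the sample above, each of the two records ends up on a single line, so a newline-terminated text table can read it. The heuristic breaks if a continuation line itself starts with digits followed by '|'; in that case a custom RecordReader or a non-text format such as Parquet is the more robust choice.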
