Empty String is not treated as null in Hive
Question
My understanding of the following statement is that if a blank or empty string is inserted into a Hive column, it will be treated as null.
TBLPROPERTIES('serialization.null.format'='')
To test this, I created a table and inserted '' into field3. When I query for nulls on field3, no rows match that condition.
Is my understanding that a blank string is turned into null correct?
CREATE TABLE CDR
(
field1 string,
field2 string,
field3 string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
TBLPROPERTIES('serialization.null.format'='');
insert overwrite table emmtest.cdr select field1,field2,'' from emmtest.cdr_non_orc;
select * from emmtest.cdr where field3 is null;
The last statement returns no rows, but I expected all rows to be returned, since field3 contains an empty string.
Recommended answer
TBLPROPERTIES('serialization.null.format'='')
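means the following:
- When the table is queried, empty fields in the data files are treated as NULL
- When rows are inserted into the table, NULL values are written to the data files as empty fields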
You are doing something else:
You are inserting an empty string into the table from a query.
It is treated "as is", as an empty string.
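If the goal is simply for the query to pick up those rows, one possible approach is sketched below. It is not part of the original answer. It reuses the table and column names from the question (emmtest.cdr, emmtest.cdr_non_orc, field3) and assumes they exist as described there. The first statement matches empty strings explicitly alongside NULLs; the second inserts a real NULL instead of '', so that the value is NULL regardless of how the table serializes it.

```sql
-- Sketch, assuming the tables from the question exist as described.

-- Option 1: match both real NULLs and empty strings in field3.
SELECT *
FROM emmtest.cdr
WHERE field3 IS NULL OR field3 = '';

-- Option 2: insert a typed NULL rather than an empty string,
-- so that "field3 IS NULL" matches the rows afterwards.
INSERT OVERWRITE TABLE emmtest.cdr
SELECT field1, field2, CAST(NULL AS STRING)
FROM emmtest.cdr_non_orc;
```

Option 1 leaves the stored data untouched, which may be preferable when other jobs read the same table.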
bash
hdfs dfs -mkdir /user/hive/warehouse/mytable
echo Hello,,World | hdfs dfs -put - /user/hive/warehouse/mytable/data.txt
hive
create table mytable (s1 string,s2 string,s3 string)
row format delimited
fields terminated by ','
;
hive> select * from mytable;
OK
s1      s2      s3
Hello           World
hive> alter table mytable set tblproperties ('serialization.null.format'='');
OK
hive> select * from mytable;
OK
s1      s2      s3
Hello   NULL    World