Empty String is not treated as null in Hive
Question
My understanding of the following statement is that if a blank or empty string is inserted into a Hive column, it will be treated as null.
TBLPROPERTIES('serialization.null.format'='')
To test this, I created a table and inserted '' into field3. When I query for nulls on field3, no rows match that criterion.
Is my understanding that a blank string becomes null correct?
CREATE TABLE CDR
(
field1 string,
field2 string,
field3 string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
TBLPROPERTIES('serialization.null.format'='');
insert overwrite table emmtest.cdr select field1,field2,'' from emmtest.cdr_non_orc;
select * from emmtest.cdr where field3 is null;
The last statement returned no rows, but I expected all rows to be returned, since field3 contains an empty string.
Recommended answer
TBLPROPERTIES('serialization.null.format'='')
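means the following:
- When you query the table, empty fields in the data files are treated as NULL
- When you insert rows into the table, NULL values are written to the data files as empty fields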
You are doing something else -
You are inserting an empty string into a table from a query.
It is treated "as is" - an empty string.
Bash
hdfs dfs -mkdir /user/hive/warehouse/mytable
echo Hello,,World | hdfs dfs -put - /user/hive/warehouse/mytable/data.txt
Hive
create table mytable (s1 string,s2 string,s3 string)
row format delimited
fields terminated by ','
;
hive> select * from mytable;
OK
s1 s2 s3
Hello World
hive> alter table mytable set tblproperties ('serialization.null.format'='');
OK
hive> select * from mytable;
OK
s1 s2 s3
Hello NULL World
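If the goal is for field3 to come back as NULL, one possible approach is to write an explicit NULL instead of '' in the insert, or to match both values at query time. The following is only a sketch that reuses the table and column names from the question; it has not been run against the asker's tables, and the right choice may depend on the table's storage format.

```sql
-- Sketch only: table and column names are taken from the question.

-- Option 1: insert a real NULL rather than an empty string.
INSERT OVERWRITE TABLE emmtest.cdr
SELECT field1,
       field2,
       CAST(NULL AS STRING)   -- explicit NULL instead of ''
FROM emmtest.cdr_non_orc;

-- Option 2: leave the data as is and match both NULL and '' when querying.
SELECT *
FROM emmtest.cdr
WHERE field3 IS NULL
   OR field3 = '';
```

Option 1 changes what is stored, so field3 is null would then match. Option 2 leaves the stored data untouched and only widens the filter.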