Reading a CSV file in Pig when the file contains quoted commas
Problem description
My data looks something like this:
asdf, asdf, "adsf,qwef", asdf
When I read that data in Pig using
PigStorage(',')
it splits "adsf,qwef" into two fields and stores them like this:
{ "adsf } { qwef" }
I want the quoted value to be treated as a single field.
What should I do?
I'm trying to write a Pig script to do this.
Solution
You should use a quote-aware CSV loader from Piggybank, such as CSVExcelStorage:
data = LOAD 'my.csv' USING org.apache.pig.piggybank.storage.CSVExcelStorage() AS (...);
where ... stands for the field identifiers of your schema.
NOTE: You'll have to register the Piggybank first. Details here: https://cwiki.apache.org/confluence/display/PIG/PiggyBank
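Putting the two steps together, a minimal sketch of the whole script might look like the following. The JAR path and the field names f1 to f4 are assumptions made up for illustration (the sample row has four columns); substitute your own installation path and schema.

```pig
-- Assumption: adjust this path to wherever piggybank.jar lives on your system.
REGISTER /path/to/piggybank.jar;

-- Hypothetical field names f1..f4, one per column of the sample row.
data = LOAD 'my.csv'
       USING org.apache.pig.piggybank.storage.CSVExcelStorage()
       AS (f1:chararray, f2:chararray, f3:chararray, f4:chararray);

-- Print the parsed tuples to check that the quoted value stays in one field.
DUMP data;
```

Running DUMP on a small sample file is a quick way to confirm that the quoted value is no longer split before wiring the relation into the rest of the script.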