Reading CSV in Pig when the CSV file contains quoted commas


Problem description

My data looks something like this:

asdf, asdf, "adsf,qwef", asdf 

When I read that data in Pig using

PigStorage(',')

it splits "adsf,qwef" into two fields and stores them like this:

{ "adsf } { qwef" } 

I want the quoted text to be treated as a single field.

What should I do?

I'm trying to write a Pig script to do this.

Solution

You should use a CSV-aware loader such as Piggybank's CSVExcelStorage:

data = LOAD 'my.csv' USING org.apache.pig.piggybank.storage.CSVExcelStorage() 
    AS (...);

Here ... stands for the field identifiers (your schema).

NOTE: You'll have to register the Piggybank first. Details here: https://cwiki.apache.org/confluence/display/PIG/PiggyBank
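
Putting the answer and the note together, a minimal sketch of a complete script might look like the following. The jar location and the field names f1 to f4 are assumptions made for illustration and do not come from the original post; substitute the path to your own Piggybank jar and your real schema.

```pig
-- Assumption: hypothetical location of the Piggybank jar; adjust to your installation.
REGISTER /path/to/piggybank.jar;

-- Load with the quote-aware loader instead of PigStorage(',').
-- f1..f4 are placeholder field names for the four-column sample row.
data = LOAD 'my.csv' USING org.apache.pig.piggybank.storage.CSVExcelStorage()
    AS (f1:chararray, f2:chararray, f3:chararray, f4:chararray);

-- Print the parsed tuples to check how the quoted field was handled.
DUMP data;
```

PigStorage(',') splits on every comma, including commas inside quotes. A CSV-aware loader is meant to keep a quoted value such as "adsf,qwef" together as one field. Inspect the DUMP output on your own data before relying on it, since spacing around the quotes may affect how a row is parsed.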
