Pig: loading a data file using an external schema file


Question

I have a data file and a corresponding schema file stored in separate locations. I would like to load the data using the schema in the schema-file. I tried using

A= LOAD '<file path>' USING PigStorage('\u0001') as '<schema-file path>' 

but get an error.

What is the syntax for correctly loading the file?

The schema file format is something like:

data1 - complex - - - - format - -
data1 event_type - - - - - long - "ends '\001'"
data1 event_id - - - - - varchar(50) - "ends '\001'"
data1 name_format - - - - - varchar(10) - "ends newline"

Answer

The AS clause is for specifying the schema directly, not the path to a schema file. The schema goes in parentheses rather than in a quoted string:

A = LOAD '<file path>' USING PigStorage('\u0001') AS (type:long, id:chararray, nameformat:chararray);

Alternatively, a file named .pig_schema that contains the schema and sits in your input directory could work as well. I have never tried that, though. It must be a JSON file with the following syntax:

{"fields":[
        {"name":"type","type":15,"description":"Fu","schema":null},
        {"name":"id","type":55,"description":"Bar","schema":null},
        {"name":"nameFormat","type":55,"description":"Xu","schema":null}
    ],"version":0,"sortKeys":[],"sortKeyOrders":[]}

This file is also generated if you specify the -schema option when storing with PigStorage.
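
If the schema has to be produced outside Pig, a JSON file of the shape shown above could be generated with a small script. The following is a minimal Python sketch, not part of the original answer: the helper name build_pig_schema and the PIG_TYPE_CODES table are hypothetical, the numeric codes are assumed to match the constants in Pig's org.apache.pig.data.DataType (15 for long, 55 for chararray), and the description text is arbitrary.

```python
import json

# Numeric type codes, assumed to match org.apache.pig.data.DataType.
PIG_TYPE_CODES = {
    "int": 10,
    "long": 15,
    "float": 20,
    "double": 25,
    "chararray": 55,
}


def build_pig_schema(fields):
    """Return .pig_schema JSON text for a list of (name, pig_type) pairs."""
    return json.dumps({
        "fields": [
            {
                "name": name,
                "type": PIG_TYPE_CODES[pig_type],
                "description": "autogenerated",  # free text, arbitrary here
                "schema": None,  # serialized as null for non-nested fields
            }
            for name, pig_type in fields
        ],
        "version": 0,
        "sortKeys": [],
        "sortKeyOrders": [],
    })


if __name__ == "__main__":
    # Field names and types taken from the AS clause in the answer.
    print(build_pig_schema([
        ("type", "long"),
        ("id", "chararray"),
        ("nameformat", "chararray"),
    ]))
```

The printed text would be saved as .pig_schema in the input directory. Whether PigStorage then picks it up on a LOAD without an AS clause depends on the Pig version, so it is worth checking with DESCRIBE.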
