"Scalars can only be used with projection" error in Pig


Problem description

I am getting the error "scalars can only be used with projection" when using FOREACH. How can I resolve this error, and how can I use LIMIT within FOREACH? Thanks in advance for any suggestions.

Edit (Tichdroma): Copied code from comment

A = LOAD 'part-r-00000';
G = Group A by ($0,$2 );
Y = foreach G generate FLATTEN(group), FLATTEN($1);
sorted = order Y by $0 ASC, $1 DESC;
X = foreach Y {
  lim = LIMIT sorted 3;
  generate lim;
};
Dump x;

Solution

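LIMIT is available as a nested operation (nested_op) inside FOREACH as of Pig 0.9.

The error most likely comes from referencing the top-level relation sorted inside the nested block of a FOREACH over Y: Pig then treats sorted as a scalar, and a scalar can only be used with a projection. Inside a nested FOREACH, ORDER and LIMIT should operate on the bag that belongs to the current group.

If you want the top N elements of each group, iterate over the groups and sort and limit each group's bag individually: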

A = LOAD 'part-r-00000';
G = GROUP A by ($0, $2);
X = FOREACH G {
  sorted = ORDER A by $0 ASC, $1 DESC;
  lim = LIMIT sorted 3;
  GENERATE lim;
};
DUMP X;

Note that the built-in TOP function can be efficient when you only need to rank by a single column of comparable values (which is not the case here).
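
As a variation on the script above, here is a minimal sketch that names the columns and flattens the limited bag so that each output row is a single tuple. The field names key, score, and category and their types are assumptions made purely for illustration, since the schema of part-r-00000 is not given in the question. It also assumes tab-delimited input (the default loader) and Pig 0.9 or later for the nested LIMIT.

```pig
-- Hypothetical schema: the real column names and types are not known.
A = LOAD 'part-r-00000' AS (key:chararray, score:int, category:chararray);

-- Same grouping as GROUP A BY ($0, $2) in the answer above.
G = GROUP A BY (key, category);

X = FOREACH G {
  -- A here is the bag of tuples belonging to the current group.
  sorted = ORDER A BY key ASC, score DESC;
  -- Keep at most three tuples per group.
  top3 = LIMIT sorted 3;
  -- FLATTEN turns the bag into one output row per tuple.
  GENERATE FLATTEN(top3);
};

DUMP X;
```

Because key is part of the grouping key, it is constant within each group, so in this sketch the effective ordering inside a group is by score, descending.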
