Select count distinct using Pig Latin
Question
I need help with this Pig script. It returns only a single record. I am selecting two columns and doing a count(distinct) on another, while also filtering with a where-style clause to match a particular description (desc).
Here is the SQL I am trying to express in Pig, followed by my attempt.
/*
For example in sql:
select domain, count(distinct(segment)) as segment_cnt
from table
where desc='ABC123'
group by domain
order by segment_cnt desc;
*/
A = LOAD 'myoutputfile' USING PigStorage('\u0005')
AS (
domain:chararray,
segment:chararray,
desc:chararray
);
B = filter A by (desc=='ABC123');
C = foreach B generate domain, segment;
D = DISTINCT C;
E = group D all;
F = foreach E generate group, COUNT(D) as segment_cnt;
G = order F by segment_cnt DESC;
Recommended answer
The script returns a single record because GROUP D ALL puts every tuple into one group, so COUNT runs once over the whole relation. Instead, GROUP by domain and then count the number of distinct segments in each group with a nested FOREACH:
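-- Group the (domain, segment) pairs from C by domain rather than with GROUP ... ALL.
D = group C by domain;
-- For each domain, de-duplicate its segments and count what remains.
E = foreach D {
    unique_segments = DISTINCT C.segment;
    generate group, COUNT(unique_segments) as segment_cnt;
};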
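Putting the pieces together, the whole query might be sketched as below. This is only an illustration under a few assumptions: the input path, delimiter, and schema are copied from the question; the relation names are made up for readability; and the third field is renamed to descr as a precaution, since desc is also the keyword used by ORDER ... DESC.

```pig
-- Load the raw records; path, delimiter, and field types come from the question.
-- The third field is called descr here (an assumption) to avoid the DESC keyword.
records = LOAD 'myoutputfile' USING PigStorage('\u0005')
          AS (domain:chararray, segment:chararray, descr:chararray);

-- Equivalent of: WHERE desc = 'ABC123'
matching = FILTER records BY descr == 'ABC123';

-- Keep only the columns the aggregation needs.
domain_segments = FOREACH matching GENERATE domain, segment;

-- Equivalent of: GROUP BY domain
grouped = GROUP domain_segments BY domain;

-- Equivalent of: COUNT(DISTINCT segment), evaluated per domain
segment_counts = FOREACH grouped {
    unique_segments = DISTINCT domain_segments.segment;
    GENERATE group AS domain, COUNT(unique_segments) AS segment_cnt;
};

-- Equivalent of: ORDER BY segment_cnt DESC
ordered = ORDER segment_counts BY segment_cnt DESC;

DUMP ordered;
```

Each output tuple holds one domain and its distinct-segment count. COUNT skips null values, which matches how count(distinct ...) treats NULL in SQL.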