Select count distinct using Pig Latin

Question

I need help with this Pig script. It returns only a single record. I am selecting two columns and doing a count(distinct) on a third, while also using a where clause to match a particular description (desc).

Here is the SQL I am trying to express in Pig, followed by my attempt.

/*
For example in sql:
select domain, count(distinct(segment)) as segment_cnt
from table
where desc='ABC123'
group by domain
order by segment_cnt desc;
*/

A = LOAD 'myoutputfile' USING PigStorage('\u0005')
    AS (
        domain:chararray,
        segment:chararray,
        desc:chararray
    );
B = filter A by (desc=='ABC123');
C = foreach B generate domain, segment;
D = DISTINCT C;
E = group D all;
F = foreach E generate group, COUNT(D) as segment_cnt;
G = order F by segment_cnt DESC;

Recommended answer

You could GROUP by domain and then count the number of distinct elements in each group with a nested FOREACH:

D = group C by domain;
E = foreach D { 
    unique_segments = DISTINCT C.segment;
    generate group, COUNT(unique_segments) as segment_cnt;
};
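
Putting the pieces together, a minimal sketch of the whole script might look like the following. The original version returned a single record because GROUP ... ALL places every row into one group, so COUNT runs once over the entire relation; grouping by domain yields one row per domain instead. Two things here are assumptions rather than part of the answer above: the field is renamed from desc to description, since DESC appears in Pig's list of reserved keywords and may not parse as a field name in every version, and the relation name F and the final DUMP are added only for illustration.

```pig
-- Load the three columns; 'description' replaces 'desc' (assumed rename, see note above).
A = LOAD 'myoutputfile' USING PigStorage('\u0005')
    AS (
        domain:chararray,
        segment:chararray,
        description:chararray
    );

-- Equivalent of: where desc = 'ABC123'
B = FILTER A BY description == 'ABC123';

-- Keep only the columns needed for the count.
C = FOREACH B GENERATE domain, segment;

-- One group per domain, instead of a single group for everything.
D = GROUP C BY domain;

-- Nested FOREACH: de-duplicate segments inside each group, then count them.
E = FOREACH D {
    unique_segments = DISTINCT C.segment;
    GENERATE group AS domain, COUNT(unique_segments) AS segment_cnt;
};

-- Equivalent of: order by segment_cnt desc
F = ORDER E BY segment_cnt DESC;

DUMP F;
```

COUNT skips tuples whose first field is null, which lines up with how SQL's count(distinct ...) ignores NULLs. If a LIKE-style pattern is needed rather than an exact match, Pig's MATCHES operator takes a Java regular expression that must match the whole string, for example description MATCHES '.*ABC123.*'.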
