Select count distinct using Pig Latin


Question

I need help with this Pig script: it returns only a single record. I am selecting 2 columns and doing a count(distinct) on another, while also using a WHERE-like clause to find a particular description (desc).

Here is the SQL I am trying to reproduce, followed by my Pig attempt.

/*
For example in sql:
select domain, count(distinct(segment)) as segment_cnt
from table
where desc='ABC123'
group by domain
order by segment_cnt desc;
*/

A = LOAD 'myoutputfile' USING PigStorage('\u0005')
        AS (
            domain:chararray,
            segment:chararray,
            desc:chararray
            );
B = filter A by (desc=='ABC123');
C = foreach B generate domain, segment;
D = DISTINCT C;
E = group D all;
F = foreach E generate group, COUNT(D) as segment_cnt;
G = order F by segment_cnt DESC;

Recommended answer

You could GROUP on each domain and then count the number of distinct elements in each group with a nested FOREACH:

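-- C is the (domain, segment) relation built in the question
D = group C by domain;
E = foreach D {
    -- de-duplicate the segments inside each domain's bag before counting
    unique_segments = DISTINCT C.segment;
    generate group, COUNT(unique_segments) as segment_cnt;
};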
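
For context, the question's script returns one record because `group D all` places every row into a single group, so the count is taken over the whole relation rather than per domain. Putting the answer's grouping together with the question's load, filter, and ordering steps might look like the sketch below. It assumes the same input file, delimiter, and column order as in the question. The third field is named `description` instead of `desc` purely as a precaution, since `DESC` is also a Pig keyword and may not parse as a field name in every Pig version; that rename and the relation names `by_domain`, `counts`, and `sorted` are illustrative choices, not part of the original answer.

```pig
-- Sketch only: file name, delimiter, and schema are taken from the question.
A = LOAD 'myoutputfile' USING PigStorage('\u0005')
        AS (
            domain:chararray,
            segment:chararray,
            description:chararray  -- called "desc" in the question; renamed here (assumption)
            );

-- Equivalent of: where desc = 'ABC123'
B = FILTER A BY description == 'ABC123';

-- Keep only the columns needed for the count
C = FOREACH B GENERATE domain, segment;

-- Equivalent of: group by domain
by_domain = GROUP C BY domain;

-- Equivalent of: count(distinct segment), computed per domain
counts = FOREACH by_domain {
    unique_segments = DISTINCT C.segment;
    GENERATE group AS domain, COUNT(unique_segments) AS segment_cnt;
};

-- Equivalent of: order by segment_cnt desc
sorted = ORDER counts BY segment_cnt DESC;

DUMP sorted;
```

Each output tuple should then hold one domain and its distinct segment count, with one row per domain rather than a single row for the whole input.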
