dc.js Box plot reducer using two groups


Problem description


I'm trying to produce a box plot which will show the total number of networks single devices have connected to grouped by their vendor.

Data Format:

{
    "SSID": "eduroam",
    "identifier": "Client",
    "latitude": 52.4505,
    "longitude": -1.9361,
    "mac": "dc:d9:16:##:##:##",
    "packet": "PR-REQ",
    "timestamp": "2018-07-10 12:25:26",
    "vendor": "Huawei Technologies Co.Ltd"
}

Fiddle with data https://jsfiddle.net/v4a8g2bo/

I have managed to get the number of networks a single device has connected to using the following code. The data was filtered beforehand to contain only unique networks per MAC address, so summing the counter field counts networks.

var mac = ndx.dimension(function (d) { return d["mac"]; });
var SSIDstoSingleMAC = mac.group().reduceSum(function (d) { return +d.counter; });

My problem arises when I try to pass this grouped sum into a further group that outputs an array for use in the box plot chart:

var vendor = ndx.dimension(function (d) { return d["vendor"]; });

//Used to count number of networks per device
var mac = ndx.dimension(function (d) { return d["mac"]; });
var SSIDstoSingleMAC = mac.group().reduceSum(function (d) { return +d.counter; });

//This is where things fall down
var boxplotGroup = SSIDstoSingleMAC.group().reduce(
    function (p, v) {
        let dv = v.counter;
        if (dv != null) p.push(dv);
        return p;
    },
    function (p, v) {
        let dv = v.counter;
        if (dv != null) p.splice(p.indexOf(dv), 1);
        return p;
    },
    function () {
        return [];
    }
);

var boxPlot = dc.boxPlot("#boxPlot");
boxPlot
    .width(1200)
    .height(600)
    .dimension(vendor)
    .group(boxplotGroup)
    .tickFormat(d3.format('.1f'))
    .elasticY(true)
    .elasticX(true)
;

This is the goal, e.g. Apple [7,5,10,2] = four Apple devices, where device one has connected to 7 networks, device two to 5, etc.

ATTEMPT AT HIDDEN GROUP

Gordon mentioned in the comments that two groups can't be passed recursively in crossfilter. I'm now trying to produce a hidden group that accumulates the networks per MAC address, using the following code from the dc.js GitHub wiki, but I can't get it to mesh with the boxplot reducer. Am I going in the right direction here?

https://github.com/dc-js/dc.js/wiki/FAQ#accumulate-values

var allDim = ndx.dimension(function (d) { return d; });

function accumulate_group(source_group) {
    return {
        all:function () {
            var cumulate = 0;
            return source_group.all().map(function(d) {
                cumulate += d.counter;
                return {key:d.mac, value:cumulate};
            });
        }
    };
}

var boxPlotDim = accumulate_group(allDim);

var boxPlotGroup = boxPlotDim.group().reduce(
    function(p,v) {
        p.push(v.value());
        return p;
    },
    function(p,v) {
        p.splice(p.indexOf(v.value()), 1);
        return p;
    },
    function() {
        return [];
    }
);

var boxPlot = dc.boxPlot("#boxPlot");
boxPlot
    .width(1200)
    .height(600)
    .dimension(vendor)
    .group(boxPlotGroup)
    .tickFormat(d3.format('.1f'))
    .elasticY(true)
    .elasticX(true)
;

Thanks, Adam

Solution

Ideally we'd really like to use a simple dimension over vendors here, in case we want to filter using a brush on the boxplot.

So then the question becomes: how do we reduce twice, once to get counts per MAC address, and then again to turn those counts into an array?

The first part has a standard answer: just reduce to an object instead of a value:

var vendorMacCountsGroup = vendor.group().reduce(
  function(p, v) { // add
    p[v.mac] = (p[v.mac] || 0) + v.counter;
    return p;
  },
  function(p, v) { // remove
    p[v.mac] -= v.counter;
    return p;
  },
  function() { // init
    return {}; // macs;
  }
);

I recently described this pattern in this answer, so I won't go into the details here.
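For anyone who wants to see the reducers in action without loading crossfilter, here is a minimal sketch that applies the same add/remove/init functions by hand. The rows, the short MAC strings, and the `bins` bookkeeping are made up for illustration; crossfilter does the equivalent work internally when the group is built and when filters change.

```javascript
// Reducers with the same bodies as in the answer above.
function reduceAdd(p, v) {
  p[v.mac] = (p[v.mac] || 0) + v.counter;
  return p;
}
function reduceRemove(p, v) {
  p[v.mac] -= v.counter;
  return p;
}
function reduceInitial() {
  return {};
}

// Hypothetical rows: one per unique (mac, SSID) pair, counter always 1.
var rows = [
  { vendor: "Apple", mac: "aa:01", SSID: "eduroam", counter: 1 },
  { vendor: "Apple", mac: "aa:01", SSID: "cafe", counter: 1 },
  { vendor: "Apple", mac: "aa:02", SSID: "eduroam", counter: 1 },
  { vendor: "Huawei", mac: "bb:01", SSID: "eduroam", counter: 1 }
];

// Hand-rolled stand-in for vendor.group().reduce(...): one bin per vendor.
var bins = {};
rows.forEach(function (row) {
  if (!bins[row.vendor]) bins[row.vendor] = reduceInitial();
  reduceAdd(bins[row.vendor], row);
});
console.log(JSON.stringify(bins));
// {"Apple":{"aa:01":2,"aa:02":1},"Huawei":{"bb:01":1}}

// Simulate a filter removing the third row: the count drops to 0,
// but the MAC key stays in the object.
reduceRemove(bins["Apple"], rows[2]);
console.log(JSON.stringify(bins));
// {"Apple":{"aa:01":2,"aa:02":0},"Huawei":{"bb:01":1}}
```

Note that the remove step leaves a zero entry behind rather than deleting the key, so anything that reads these bins has to cope with zeros.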

Here's the sample output: bins are vendors, and each value is an object mapping mac addresses to counts:

[
  {
    "key": "Asustek Computer Inc.",
    "value": {
      "1c:b7:2c:48": 8,
      "1c:b7:be:ef": 3
    }
  },
  {
    "key": "Huawei Technologies Co.Ltd",
    "value": {
      "dc:d9:16:3d": 14,
      "dc:da:16:3d": 2,
      "dc:d9:16:3a": 1,
      "dc:d9:16:3b": 1
    }
  },
  ...

Next, we really just want the counts and to forget the MAC addresses. JavaScript has a nice built-in function for this, Object.values. We just need to apply that to each of the object-values in our group. We'll also throw out any zeros, because a zero will only occur when a MAC address has been filtered out somewhere else.

function flatten_object_group(group) {
  return {
    all: function() {
      return group.all().map(function(kv) {
        return {
          key: kv.key,
          value: Object.values(kv.value).filter(function(v) { return v>0; })
        }; 
      });
    }
  };
}
var boxPlotGroup = flatten_object_group(vendorMacCountsGroup);

Sample output:

[
  {
    "key": "Asustek Computer Inc.",
    "value": [
      8,
      3
    ]
  },
  {
    "key": "Huawei Technologies Co.Ltd",
    "value": [
      14,
      2,
      1,
      1
    ]
  },
  ...
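The wrapper can also be tried on its own by handing it a stand-in object with an `all()` method. This is only a sketch: `fakeGroup` and its contents are hypothetical, with one count set to zero to imitate a MAC address that has been filtered out.

```javascript
// Same wrapper as in the answer above.
function flatten_object_group(group) {
  return {
    all: function() {
      return group.all().map(function(kv) {
        return {
          key: kv.key,
          value: Object.values(kv.value).filter(function(v) { return v > 0; })
        };
      });
    }
  };
}

// Hypothetical stand-in for vendorMacCountsGroup; the second Huawei
// MAC has a count of 0, as if a filter had removed all of its rows.
var fakeGroup = {
  all: function() {
    return [
      { key: "Asustek Computer Inc.", value: { "1c:b7:2c:48": 8, "1c:b7:be:ef": 3 } },
      { key: "Huawei Technologies Co.Ltd", value: { "dc:d9:16:3d": 14, "dc:da:16:3d": 0 } }
    ];
  }
};

var flattened = flatten_object_group(fakeGroup).all();
console.log(JSON.stringify(flattened));
// [{"key":"Asustek Computer Inc.","value":[8,3]},{"key":"Huawei Technologies Co.Ltd","value":[14]}]
```

Because the flattening happens inside `all()`, it is recomputed from the underlying group on every call, so the arrays follow whatever filters are currently applied.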

Your sample data only had one MAC address per vendor, so I added some bogus data, and got a decent-looking boxplot:

Fork of your fiddle.

Taking only the top ten by #MACs

As an example of how you might trim the data if there are too many boxes, here's how you would sort by number of MAC addresses, and take only the 10 "most popular" vendors:

function top_ten_by_length(group) {
  return {
    all: function() {
      return group.all().sort(function(a,b) {
        return b.value.length - a.value.length;
      }).slice(0, 10);
    }
  };
}

Compose them like this:

var boxPlotGroup = top_ten_by_length(flatten_object_group(vendorMacCountsGroup));

This is off the top of my head and untested so please edit/comment if there is some glitch.
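As a rough check of the composition, the sketch below runs both wrappers over a stand-in group. The twelve `vendor-N` bins are invented for illustration, with vendor N owning N MAC addresses, so the two smallest vendors should be dropped.

```javascript
// Both wrappers as given in the answer above.
function flatten_object_group(group) {
  return {
    all: function() {
      return group.all().map(function(kv) {
        return {
          key: kv.key,
          value: Object.values(kv.value).filter(function(v) { return v > 0; })
        };
      });
    }
  };
}
function top_ten_by_length(group) {
  return {
    all: function() {
      return group.all().sort(function(a, b) {
        return b.value.length - a.value.length;
      }).slice(0, 10);
    }
  };
}

// Hypothetical stand-in group: 12 vendors, vendor i has i MACs, each with count 1.
var fakeGroup = {
  all: function() {
    var bins = [];
    for (var i = 1; i <= 12; i++) {
      var macs = {};
      for (var j = 0; j < i; j++) macs["mac-" + i + "-" + j] = 1;
      bins.push({ key: "vendor-" + i, value: macs });
    }
    return bins;
  }
};

var topTen = top_ten_by_length(flatten_object_group(fakeGroup)).all();
console.log(topTen.length);                        // 10
console.log(topTen[0].key, topTen[0].value.length); // vendor-12 12
console.log(topTen[9].key, topTen[9].value.length); // vendor-3 3
```

One thing to keep in mind: `sort` reorders the array it is called on. That is harmless in this composition because `flatten_object_group` returns a fresh array from `map`, but if `top_ten_by_length` were applied to some other group directly, copying the array first (for example with `.slice()`) would avoid reordering the source.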
