Strategies to reduce DOM elements of large data sets

Problem description

I have a large dataset that I want to display using dc.js. The amount of entries exceeds the available drawing space in pixels on the screen by far. So it does not make sense to render 20k points on a 500px wide chart and also slows down the browser.

I read the Performance tweak section of the wiki and thought of some other things:

  • Aggregating groups using crossfilter (e.g. chunk the dataset in 500 groups if I have a 500px wide svg)
  • Simplifying my data using the Douglas–Peucker or Visvalingam’s algorithm
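
The first idea (one group per pixel column) can be sketched as a plain key function. This is only an illustration: `makeBucketFloor` is a hypothetical helper, not part of crossfilter or dc.js, and it assumes the key is numeric (for example a timestamp in milliseconds) with a known minimum and maximum.

```javascript
// Hypothetical helper (not part of crossfilter or dc.js): returns a function
// that maps a numeric key to the start of its bucket, so that [min, max] is
// split into `buckets` equal slices, e.g. one slice per pixel of chart width.
function makeBucketFloor(min, max, buckets) {
  const width = (max - min) / buckets;
  return function (t) {
    const raw = Math.floor((t - min) / width);
    // Clamp so that t === max lands in the last bucket, not in an extra one.
    const index = Math.min(buckets - 1, Math.max(0, raw));
    return min + index * width;
  };
}
```

A function like this could be passed to `dimension.group(...)` as the group key function, assuming the dimension returns the same numeric key.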

dc.js offers a neat rangeChart for displaying a range selection, which I want to use.

The more I zoom in with the rangeChart, the more detail I want to show. But I don't know how to get the zoom level of the chart and aggregate a group 'on the fly'. Perhaps someone has a thought about this.

I created a CodePen as an example.

Solution

This comes up a lot so I've added a focus dynamic interval example.

It's a refinement of the same techniques in the switching time intervals example, except here we determine which d3 time interval to use based on the extent of the brush in the range chart.

Unfortunately I don't have time to tune it right now, so let's iterate on this. IMO it's almost but not quite fast enough - it could sample even fewer points, but I used the built-in time intervals. When you see a jagged line in the dc line chart,

it's usually because you are displaying too many points - there should be dozens not hundreds and never thousands.

The idea is to spawn different groups for different time intervals. Here we'll define a few intervals and the threshold, in milliseconds, at which we should use that interval:

    var groups_by_min_interval = [
        {
            name: 'minutes',
            threshold: 60*60*1000,
            interval: d3.timeMinute
        }, {
            name: 'seconds',
            threshold: 60*1000,
            interval: d3.timeSecond
        }, {
            name: 'milliseconds',
            threshold: 0,
            interval: d3.timeMillisecond
        }
    ];

Again, there should be more here - since we will generate the groups dynamically and cache them, it's okay to have a bunch. (It will probably hog memory at some point, but gigabytes are OK in JS these days.)

When we need a group, we'll generate it by using the d3 interval function, which produces the floor, and then reduce total and count:

    function make_group(interval) {
        return dimension.group(interval).reduce(
            function(p, v) {
                p.count++;
                p.total += v.value;
                return p;
            },
            function(p, v) {
                p.count--;
                p.total -= v.value;
                return p;
            },
            function() {
                return {count: 0, total: 0};
            }
        );
    }

Accordingly we will tell the charts to compute the average in their valueAccessors:

    chart.valueAccessor(kv => kv.value.total / kv.value.count)
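
The reducers and the average can be checked without crossfilter or dc.js. The following is a minimal sketch, not the answer's code: the names `reduceInit`, `reduceAdd`, `reduceRemove` and `averageOf` are hypothetical, and returning 0 for an empty bin is an assumption (a plain division would give `NaN` there).

```javascript
// Initial state for one bin.
function reduceInit() {
  return {count: 0, total: 0};
}

// Called when a record enters the current filter.
function reduceAdd(p, v) {
  p.count++;
  p.total += v.value;
  return p;
}

// Called when a record leaves the current filter; it must undo reduceAdd.
function reduceRemove(p, v) {
  p.count--;
  p.total -= v.value;
  return p;
}

// Average for a {key, value} pair; an empty bin yields 0 (an assumption)
// rather than NaN from 0 / 0.
function averageOf(kv) {
  return kv.value.count > 0 ? kv.value.total / kv.value.count : 0;
}
```

Adding a set of records and then removing all of them should bring a bin back to `{count: 0, total: 0}`, which is a quick way to confirm that the remove step mirrors the add step.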

Here's the fun part: when we need a group, we'll scan this list until we find the first spec whose threshold is less than the current extent in milliseconds:

    function choose_group(extent) {
        var d = extent[1].getTime() - extent[0].getTime();
        var found = groups_by_min_interval.find(mg => mg.threshold < d);
        console.log('interval ' + d + ' is more than ' + found.threshold + ' ms; choosing ' + found.name +
                    ' for ' + found.interval.range(extent[0], extent[1]).length + ' points');
        if(!found.group)
            found.group = make_group(found.interval);
        return found.group;
    }
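
The threshold scan relies on the list being sorted from the largest threshold to the smallest, and `find` returns `undefined` when the extent is zero, because no threshold is strictly less than 0. A minimal sketch of the selection step alone, with a fallback to the finest spec, might look like this. `chooseSpec` and `exampleSpecs` are hypothetical names, and the fallback is an assumption rather than part of the original example.

```javascript
// Specs sorted by descending threshold (in milliseconds), as in the answer.
const exampleSpecs = [
  {name: 'minutes', threshold: 60 * 60 * 1000},
  {name: 'seconds', threshold: 60 * 1000},
  {name: 'milliseconds', threshold: 0}
];

// Returns the first spec whose threshold is strictly less than the extent.
// If none matches (extent of 0 ms), falls back to the last, finest spec.
function chooseSpec(specs, extentMs) {
  const found = specs.find(function (spec) {
    return spec.threshold < extentMs;
  });
  return found || specs[specs.length - 1];
}
```

With these thresholds, a brush wider than one hour selects minutes, a brush between one minute and one hour selects seconds, and anything narrower selects milliseconds.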

Hook this up to the filtered event of the range chart:

    rangeChart.on('filtered.dynamic-interval', function(_, filter) {
        chart.group(choose_group(filter || fullDomain));
    });

I've run out of time for now. Please ask any questions, and we'll refine this further. We will need custom time intervals (like a tenth of a second), and I am failing to find that example right now. There is a good way to do it.
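
One possible way to get a custom interval such as a tenth of a second, sketched under assumptions: since `dimension.group(...)` only needs a function that floors the key, a plain floor function can stand in for a d3 interval. `makeFloor` is a hypothetical helper, not the example the answer refers to. d3-time also documents `interval.every(step)`, for instance `d3.timeMillisecond.every(100)`, which may be closer to what was meant.

```javascript
// Hypothetical helper: returns a function that floors a Date to a multiple
// of stepMs milliseconds since the epoch, e.g. 100 for tenths of a second.
function makeFloor(stepMs) {
  return function (date) {
    return new Date(Math.floor(date.getTime() / stepMs) * stepMs);
  };
}
```

Note that a plain floor function has no `.range()` method, so the `console.log` call in `choose_group`, which uses `found.interval.range(...)`, would need to be adjusted for specs built this way.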

Note: I have one-upped you and increased the number of points by an order of magnitude to half a million. This may be too much for older computers, but on a 2017 computer it proves that data quantity is not the problem; DOM elements are.
