crossfilter - calculating percent of all records with a property


Question

Here is my problem:

I am using a Python Flask server that fetches JSON data from MongoDB, and there I specify which fields to import. The data arrives as JSON and is only available in that form. Is it possible to transform these fields once they have been passed through crossfilter in graphs.js? For example, I have a status attribute that can take the values "Pass", "In Progress", "On Hold" or "Fail". I want a metric that tells me the percentage of failures, so I need to do some calculations on the data. Please advise.

Sample data (in tabular form for clarity) looks like:
TrialLocation     | Subject Status
Site A            | In progress
Site A            | Pass
Site B            | In progress
Site A            | In progress
Site B            | On Hold
Site A            | Screen Failure

In this case I should get a bar chart with the site name on the x axis and the failure percentage on the y axis, which would be 25% for Site A and 0% for Site B.

So I first created a chart that gives me the count of subjects per site.

var siteName = ndx.dimension(function(d) { return d["TrialLocation"];});
var numSubjectsBySite = siteName.group();
var siteLevelChart = dc.barChart("#site-level-count", "subjectView");

And finally the chart:

siteLevelChart
 .width(2000)
 .height(200)
 .transitionDuration(1000)
 .dimension(siteName)
 .group(numSubjectsBySite)
 .ordering(function(d){return d.value;})

So I thought I would count the rows with SubjectStatus = "Screen Failure" and divide that by the total number of rows per site, which in this case is the "numSubjectsBySite" group. But when I introduced this code:

var countScreenFailures = ndx.dimension(function(d){ return d["SubjectStatus"];});
 countScreenFailures.filter("Screen Failure");

my bar chart only shows the rows where SubjectStatus = "Screen Failure".

How can I calculate the screen failure rate and then use it in the chart? Please help me out.

Thank you very much. Anmol

Recommended answer

You'll need to build custom grouping/reduce functions to track the count of each status as well as the total count. Then you can divide in the chart to calculate the percentage. If you are interested in using Reductio, you can probably do the following:

var reducer = reductio().count(true);

// Do this as many times as you need for different status counts. Each
// call of reducer.value will add a new property to your groups where
// you can store the count for that status.
reducer.value("ScreenFailure").sum(
  function(d) {
    // This counts records with SubjectStatus = "Screen Failure"
    return d["SubjectStatus"] === "Screen Failure" ? 1 : 0;
  });

// Build the group with the Reductio reducers.
var numSubjectsBySite = reducer(siteName.group());

// In your dc.js chart, calculate the % using a value accessor.
siteLevelChart
 .width(2000)
 .height(200)
 .transitionDuration(1000)
 .dimension(siteName)
 .group(numSubjectsBySite)
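 // Guard against empty groups (count 0 after filtering) to avoid NaN.
 .valueAccessor(function(p) { return p.value.count ? p.value.ScreenFailure.sum / p.value.count : 0; })
 // The group value is now an object, so order by its count property.
 .ordering(function(d){ return d.value.count; });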
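
If you would rather not pull in Reductio, the same bookkeeping can be sketched with plain reduce functions in the (add, remove, initial) shape that crossfilter's group.reduce() takes. The snippet below is only an illustration under stated assumptions: the helper names (isScreenFailure, reduceAdd, reduceRemove, reduceInitial, failurePercent, groupBySite) are made up for this example, and groupBySite is a small stand-in for the crossfilter group so the snippet runs on its own with the sample data from the question.

```javascript
// Sample rows from the question (field names follow the question's code).
var records = [
  { TrialLocation: "Site A", SubjectStatus: "In progress" },
  { TrialLocation: "Site A", SubjectStatus: "Pass" },
  { TrialLocation: "Site B", SubjectStatus: "In progress" },
  { TrialLocation: "Site A", SubjectStatus: "In progress" },
  { TrialLocation: "Site B", SubjectStatus: "On Hold" },
  { TrialLocation: "Site A", SubjectStatus: "Screen Failure" }
];

// Hypothetical helper: true for the status we want to measure.
function isScreenFailure(d) {
  return d.SubjectStatus === "Screen Failure";
}

// Called when a record enters a group: bump the total and, if it
// matches, the failure count.
function reduceAdd(p, d) {
  p.count += 1;
  if (isScreenFailure(d)) { p.screenFailures += 1; }
  return p;
}

// Called when a record is filtered out of a group: undo reduceAdd.
function reduceRemove(p, d) {
  p.count -= 1;
  if (isScreenFailure(d)) { p.screenFailures -= 1; }
  return p;
}

// Starting value for every group.
function reduceInitial() {
  return { count: 0, screenFailures: 0 };
}

// Percentage of failures for one group value; 0 for an empty group.
function failurePercent(p) {
  return p.count > 0 ? (100 * p.screenFailures) / p.count : 0;
}

// Stand-in for the crossfilter group (an assumption made so that this
// sketch runs without the library): group rows by site with the reducers.
function groupBySite(rows) {
  var groups = {};
  rows.forEach(function (d) {
    var key = d.TrialLocation;
    if (!groups[key]) { groups[key] = reduceInitial(); }
    reduceAdd(groups[key], d);
  });
  return groups;
}

var bySite = groupBySite(records);
Object.keys(bySite).forEach(function (site) {
  console.log(site + ": " + failurePercent(bySite[site]) + "%");
});
// prints "Site A: 25%" then "Site B: 0%"
```

With crossfilter itself, the three reducers would be passed as siteName.group().reduce(reduceAdd, reduceRemove, reduceInitial), and the chart's value accessor would become function(p) { return failurePercent(p.value); }. Keeping the division in the accessor, rather than storing a percentage in the group, means the numbers stay correct as other filters add and remove records.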
