将运行总计添加到PowerQuery结果中 [英] Add a running total to a PowerQuery result

查看:49
本文介绍了将运行总计添加到PowerQuery结果中的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试创建特定星期(包括本周在内的最后五个星期)中发生的日期的计数,以及该日期的总运行计数,其中包括列出的五周持续时间之前的记录.
我可以使用标准Excel公式轻松完成此操作,但是我正在尝试学习PowerQuery和 M 语言.

I am trying to create a count of dates that occur during particular weeks (the last five weeks including this week) and a total running count for dates which would include records prior to the listed five week duration.
I could do this fairly easily using standard Excel formula, but I'm trying to learn PowerQuery and the M language.

我的第一个查询从原始源数据返回相关行(另一个查询从Excel表中提取所有内容并为每一列强制使用正确的数据类型).
我要计算的日期包含在 A#(lf)Turnbacks 部分.
该查询称为 Triage_Turnbacks .

My first query returns the relevant rows from the original source data (another query that pulls everything from the Excel table and forces the correct data type for each column).
The dates I'm trying to count are contained in the Section A#(lf)Turnbacks.
The query is called Triage_Turnbacks.

let
    Source = #"Source_Data",
    #"Filter_Rows" = Table.SelectRows(Source, each ([#"Cancelled?"] = null) and ([#"Section A#(lf)Turnbacks"] <> null)),
    #"Removed_Columns" = Table.SelectColumns(Filter_Rows,{"VENDOR CODE", "Supplier Lookup", "Section A#(lf)Turnbacks"}),
    #"Duplicate_Date_Column" = Table.DuplicateColumn(Removed_Columns, "Section A#(lf)Turnbacks", "Start_Of_Week"),
    #"Start_of_Week" = Table.TransformColumns(#"Duplicate_Date_Column",{{"Start_Of_Week", Date.StartOfWeek, type date}})
in
    Start_of_Week   

| VENDOR CODE | Supplier Lookup | Section ATurnbacks | Start_Of_Week |
|-------------|-----------------|--------------------|---------------|
| A1          | Supplier A      | 08/04/2019         | 08/04/2019    |
| A1          | Supplier A      | 08/04/2019         | 08/04/2019    |
| A2          | Supplier B      | 16/04/2019         | 15/04/2019    |
| A3          | Supplier C      | 30/05/2019         | 27/05/2019    |
| A10         | Supplier D      | 24/04/2019         | 22/04/2019    |
| A5          | Supplier E      | 18/06/2019         | 17/06/2019    |
| A1          | Supplier A      | 17/07/2019         | 15/07/2019    |
| A1          | Supplier A      | 24/06/2019         | 24/06/2019    |
| A2          | Supplier B      | 11/06/2019         | 10/06/2019    |
| A3          | Supplier C      | 03/07/2019         | 01/07/2019    |
| A10         | Supplier D      | 02/07/2019         | 01/07/2019    |
| A5          | Supplier E      | 17/07/2019         | 15/07/2019    |
| A1          | Supplier A      | 17/07/2019         | 15/07/2019    |
| A1          | Supplier A      | 05/08/2019         | 05/08/2019    |
| A2          | Supplier B      | 12/07/2019         | 08/07/2019    |
| A1          | Supplier A      | 05/08/2019         | 05/08/2019    |
| A1          | Supplier A      | 29/07/2019         | 29/07/2019    |
| A2          | Supplier B      | 12/06/2019         | 10/06/2019    |
| A3          | Supplier C      | 21/02/2019         | 18/02/2019    |
| A10         | Supplier D      | 23/04/2019         | 22/04/2019    |
| A5          | Supplier E      | 14/06/2019         | 10/06/2019    |
| A10         | Supplier D      | 08/02/2019         | 04/02/2019    |

第二个查询返回前五个星期的星期一和星期日日期.
该查询称为 5_Week_Calendar .

A second query returns the previous five week Monday and Sunday dates.
The query is called 5_Week_Calendar.

    let
        Source = Table.FromList({-5..0}, each{_}),
        AddedStartOfWeek = Table.AddColumn(Source, "StartOfWeek", each Date.StartOfWeek(Date.AddDays(Date.From(DateTime.LocalNow()),[Column1]*7)), type date),
        AddedEndOfWeek = Table.AddColumn(AddedStartOfWeek, "EndOfWeek", each Date.EndOfWeek([StartOfWeek]), type date),
        RemovedColumn = Table.RemoveColumns(AddedEndOfWeek,{"Column1"})
    in RemovedColumn  

| StartOfWeek | EndOfWeek  |
|-------------|------------|
| 01/07/2019  | 07/07/2019 |
| 08/07/2019  | 14/07/2019 |
| 15/07/2019  | 21/07/2019 |
| 22/07/2019  | 28/07/2019 |
| 29/07/2019  | 04/08/2019 |
| 05/08/2019  | 11/08/2019 |

我的第三个查询返回每周计数(即 Triage_Turnbacks.Start_Of_Week 中的记录与 5_Week_Calendar.StartOfWeek 相匹配.

My third query returns the count per week (i.e. which records in Triage_Turnbacks.Start_Of_Week matches 5_Week_Calendar.StartOfWeek.

let
    Source = Table.NestedJoin(#"5_Week_Calendar", {"StartOfWeek"}, Triage_Turnbacks, {"Start_Of_Week"}, "Triage_Turnbacks", JoinKind.LeftOuter),
    #"Removed Columns" = Table.RemoveColumns(Source,{"EndOfWeek"}),
    #"Aggregated Triage_Turnbacks" = Table.AggregateTableColumn(#"Removed Columns", "Triage_Turnbacks", {{"Start_Of_Week", List.Count, "Total Turnbacks"}})
in
    #"Aggregated Triage_Turnbacks"  

    | StartOfWeek | Total Turnbacks |
    |-------------|-----------------|
    | 01/07/2019  | 2               |
    | 08/07/2019  | 1               |
    | 15/07/2019  | 3               |
    | 22/07/2019  | 1               |
    | 29/07/2019  | 1               |
    | 05/08/2019  | 2               |

问题

我如何添加包含记录的最初5周日期01/07/2019之前的运行总计?

The problem

How do I add the running total which includes records before the initial 5 week date of 01/07/2019?

| StartOfWeek | Total Turnbacks | Running Total |
|-------------|-----------------|---------------|
| 01/07/2019  | 2               | 15            |
| 08/07/2019  | 1               | 16            |
| 15/07/2019  | 3               | 19            |
| 22/07/2019  | 1               | 19            |
| 29/07/2019  | 1               | 20            |
| 05/08/2019  | 2               | 22            |

推荐答案

运行总计在概念上很简单,但是据我所知, M 的标准库目前不提供类似的功能 List.CumulativeSum Table.CumulativeSum .这使事情有些尴尬,可能需要自定义功能.

A running total is simple in concept but, as far as I know, M's standard library does not currently provide anything like List.CumulativeSum or Table.CumulativeSum. This makes things a bit awkward and a custom function will likely be needed.

总的运行量可以通过以下方式实现:

A running total can be achieved with something like:

ListCumulativeSum = (listOfNumbers as list) as list => List.Accumulate(listOfNumbers, {}, (listState, currentItem) => listState & {List.Last(listState, 0) + currentItem}),

给定一个数字列表,它返回一个数字列表.

which, given a list of numbers, returns a list of numbers.

表的等效项可能是以下功能.(我敢肯定有很多方法可以做到这一点;比其他方法更具性能/可扩展性.)

The equivalent for tables might be the function below. (I am sure there are many ways of doing this; some more performant/scalable than others.)

TableCumulativeSum = (someTable as table, columnToSum as text, newColumnName as text) =>
    let
        runningTotal = ListCumulativeSum(Table.Column(someTable, columnToSum)),
        rightTable = Table.FromColumns({{0..List.Count(runningTotal)-1}, runningTotal}, {"$rightJoinKey", newColumnName}),
        leftTable = Table.AddIndexColumn(someTable, "$leftJoinKey"),
        joined = Table.Join(leftTable, {"$leftJoinKey"}, rightTable, {"$rightJoinKey"}, JoinKind.FullOuter),
        dropped = Table.RemoveColumns(joined, {"$leftJoinKey", "$rightJoinKey"})
    in dropped,


总体而言,回到您的问题,我认为可以通过以下代码实现:


Overall and going back to your question, I think it can be achieved with the code below:

let
    // This is just a function which returns a record. The record itself
    // contains two values:
    //    • a list of dates (which will be the "left join column")
    //    • a replacer function (which conditionally replaces dates)
    GetReplacerAndDates = (n as number, replaceWith as any) as record => 
        let
            startOfCurrentWeek = Date.StartOfWeek(DateTime.Date(DateTime.LocalNow())),
            nMondaysAgo = Date.AddWeeks(startOfCurrentWeek, -n),
            defaultAndLastNWeeks = {replaceWith} & List.Dates(nMondaysAgo, n + 1, #duration(7, 0, 0, 0)),
            conditionalReplacer = (someDate as date) =>
                let
                    startOfWeek = Date.StartOfWeek(someDate),
                    startOfWeekOrReplacement = if startOfWeek >= nMondaysAgo then startOfWeek else replaceWith
                in startOfWeekOrReplacement,
            toReturn = [replacer = conditionalReplacer, values = defaultAndLastNWeeks]
        in toReturn,
    rec = GetReplacerAndDates(5, "Prior period"), // Needs a better variable name.

    // You don't need this below, I only used it to give me a starting point
    // and so that I could verify my answer.
    triageTurnbacks = Table.FromColumns(
        {
            {"A1","A1","A2","A3","A10","A5","A1","A1","A2","A3","A10","A5","A1","A1","A2","A1","A1","A2","A3","A10","A5","A10"},
            {"Supplier A","Supplier A","Supplier B","Supplier C","Supplier D","Supplier E","Supplier A","Supplier A","Supplier B","Supplier C","Supplier D","Supplier E","Supplier A","Supplier A","Supplier B","Supplier A","Supplier A","Supplier B","Supplier C","Supplier D","Supplier E","Supplier D"},
            {#date(2019,4,8), #date(2019,4,8), #date(2019,4,16), #date(2019,5,30), #date(2019,4,24), #date(2019,6,18), #date(2019,7,17), #date(2019,6,24), #date(2019,6,11), #date(2019,7,3), #date(2019,7,2), #date(2019,7,17), #date(2019,7,17), #date(2019,8,5), #date(2019,7,12), #date(2019,8,5), #date(2019,7,29), #date(2019,6,12), #date(2019,2,21), #date(2019,4,23), #date(2019,6,14), #date(2019,2,8)}
        },
        type table [Vendor Code = text, Supplier Lookup = text, Section A Turnbacks = date]
    ),
    // I think the single step "startOfWeekColumn" below should probably replace the #"Duplicate_Date_Column"
    // and #"Start_of_Week" steps of your Triage_Turnbacks query.
    // You could also assign rec[replacer] to a variable,
    // just to make the code/call site easier to read.
    startOfWeekColumn = Table.AddColumn(triageTurnbacks, "startOfWeek", each rec[replacer]([Section A Turnbacks]), type any),
    datesToJoin = Table.FromColumns({rec[values]}, {"startOfWeek"}),
    joined = Table.NestedJoin(datesToJoin, "startOfWeek", startOfWeekColumn, "startOfWeek", "Total Turnbacks", JoinKind.LeftOuter),
    reduced = Table.TransformColumns(joined, {{"Total Turnbacks", Table.RowCount, type number}}),

    // Helper functions. I don't think cumulative summation can be achieved natively (as far as I know).
    ListCumulativeSum = (listOfNumbers as list) as list => List.Accumulate(listOfNumbers, {}, (listState, currentItem) => listState & {List.Last(listState, 0) + currentItem}),
    TableCumulativeSum = (someTable as table, columnToSum as text, newColumnName as text) as table =>
        let
            runningTotal = ListCumulativeSum(Table.Column(someTable, columnToSum)),
            rightTable = Table.FromColumns({{0..List.Count(runningTotal)-1}, runningTotal}, {"$rightJoinKey", newColumnName}),
            leftTable = Table.AddIndexColumn(someTable, "$leftJoinKey"),
            joined = Table.Join(leftTable, {"$leftJoinKey"}, rightTable, {"$rightJoinKey"}, JoinKind.FullOuter),
            dropped = Table.RemoveColumns(joined, {"$leftJoinKey", "$rightJoinKey"}),
            changedTypes = Table.TransformColumnTypes(dropped, {{newColumnName, type number}})
        in changedTypes,
    runningTotal = TableCumulativeSum(reduced, "Total Turnbacks", "Running Total")
in
    runningTotal

这给了我这个

与您的预期输出匹配,但从 2019年7月22日开始的一周的 Total Turnbacks 列除外(您的 1 ,我有 0 ).

which I think matches your expected output with the exception of the Total Turnbacks column of the week starting 22/07/2019 (where you have 1 and I have 0).

您显然可以从最终表中过滤/删除"<期间> "行.有关更多详细信息,请参见代码中的注释.如果他们没有帮助,请告诉我.

You could obviously filter/remove the "Prior period" row from the final table. For further details, please see comments in the code. If they don't help, let me know.

这篇关于将运行总计添加到PowerQuery结果中的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆