从排序表中选择不同的值 [英] Selecting Distinct Values from sorted Table
本文介绍了从排序表中选择不同的值的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!
问题描述
csv文件的简化结构如下面的html-table片段所示:
数据-lang="js"数据-隐藏="假"数据-控制台="真"数据-巴贝尔="假"><table style="text-align:center">
<tr>
<th style="width:100px">Response-ID</th>
<th style="width:80px">Task-ID</th>
<th style="width:200px">Executed Action</th>
<th style="width:200px">Status</th>
<th style="width:80px">Date</th>
</tr>
<tr>
<td>1</td>
<td>A</td>
<td>Some Task to be done</td>
<td>Open</td>
<td>20.01.2020</td>
</tr>
<tr>
<td>2</td>
<td>A</td>
<td>Another Action done today</td>
<td>Open</td>
<td>25.01.2020</td>
</tr>
<tr>
<td>3</td>
<td>B</td>
<td>New Task</td>
<td>Open</td>
<td>25.01.2020</td>
</tr>
<tr>
<td>4</td>
<td>A</td>
<td>More actions in this matter</td>
<td><b>Done</b></td>
<td>26.01.2020</td>
</tr>
<tr>
<td>5</td>
<td>C</td>
<td>Another Task</td>
<td>Open</td>
<td>27.02.2020</td>
</tr>
<tr>
<td>6</td>
<td>B</td>
<td>Whatever</td>
<td><b>Done</b></td>
<td>29.02.2020</td>
</tr>
<tr>
<td>7</td>
<td>D</td>
<td>Now what?</td>
<td>Open</td>
<td>02.03.2020</td>
</tr>
<tr>
<td>8</td>
<td>C</td>
<td>Almost done...</td>
<td>Open</td>
<td>02.03.2020</td>
</tr>
</table>
我要实现的是一个Power Query M代码,它只显示每个未完成的任务ID的最新条目。
以下是我到目前为止的情况:
let
Source = ResponseData //a copy of my actual data query "ResponseData" to work with in this case
#"Transformed Table" = Table.Distinct(Table.Buffer(Table.Sort(Source, {{"Date", Order.Descending}})), "Task-ID"),
#"Remove Finished Tasks" = Table.SelectRows(#"Transformed Table", each ([Status] <> "Done" and [Status] <> "Canceled"))
in
#"Remove Finished Tasks"
此代码实际上返回所有打开的任务的列表。但出乎我的意料,它只包含每个任务的最旧的响应ID,而不是最新的。更改#"Transformed Table"
行中的排序顺序会以相反的顺序返回完全相同的结果。因此,基本上对表进行排序对结果数据没有任何影响...
根据Ron Rosenfeld的回答编辑了上面的代码->缓冲已排序的表可以做到这一点。
推荐答案
由于我不完全理解的原因(似乎与查询折叠有关),Power Query Remove Duplicates
在特定的数据源上运行时,显然不遵守排序顺序。但如果使用Table.Buffer
,则可确保折叠不会发生,而Remove Duplicates
将在排序/筛选后进行。
不确定在大型数据集上哪个更快--Table.Buffer
与Table.Group
CSV文件
Response-ID,Task-ID,Executed Action,Status,Date
1,A,Some Task to be done,Open,20.01.2020
2,A,Another Action done today,Open,25.01.2020
3,B,New Task,Open,25.01.2020
4,A,More actions in this matter,Done,26.01.2020
5,C,Another Task,Open,27.02.2020
6,B,Whatever,Done,29.02.2020
7,D,Now what?,Open,02.03.2020
8,C,Almost done...,Open,02.03.2020
mCode的前几行略有不同,但这应该会为您提供要点:
let
Source = Csv.Document(File.Contents("C:Users
onDesktop est.csv"),[Delimiter=",", Columns=5, Encoding=1252, QuoteStyle=QuoteStyle.None]),
#"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Response-ID", Int64.Type}, {"Task-ID", type text}, {"Executed Action", type text}, {"Status", type text}, {"Date", type text}}),
#"Changed Type with Locale" = Table.TransformColumnTypes(#"Changed Type", {{"Date", type date}}, "en-150"),
#"Filtered Rows" = Table.SelectRows(#"Changed Type with Locale", each ([Status] = "Open")),
#"Sorted Rows" = Table.Sort(#"Filtered Rows",{{"Date", Order.Descending}}),
#"Buffered Table" = Table.Buffer(#"Sorted Rows"),
#"Removed Duplicates" = Table.Distinct(#"Buffered Table", {"Task-ID"}),
#"Sorted Rows1" = Table.Sort(#"Removed Duplicates",{{"Task-ID", Order.Ascending}})
in
#"Sorted Rows1"
结果
这篇关于从排序表中选择不同的值的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!
查看全文