如何找到第n个最相似的值 [英] How to Find the n-th most similar value
问题描述
目标是找到两个最相似的操作日,并返回其生产测量的平均值。这将有助于未来的运营规划。
所以我有一个过去90天的设施和运营数据清单。所以每个设施每天将有一排。该信息在自己的工作表上,并从基于excel的查询中引入。所以基本上这个选项卡PDA有设施,日期和总输出列。
在第二个标签上,分析器,我有一个类似的设置。但它的未来日期。我想做的是采用CPWTotal(未来日期的总产量),并从PDA选项卡中找到两个最相似的日期,并返回这两个总输出。我尝试使用一个索引(小)和最小(abs)函数的混搭来找到最小的差异,但它并没有像我所希望的那样工作。
以下是我最接近的,但与建筑号码的比较已经是主要的挂断。 PDA包含历史输出数据,PDA!W包含历史数据建立号码,A3包含未来计划编号,C3包含未来的输出预测。
{= INDEX(PDA!$ AB $ 2:$ AB $ 10000,MATCH(($ A3 = PDA!$ W $ 2: $($ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ code>
我知道()* Min()部分可能会跳过我,但是我的逻辑是0 * Min ()如果建筑物号码不匹配,那么为该行匹配返回0,我还没有找到更好的方法。
这不是最后一步,但是我需要设置索引(匹配)来拉出我需要平均的其余的操作数据。
我也对VBA解决方案开放。
编辑:样本数据:
选项卡1:PDA返回(我希望返回输出的工作表)1和2是将返回最接近和最接近的第二个匹配的列。
Bulding#2016 CPW Forecast 1 2
4505 29-Feb-16 1789
4505 01-Mar-16 2270 < br>
4505 02-Mar-16 2108
4505 03-Mar-16 2028
4505 04-Mar-16 1912
4600 29 -Feb-16 1653
4600 01-Mar-16 2264
4600 02-Mar-16 2232
4600 03-Mar-16 2013
4600 04-Mar-16 1876
4650 29-Feb-16 1799
4650 01-Mar-16 2254
4650 02-Mar -16 2181
4650 03-Mar-16 1902
4650 04-Mar-16 1910
Tab 2(PDA )包含我要搜索的信息:
建筑#2016 CPW总生产
4505 29-Feb-16 1799
4505 01-Mar-16 2250
4505 02-Mar-16 2118
4505 03-Mar-16 2025
4505 04-Mar-16 1914
4600 29-Feb-16 1654
4600 01-Mar-16 2262
4600 02- Mar-16 2231
4600 03-Mar-16 2016
4600 04-Mar-16 1826
4650 29-Feb-16 1789
4650 01-Mar-16 2244
4650 02-Mar-16 2161
4650 03-Mar-16 1912
4650 04-Mar- 16 1920
我不知道如何上传示例excel文件,但您应该能够复制/粘贴这些表。
你将不得不使用伪MINIFS两次;一次返回最小天数差,再次返回第一个匹配行号。
为了动态范围分配,以下将按照AB列中最后一个日期的行号定义每个范围的范围。范围
PDA!W $ 2:W $ 14
成为PDA!W $ 2:INDEX(PDA!W:W,MATCH(1E + 99,PDA!AB:AB))
将以这种方式定义所有范围。完整的列引用应不与 示例数据从[excel-2010]解决方案。
The goal is to find the two most similar operational days, and return their averages for the production measurements. This will help future operational planning.
So I have a list of facilities and their operational data for the past 90 days. So each facility will have one row per day. This info is on its own worksheet, and is pulled in from an excel-based query. So essentially this tab "PDA" has the columns "Facility", "Date", and "Total Output".
On the 2nd tab, "Analyzer", I have a similar set up. But its for future dates. What I want to do is take the "CPWTotal" (total output for the future dates) and find the two most similar days from the "PDA" tab and return those two "Total Outputs". I tried using a mash up of a index(small) and min(abs) function to find the smallest differences, but it didn't work nearly as well as I hoped.
Below is the closest I've gotten, but the comparison to building number has been a major hang up.
PDA!AB contains the historical output data, PDA!W contains the historical data building number, A3 contains the future plan building number, and C3 contains the future output forecast.
{=INDEX(PDA!$AB$2:$AB$10000,MATCH(($A3 = PDA!$W$2:$W$10000)*MIN(ABS(PDA!$AB$2:$AB$10000-$C3)),ABS(PDA!$AB$2:$AB$10000-$C3),0))}
I know the ()*Min() portion might be tripping me up, but my logic was that is would be 0*Min() if the building number was not a match, thus returning a 0 for that line of the match, and I have not been able to figure out a better approach.
This isn't the last step, but its what I need to set up the index(match) to pull the rest of the operational data that I need to average out.
I am also open to VBA solutions.
EDIT: Sample Data:
Tab 1: PDA Return (worksheet where I want the output to be returned) 1 and 2 are the columns where the closest and 2nd closest match will be returned.
Bulding # 2016 CPW Forecast 1 2
4505 29-Feb-16 1789
4505 01-Mar-16 2270
4505 02-Mar-16 2108
4505 03-Mar-16 2028
4505 04-Mar-16 1912
4600 29-Feb-16 1653
4600 01-Mar-16 2264
4600 02-Mar-16 2232
4600 03-Mar-16 2013
4600 04-Mar-16 1876
4650 29-Feb-16 1799
4650 01-Mar-16 2254
4650 02-Mar-16 2181
4650 03-Mar-16 1902
4650 04-Mar-16 1910Tab 2 (PDA) contains the info I want to search for:
Building # 2016 CPW Total Production
4505 29-Feb-16 1799
4505 01-Mar-16 2250
4505 02-Mar-16 2118
4505 03-Mar-16 2025
4505 04-Mar-16 1914
4600 29-Feb-16 1654
4600 01-Mar-16 2262
4600 02-Mar-16 2231
4600 03-Mar-16 2016
4600 04-Mar-16 1826
4650 29-Feb-16 1789
4650 01-Mar-16 2244
4650 02-Mar-16 2161
4650 03-Mar-16 1912
4650 04-Mar-16 1920I don't know how to upload a sample excel file but you should be able to just copy/paste those tables.
解决方案You are going to have to use a pseudo-MINIFS twice; once to return the minimum days differential and again to return the first matching row number.
For the purpose of dynamic range allocation, the following will define the extents of each range by the row number of the last date in column AB. The range
PDA!W$2:W$14
becomes,PDA!W$2:INDEX(PDA!W:W, MATCH(1E+99, PDA!AB:AB))
All ranges will be defined this way. Full column references should not be used with the AGGREGATE function. Additions and deletions of data will be automatically adjusted for
Standard Formula for excel-2010
The standard formula in D3 is,
=INDEX(PDA!$AC:$AC, AGGREGATE(15, 6, ROW(PDA!AC$2:INDEX(PDA!AC:AC, MATCH(1E+99, PDA!AB:AB )))/ ((PDA!W$2:INDEX(PDA!W:W, MATCH(1E+99, PDA!AB:AB ))=A$3)* (ABS(PDA!AB$2:INDEX(PDA!AB:AB, MATCH(1E+99, PDA!AB:AB))-C$3)= AGGREGATE(15,6,ABS(PDA!AB$2:INDEX(PDA!AB:AB, MATCH(1E+99, PDA!AB:AB ))-C$3)/ (PDA!W$2:INDEX(PDA!W:W, MATCH(1E+99, PDA!AB:AB ))=A$3),1))),ROW(1:1)))
That returns the first match. Fill down for the second, third, etc matches if they exist.
Sample data from [excel-2010] solution.这篇关于如何找到第n个最相似的值的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!