如何找到第n个最相似的值 [英] How to Find the n-th most similar value

查看:155
本文介绍了如何找到第n个最相似的值的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

目标是找到两个最相似的操作日,并返回其生产测量的平均值。这将有助于未来的运营规划。



所以我有一个过去90天的设施和运营数据清单。所以每个设施每天将有一排。该信息在自己的工作表上,并从基于excel的查询中引入。所以基本上这个选项卡PDA有设施,日期和总输出列。



在第二个标签上,分析器,我有一个类似的设置。但它的未来日期。我想做的是采用CPWTotal(未来日期的总产量),并从PDA选项卡中找到两个最相似的日期,并返回这两个总输出。我尝试使用一个索引(小)和最小(abs)函数的混搭来找到最小的差异,但它并没有像我所希望的那样工作。



以下是我最接近的,但与建筑号码的比较已经是主要的挂断。 PDA包含历史输出数据,PDA!W包含历史数据建立号码,A3包含未来计划编号,C3包含未来的输出预测。

  {= INDEX(PDA!$ AB $ 2:$ AB $ 10000,MATCH(($ A3 = PDA!$ W $ 2: $($ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ code> 

我知道()* Min()部分可能会跳过我,但是我的逻辑是0 * Min ()如果建筑物号码不匹配,那么为该行匹配返回0,我还没有找到更好的方法。



这不是最后一步,但是我需要设置索引(匹配)来拉出我需要平均的其余的操作数据。



我也对VBA解决方案开放。



编辑:样本数据:



选项卡1:PDA返回(我希望返回输出的工作表)1和2是将返回最接近和最接近的第二个匹配的列。



Bulding#2016 CPW Forecast 1 2



4505 29-Feb-16 1789

4505 01-Mar-16 2270 < br>
4505 02-Mar-16 2108

4505 03-Mar-16 2028

4505 04-Mar-16 1912

4600 29 -Feb-16 1653

4600 01-Mar-16 2264

4600 02-Mar-16 2232

4600 03-Mar-16 2013

4600 04-Mar-16 1876

4650 29-Feb-16 1799

4650 01-Mar-16 2254

4650 02-Mar -16 2181

4650 03-Mar-16 1902

4650 04-Mar-16 1910



Tab 2(PDA )包含我要搜索的信息:



建筑#2016 CPW总生产



4505 29-Feb-16 1799

4505 01-Mar-16 2250

4505 02-Mar-16 2118

4505 03-Mar-16 2025

4505 04-Mar-16 1914

4600 29-Feb-16 1654

4600 01-Mar-16 2262

4600 02- Mar-16 2231

4600 03-Mar-16 2016

4600 04-Mar-16 1826

4650 29-Feb-16 1789

4650 01-Mar-16 2244

4650 02-Mar-16 2161

4650 03-Mar-16 1912

4650 04-Mar- 16 1920



我不知道如何上传示例excel文件,但您应该能够复制/粘贴这些表。

解决方案

你将不得不使用伪MINIFS两次;一次返回最小天数差,再次返回第一个匹配行号。


为了动态范围分配,以下将按照AB列中最后一个日期的行号定义每个范围的范围。范围 PDA!W $ 2:W $ 14 成为

  PDA!W $ 2:INDEX(PDA!W:W,MATCH(1E + 99,PDA!AB:AB))

将以这种方式定义所有范围。完整的列引用应 示例数据从[excel-2010]解决方案。


The goal is to find the two most similar operational days, and return their averages for the production measurements. This will help future operational planning.

So I have a list of facilities and their operational data for the past 90 days. So each facility will have one row per day. This info is on its own worksheet, and is pulled in from an excel-based query. So essentially this tab "PDA" has the columns "Facility", "Date", and "Total Output".

On the 2nd tab, "Analyzer", I have a similar set up. But its for future dates. What I want to do is take the "CPWTotal" (total output for the future dates) and find the two most similar days from the "PDA" tab and return those two "Total Outputs". I tried using a mash up of a index(small) and min(abs) function to find the smallest differences, but it didn't work nearly as well as I hoped.

Below is the closest I've gotten, but the comparison to building number has been a major hang up.

PDA!AB contains the historical output data, PDA!W contains the historical data building number, A3 contains the future plan building number, and C3 contains the future output forecast.

{=INDEX(PDA!$AB$2:$AB$10000,MATCH(($A3 = PDA!$W$2:$W$10000)*MIN(ABS(PDA!$AB$2:$AB$10000-$C3)),ABS(PDA!$AB$2:$AB$10000-$C3),0))}

I know the ()*Min() portion might be tripping me up, but my logic was that is would be 0*Min() if the building number was not a match, thus returning a 0 for that line of the match, and I have not been able to figure out a better approach.

This isn't the last step, but its what I need to set up the index(match) to pull the rest of the operational data that I need to average out.

I am also open to VBA solutions.

EDIT: Sample Data:

Tab 1: PDA Return (worksheet where I want the output to be returned) 1 and 2 are the columns where the closest and 2nd closest match will be returned.

Bulding # 2016 CPW Forecast 1 2

4505 29-Feb-16 1789
4505 01-Mar-16 2270
4505 02-Mar-16 2108
4505 03-Mar-16 2028
4505 04-Mar-16 1912
4600 29-Feb-16 1653
4600 01-Mar-16 2264
4600 02-Mar-16 2232
4600 03-Mar-16 2013
4600 04-Mar-16 1876
4650 29-Feb-16 1799
4650 01-Mar-16 2254
4650 02-Mar-16 2181
4650 03-Mar-16 1902
4650 04-Mar-16 1910

Tab 2 (PDA) contains the info I want to search for:

Building # 2016 CPW Total Production

4505 29-Feb-16 1799
4505 01-Mar-16 2250
4505 02-Mar-16 2118
4505 03-Mar-16 2025
4505 04-Mar-16 1914
4600 29-Feb-16 1654
4600 01-Mar-16 2262
4600 02-Mar-16 2231
4600 03-Mar-16 2016
4600 04-Mar-16 1826
4650 29-Feb-16 1789
4650 01-Mar-16 2244
4650 02-Mar-16 2161
4650 03-Mar-16 1912
4650 04-Mar-16 1920

I don't know how to upload a sample excel file but you should be able to just copy/paste those tables.

解决方案

You are going to have to use a pseudo-MINIFS twice; once to return the minimum days differential and again to return the first matching row number.

For the purpose of dynamic range allocation, the following will define the extents of each range by the row number of the last date in column AB. The range PDA!W$2:W$14 becomes,

PDA!W$2:INDEX(PDA!W:W, MATCH(1E+99, PDA!AB:AB))

All ranges will be defined this way. Full column references should not be used with the AGGREGATE function. Additions and deletions of data will be automatically adjusted for

Standard Formula for

The standard formula in D3 is,

=INDEX(PDA!$AC:$AC,
       AGGREGATE(15, 6, ROW(PDA!AC$2:INDEX(PDA!AC:AC, MATCH(1E+99, PDA!AB:AB )))/
     ((PDA!W$2:INDEX(PDA!W:W, MATCH(1E+99, PDA!AB:AB ))=A$3)*
      (ABS(PDA!AB$2:INDEX(PDA!AB:AB, MATCH(1E+99, PDA!AB:AB))-C$3)=
       AGGREGATE(15,6,ABS(PDA!AB$2:INDEX(PDA!AB:AB, MATCH(1E+99, PDA!AB:AB ))-C$3)/
      (PDA!W$2:INDEX(PDA!W:W, MATCH(1E+99, PDA!AB:AB ))=A$3),1))),ROW(1:1)))

That returns the first match. Fill down for the second, third, etc matches if they exist.

    
            Sample data from [excel-2010] solution.

这篇关于如何找到第n个最相似的值的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆