索引多个列和匹配不同的值,返回列之间的唯一值列表 [英] Index multiple columns and Match distinct values, returning list of unique values across columns

查看:162
本文介绍了索引多个列和匹配不同的值,返回列之间的唯一值列表的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我已经搜索了很多广泛的解决方案,解决了我的问题。我已经提出了部分工作的解决方案,我将在底部包含那些可能知道如何修改/扩展它们以解决问题的人。



这里是我正在努力完成(以下描述参考此屏幕截图



1)我我正在寻找匹配列G与列A($ G2与$ A $ 2:$ A $ 10)的值。 2)如果值匹配,则从相邻列B,C和D中提取唯一值(不包括空格),并将其返回到列(H至M) 。因此,如所需结果1表格(G1:M10)所示,如果在A2:A10中找到G2(温哥华),则提取索引值列B,C和D,其中温哥华是一个匹配(即蓝色,绿色,绿色,红色,黄色,绿色,绿色,粉红色,粉红色,粉红色,),只返回唯一的值)横列H到L(蓝色,绿色,红色,黄色,青色,粉红色)。



我一直在寻找一个可以弹出H2的公式解决方案然后水平拖动到M2,以显示结果。但是,我没有结婚这个解决方案。我还考虑过一个替代方法是以逗号分隔的格式而不是跨列返回唯一的值(请参阅所需的结果2)。如果这更容易完成,我都是为了。



注意:我将在10,000行以上运行此公式。



部分解决方案我已经被吞噬了:



1){Array formula 1},结果可以看出



10K行数据解决方案



参考范围的大小增加。做为数组公式的唯一方法是在温哥华创建唯一列表时仅引用温哥华 的行,并且仅引用西雅图当您为西雅图创建唯一的列表时。换句话说,当温哥华的数据是行2:6时,不要引用温哥华的行2:10000。


注意:你将以A列为主键对数据进行排序。此操作需要它。未分类的数据(甚至分组)将不允许第二个MATCH函数找到终止行。


列B中的数据行包含温哥华A列可以参考。

  INDEX(B:B,MATCH(vancouver,A:A, 0)):INDEX(B:B,MATCH(vancouver,A:A))

现在你需要做的就是用以上代替所有出现的 $ B $ 2:$ B $ 16 。调整公式,并将 $ C $ 2:$ C $ 16 $ D $ 2:$ D $ 16 替换为好。奖金是,您可以摆脱支票以查看 IF($ A $ 2:$ A $ 16 = $ G2,... 因为根据定义,您只是引用行中列有适当的城市。请确保参考G2,不要在温哥华中进行硬编码。

 '在H2中作为数组公式与CSE 
= IFERROR(INDEX(INDEX($ B:$ B,MATCH($ G2,$ A:$ A,0))):INDEX($ B:$ B ,MATCH($ G2,$ A:$ A)),MATCH(0,IF(SIGN(LEN(INDEX($ B:$ B,MATCH($ G2,$ A:$ A,0))):INDEX B:$ B,MATCH($ G2,$ A:$ A))),COUNTIF($ G2:G2,INDEX($ B:$ B,MATCH($ G2,$ A:$ A,0)) INDEX($ B:$ B,MATCH($ G2,$ A:$ A))),1),0)),
IFERROR(INDEX($ C:$ C,MATCH (A,$ A,0)):INDEX($ C:$ C,MATCH($ G2,$ A:$ A)),MATCH(0,IF(SIGN(LEN(INDEX($ C:$ C,MATCH ($ G2,$ A:$ A,0)):INDEX($ C:$ C,MATCH($ G2,$ A:$ A)))),COUNTIF($ G2:G2,INDEX($ C:$ C,MATCH($ G2,$ A:$ A,0)):INDEX($ C:$ C,MATCH($ G2,$ A:$ A))),1),0)),
IFERROR(INDEX($ D:$ D,MATCH($ G2,$ A:$ A,0)):INDEX($ D:$ D,MATCH($ G2,$ A:$ A)),MATCH 0,IF(SIGN(LEN(INDEX($ D:$ D,MATCH($ G2,$ A:$ A, 0)):INDEX($ D:$ D,MATCH($ G2,$ A:$ A)))),COUNTIF($ G2:G2,INDEX($ D:$ D,MATCH($ G2,$ A: $ A,0)):INDEX($ D:$ D,MATCH($ G2,$ A:$ A))),1),0)),
TEXT(,))))

您应该能够使用该数组公式的计算时间,即使通过10K行。





我很确定这是本地工作表功能可以去



附录二



有一个进一步优化。工作表 IF 仅处理公式的部分为true。如果您查看城市的名称,并且只有在城市名称更改时才处理公式的数组部分,那么从上方直接复制到相同的位置时,您应该能够进一步限制计算。

 '在H2中作为数组公式,CSE 
= IF($ G2 = $ G1,H1,
IFERROR(INDEX ($ B:$ B,MATCH($ G2,$ A:$ A,0)):INDEX($ B:$ B,MATCH($ G2,$ A:$ A)),MATCH(0,IF (LEN($ B:$ B,MATCH($ G2,$ A:$ A,0)):INDEX($ B:$ B,MATCH($ G2,$ A:$ A))),COUNTIF ($ G2:G2,INDEX($ B:$ B,MATCH($ G2,$ A:$ A,0)):INDEX($ B:$ B,MATCH($ G2,$ A:$ A)) ,$ 1,$ A,$ A,0)),
IFERROR(INDEX(INDEX($ C:$ C,MATCH($ G2,$ A:$ A,0)):INDEX($ C:$ C,MATCH G2,$ A:$ A)),MATCH(0,IF(SIGN(LEN(INDEX($ C:$ C,MATCH($ G2,$ A:$ A,0)):INDEX($ C:$ C ($ G2,$ A:$ A))),COUNTIF($ G2:G2,INDEX($ C:$ C,MATCH($ G2,$ A:$ A,0)):INDEX($ C :$ C,MATCH($ G2,$ A:$ A)),1),0)),
IFERROR(INDEX(INDEX($ D:$ D,MATCH($ G2,$ A:$ A,0)):INDEX($ D:$ D,MATCH($ G2,$ A:$ A)),MATCH(0,IF(SIGN(LEN(INDEX($ D:$ D, $ A :$ A,0)):INDEX($ D:$ D,MATCH($ G2,$ A:$ A)))),COUNTIF($ G2:G2,INDEX($ D:$ D,MATCH ,$ A:$ A,0)):INDEX($ D:$ D,MATCH($ G2,$ A:$ A))),1),0)),
TEXT(,))) ))

附录3



单一城市条目存在问题,此修正案负责处理。


您可能会收到通知参考警告。在技​​术上,警告是真实的,但是由于嵌套的IF结构,您永远不会获得循环引用。换句话说,循环引用仅在IF为假时有效。 Excel报告,因为它没有检查条件;只有在技术上是真实但不能真正发生的循环引用的可能性。




 在H2中作为数组公式与CSE 
= IF($ G2 = $ G1,H1,IF(COUNTIF($ A:$ A,$ G2)= 1,
IFERROR(INDEX(INDEX B:$ D,MATCH($ G2,$ A:$ A,0),0),MATCH(0,IF(INDEX($ B:$ D,MATCH($ G2,$ A:$ A,0) 0)",COUNTIF($ G2:G2,INDEX($ B:$ D,MATCH($ G2,$ A:$ A,0),0)),1),0) ($ B,$ B,$ B,$ B,$ B,B,MATCH($ G2,$ A:$ A,0)):INDEX A:$ A)),MATCH(0,IF(SIGN(LEN(INDEX($ B:$ B,MATCH($ G2,$ A:$ A,0)):INDEX($ B:$ B,MATCH $ G2,$ A:$ A)))),COUNTIF($ G2:G2,INDEX($ B:$ B,MATCH($ G2,$ A:$ A,0)):INDEX($ B:$ B ,MATCH($ G2,$ A:$ A)),1),0)),
IFERROR(INDEX(INDEX($ C:$ C,MATCH($ G2,$ A:$ A,0 ):INDEX($ C:$ C,MATCH($ G2,$ A:$ A)),MATCH(0,IF(SIGN(LEN(INDEX($ C:$ C,MATCH($ G2,$ A: ($ C:$ C,MATCH($ G2,$ A:$ A)))),COUNTIF($ G2:G2,INDEX($ C:$ C,MATCH($ G2, $ A:$ A,0)):INDEX($ C:$ C,MATCH($ G2,$ A:$ A))),1),0)),
IFERROR INDEX($ D:$ D,MATCH($ G2,$ A:$ A,0)):INDEX($ D:$ D,MATCH($ G2,$ A:$ A)),MATCH(0, IF(SIGN(LEN(INDEX($ D:$ D,MATCH($ G2,$ A:$ A,0)):INDEX($ D:$ D,MATCH($ G2,$ A:$ A))) ),COUNTIF($ G2:G2,INDEX($ D:$ D,MATCH($ G2,$ A:$ A,0)):INDEX($ D:$ D,MATCH($ G2,$ A:$ A ))),1),0)),
TEXT(,)))))))






¹如果您的Excel版本不支持 TEXTJOIN函数,搜索本网站 [excel] [textjoin] ,以锯齿形长度分隔的字符串连接来查找替代方法。


I've searched far and wide for a solution to my problem... over several long weeks now. I've come up a partially working solutions, which I'll include at the bottom for those who might know how to modify/extend them to resolve the problem.

Here's what I'm trying to accomplish (the following descriptions are in reference to this screen capture http://imgur.com/oTkbjrw):

1) I am looking to match values from Column G with Column A ($G2 with $A$2:$A$10).

2) If the values match, then extract unique values (excluding blanks) from adjacent Columns B, C, and D, and return them across columns (H through M).

So, as shown in the 'desired results 1' table (G1:M10), if G2 (Vancouver) is found in A2:A10, then extract indexed values from columns B,C, and D where Vancouver is a match (i.e., Blue, ,Green,Green,Red,Yellow,Teal,Green, , , ,Pink,Pink,Pink, ,) and return only the unique values (excluding blanks) across columns H through L (blue,Green,Red,Yellow,Teal,Pink).

I've been looking for a formula solution that can be popped into H2 and then dragged horizontally to M2, to reveal the results. I'm not married to this solution, however. An alternative that I've also considered is to return the unique values in comma delimited format rather than across columns(see 'desired Results 2'). If this is easier to accomplish, I'm all for it.

Note: I will be running this formula on 10,000+ rows. A lean/efficient solution is desirable if possible.

PARTIAL SOLUTIONS I'VE PUT TOGETHER:

1) {Array formula 1}, results can be seen here:

=IFERROR(INDEX($B$2:$B$10, SMALL(IF(COUNTIF($G2,$A$2:$A$10), MATCH(ROW($B$2:$B$10), ROW($B$2:$B$10)), ""), COLUMN(A1))),"")

This formula is only able to index column B, but it successfully matches and returns values across columns. Unfortunately, it doesn't extract unique values and returns blank cells.

2) Re: Justin's comment about a likely VBA solution, figure I should pop up one of the VBA solutions I came across.

Function UNIQUE_PH(Lookupvalue As String, LookupRange As Range, ColumnNumber As Integer)
Dim i As Long
Dim Result As String
For i = 1 To LookupRange.Columns(1).Cells.Count
  If LookupRange.Cells(i, 1) = Lookupvalue Then
    For J = 1 To i - 1
    If LookupRange.Cells(J, 1) = Lookupvalue Then
      If LookupRange.Cells(J, ColumnNumber) = LookupRange.Cells(i, ColumnNumber) Then
        GoTo Skip
      End If
    End If
    Next J
    Result = Result & " " & LookupRange.Cells(i, ColumnNumber) & ","
Skip:
  End If
Next i
UNIQUE_PH = Left(Result, Len(Result) - 1)
End Function

'It takes 3 arguments as inputs:

'1. Lookupvalue – A string that we need to look-up in a range of cells.
'2. LookupRange  – An array of cells from where we need to fetch the data 
'3. ColumnNumber – It is the column number of the table/array from which matching value is to be returned (e.g. 2 for second column).

This basically does what the above formula accomplishes, with the exception that it successfully identifies unique values. It presents the results in one cell as comma delimited (similar to 'desired Results 2'). This isn't able to search through multiple columns and doesn't remove blank cells. I also tried to run it on 10,000 rows of data and it was very slow. I know very little about VBA, so I'm not sure what could be contributing to the lag.

解决方案

Sample Data solution

'in G2
=A2
'in H2 as an array formula with CSE
=IFERROR(INDEX($B$2:$B$16, MATCH(0, IF($A$2:$A$16=$G2, IF(SIGN(LEN($B$2:$B$16)), COUNTIF($G2:G2, $B$2:$B$16), 1), 1), 0), 1),
 IFERROR(INDEX($C$2:$C$16, MATCH(0, IF($A$2:$A$16=$G2, IF(SIGN(LEN($C$2:$C$16)), COUNTIF($G2:G2, $C$2:$C$16), 1), 1), 0), 1),
 IFERROR(INDEX($D$2:$D$16, MATCH(0, IF($A$2:$A$16=$G2, IF(SIGN(LEN($D$2:$D$16)), COUNTIF($G2:G2, $D$2:$D$16), 1), 1), 0), 1),
 TEXT(,))))
'in G19
=G2
'in H19
=TEXTJOIN(",", TRUE, H2:N2)

Fill H2 right then G2:N2 down as appropriate. Fill G19:H19 down to collate the values above. See footnote ¹ if you get a #NAME! error on the TEXTJOIN function.

10K Rows of Data solution

Array formulas chew up calculations exponentially the referenced ranges increase in size. The only way to do this as an array formula is to only reference the rows with Vancouver when you are creating a unique list for Vancouver and only reference the rows with Seattle when you are creating a unique list for Seattle. In other words, don't reference rows 2:10000 for Vancouver when Vancouver's data is in rows 2:6.

Note: you will have to sort your data with column A as the primary key. This operation requires it. Unsorted data (even grouped) will not allow the second MATCH function to locate the terminating row.

The rows of data in column B containing Vancouver in column A can be referenced with this.

INDEX(B:B, MATCH("vancouver", A:A, 0)):INDEX(B:B, MATCH("vancouver", A:A))

Now all you have to do is replace all occurrences of $B$2:$B$16 with the above. Adjust the formula and make the replacements for $C$2:$C$16 and $D$2:$D$16 as well. The bonus is that you can get rid of the check to see IF($A$2:$A$16=$G2, ... since by definition, you are only referencing rows with the appropriate city in column A. Make sure you reference G2 and do not hardcode in Vancouver.

'in H2 as an array formula with CSE
=IFERROR(INDEX(INDEX($B:$B, MATCH($G2, $A:$A, 0)):INDEX($B:$B, MATCH($G2, $A:$A)), MATCH(0, IF(SIGN(LEN(INDEX($B:$B, MATCH($G2, $A:$A, 0)):INDEX($B:$B, MATCH($G2, $A:$A)))), COUNTIF($G2:G2, INDEX($B:$B, MATCH($G2, $A:$A, 0)):INDEX($B:$B, MATCH($G2, $A:$A))), 1), 0)),
 IFERROR(INDEX(INDEX($C:$C, MATCH($G2, $A:$A, 0)):INDEX($C:$C, MATCH($G2, $A:$A)), MATCH(0, IF(SIGN(LEN(INDEX($C:$C, MATCH($G2, $A:$A, 0)):INDEX($C:$C, MATCH($G2, $A:$A)))), COUNTIF($G2:G2, INDEX($C:$C, MATCH($G2, $A:$A, 0)):INDEX($C:$C, MATCH($G2, $A:$A))), 1), 0)),
 IFERROR(INDEX(INDEX($D:$D, MATCH($G2, $A:$A, 0)):INDEX($D:$D, MATCH($G2, $A:$A)), MATCH(0, IF(SIGN(LEN(INDEX($D:$D, MATCH($G2, $A:$A, 0)):INDEX($D:$D, MATCH($G2, $A:$A)))), COUNTIF($G2:G2, INDEX($D:$D, MATCH($G2, $A:$A, 0)):INDEX($D:$D, MATCH($G2, $A:$A))), 1), 0)),
 TEXT(,))))

You should be able to live with the calculation time on that array formula even through 10K rows.

I'm pretty sure that this is as far as native worksheet functions can go. Further improvements would be using variant memory arrays.

Addendum²

There is one further optimization. A worksheet IF only processes the part of the formula that is true. If you look at the names of the cities and only process the array portion of the formula when the city names change, copying directly from above when they are the same you should be able to limit the calculations further.

'in H2 as an array formula with CSE
=IF($G2=$G1, H1, 
    IFERROR(INDEX(INDEX($B:$B, MATCH($G2, $A:$A, 0)):INDEX($B:$B, MATCH($G2, $A:$A)), MATCH(0, IF(SIGN(LEN(INDEX($B:$B, MATCH($G2, $A:$A, 0)):INDEX($B:$B, MATCH($G2, $A:$A)))), COUNTIF($G2:G2, INDEX($B:$B, MATCH($G2, $A:$A, 0)):INDEX($B:$B, MATCH($G2, $A:$A))), 1), 0)),
    IFERROR(INDEX(INDEX($C:$C, MATCH($G2, $A:$A, 0)):INDEX($C:$C, MATCH($G2, $A:$A)), MATCH(0, IF(SIGN(LEN(INDEX($C:$C, MATCH($G2, $A:$A, 0)):INDEX($C:$C, MATCH($G2, $A:$A)))), COUNTIF($G2:G2, INDEX($C:$C, MATCH($G2, $A:$A, 0)):INDEX($C:$C, MATCH($G2, $A:$A))), 1), 0)),
    IFERROR(INDEX(INDEX($D:$D, MATCH($G2, $A:$A, 0)):INDEX($D:$D, MATCH($G2, $A:$A)), MATCH(0, IF(SIGN(LEN(INDEX($D:$D, MATCH($G2, $A:$A, 0)):INDEX($D:$D, MATCH($G2, $A:$A)))), COUNTIF($G2:G2, INDEX($D:$D, MATCH($G2, $A:$A, 0)):INDEX($D:$D, MATCH($G2, $A:$A))), 1), 0)),
    TEXT(,)))))

Addendum³

There was a problem with single city entries and this amendment takes care of that.

You may receive a Circular Reference warning with this. Technically, the warning is true but you wil never actually get a circular reference due to the nested IF structure. In other words, the circular reference is only valid when an IF is false. Excel reports this because it does not examine the conditions; only the possibility of a circular reference which technically is true but can never actually occur.

'in H2 as an array formula with CSE
=IF($G2=$G1, H1, IF(COUNTIF($A:$A, $G2)=1,
    IFERROR(INDEX(INDEX($B:$D, MATCH($G2, $A:$A, 0), 0), MATCH(0, IF(INDEX($B:$D, MATCH($G2, $A:$A, 0), 0)<>"", COUNTIF($G2:G2, INDEX($B:$D, MATCH($G2, $A:$A, 0), 0)), 1), 0)), TEXT(,)),
    IFERROR(INDEX(INDEX($B:$B, MATCH($G2, $A:$A, 0)):INDEX($B:$B, MATCH($G2, $A:$A)), MATCH(0, IF(SIGN(LEN(INDEX($B:$B, MATCH($G2, $A:$A, 0)):INDEX($B:$B, MATCH($G2, $A:$A)))), COUNTIF($G2:G2, INDEX($B:$B, MATCH($G2, $A:$A, 0)):INDEX($B:$B, MATCH($G2, $A:$A))), 1), 0)),
    IFERROR(INDEX(INDEX($C:$C, MATCH($G2, $A:$A, 0)):INDEX($C:$C, MATCH($G2, $A:$A)), MATCH(0, IF(SIGN(LEN(INDEX($C:$C, MATCH($G2, $A:$A, 0)):INDEX($C:$C, MATCH($G2, $A:$A)))), COUNTIF($G2:G2, INDEX($C:$C, MATCH($G2, $A:$A, 0)):INDEX($C:$C, MATCH($G2, $A:$A))), 1), 0)),
    IFERROR(INDEX(INDEX($D:$D, MATCH($G2, $A:$A, 0)):INDEX($D:$D, MATCH($G2, $A:$A)), MATCH(0, IF(SIGN(LEN(INDEX($D:$D, MATCH($G2, $A:$A, 0)):INDEX($D:$D, MATCH($G2, $A:$A)))), COUNTIF($G2:G2, INDEX($D:$D, MATCH($G2, $A:$A, 0)):INDEX($D:$D, MATCH($G2, $A:$A))), 1), 0)),
    TEXT(,))))))


¹ If your Excel version does not support the TEXTJOIN function, search this site for [excel][textjoin] to find alternatives in ragged-length, delimited string concatenation.

这篇关于索引多个列和匹配不同的值,返回列之间的唯一值列表的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆