标识以逗号分隔的字符串的新添加项 [英] Identifying new additions to a comma-delimited string
问题描述
我有一个Excel电子表格,其中包含50行以逗号分隔的数据.逗号分隔数据 从底部到顶部 内包含的要素数量,即第50行(最后一行)的分隔符最少,第1行(第一行)行)始终具有最多的分隔符.功能部件的数量随机增加,并且每个功能部件可以是唯一的或重复的.可以将多个或单个功能部件添加到每行的字符串中.这些要素将随机放置在前几行的逗号分隔字符串中,即可以将它们放置在前一行中字符串的中间,或者放在前一字符串的开头或结尾.如果一行中添加了多个,则它们可能不会放在一起.例如:
I have an excel spreadsheet with 50 rows of comma-delimited data. The number of features contained within the comma-delimited data increases from bottom to top i.e row 50 (the last row) always has the fewest delimiters, and row 1 (the first row) always has the most delimiters. The number of features increases randomly, and each feature can be either unique or duplicated. Either multiple or single features can be added to the string on each row. The features are placed randomly into the previous rows' comma-delimited string i.e they can be placed into the middle of the string on the previous row, or at the beginning or the end of the previous string. If there are multiple added to a row, they may not be placed together. For example:
1 fish,pig,cat,dog,fish,mouse,fish,cow
2 pig,cat,dog,fish,mouse,fish
3 pig,cat,dog,fish,mouse
4 pig,cat,dog,mouse
5 pig,cat,dog,mouse
6 cat,dog,mouse
7 cat,mouse
8 cat,mouse
9 cat
10
我需要提取已添加到每行上以逗号分隔的字符串中的功能,最好使用UDF.上面示例中的期望输出为:
I need to extract the feature(s) that have been added to the comma-delimited string on each row, preferably using a UDF. The desired output from the above example would be:
1 fish,cow
2 fish
3 fish
4
5 pig
6 dog
7
8 mouse
9 cat
10
使用UDF比较相邻行并提取相邻列的两行之间的唯一值(例如,如果UDF用于B4中的第4行和第5行,则B4将为空白),我取得了一些成功. ,如果UDF用于B3中的第3行和第4行,则B3的值为"fish").但是,这会导致问题,因为某些功能是重复的(请参见上例中的第1行和第2行).当将重复项添加到字符串中时,这会导致UDF返回空白值.
I have had some success using a UDF that compares adjacent rows, and extracts any unique values between the two rows in an adjacent column (i.e if the UDF is used on rows 4 and 5 in B4, B4 will be blank; however, if the UDF is used on rows 3 and 4 in B3, B3 will have the value "fish") . However, this causes problems since some of the features are duplicated (see rows 1 and 2 in the above example). This results in the UDF returning a blank value when a duplicate has been added to the string.
我在堆栈交换中发现的这些(非常细微调整的)UDF(尤其是前者)取得了最大的成功:
I have had most success with these (very slightly adjusted) UDFs that I found on stack exchange, particularly the former:
Function NotThere(BaseText As String, TestText As String) As String
Dim V As Variant, BaseWords() As String
NotThere = "" & TestText & ","
For Each V In Split(BaseText, ",")
NotThere = Replace(NotThere, V & ",", ",")
Next
NotThere = Mid(Application.Trim(NotThere), 3, Len(NotThere) - 0)
End Function
和
Function Dups(R1 As String, R2 As String) As String
Dim nstr As String, R As Variant
For Each R In Split(R2, ",")
If InStr(R1, Trim(R)) = 0 Then
nstr = nstr & IIf(nstr = "", R, "," & R)
End If
Next R
Dups = nstr
End Function
我也尝试了此处建议的方法: http://www.ozgrid.com /VBA/array-differences.htm ,但不断出现#VALUE错误.
I have also tried the method suggested here: http://www.ozgrid.com/VBA/array-differences.htm, but continually get #VALUE errors.
推荐答案
迭代两个数组,并在找到重复项时将其删除.完成后返回剩下的内容:
iterate both arrays and remove as duplicates are found. When done return what is left:
Function newadd(rng1 As String, rng2 As String) As String
If rng1 = "" Then
newadd = rng2
Exit Function
End If
Dim spltStr1() As String
spltStr1 = Split(rng1, ",")
Dim spltstr2() As String
spltstr2 = Split(rng2, ",")
Dim i As Long, j As Long
Dim temp As String
For i = LBound(spltstr2) To UBound(spltstr2)
For j = LBound(spltStr1) To UBound(spltStr1)
If spltStr1(j) = spltstr2(i) Then
spltStr1(j) = ""
spltstr2(i) = ""
Exit For
End If
Next j
If spltstr2(i) <> "" Then
temp = temp & "," & spltstr2(i)
End If
Next i
newadd = Mid(temp, 2)
End Function
这篇关于标识以逗号分隔的字符串的新添加项的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!