How to delete entire row when case sensitive duplicates are found in Excel (for 100k records or more)?
This is a follow-up to the question "How to remove duplicates that are case SENSITIVE in Excel (for 100k records or more)?".
The code in that answer only manipulates the data in column A. I would also like to delete the entire row of data when a case-sensitive duplicate is found.
Case-sensitive meaning:
- Case1
- case1
- cASE1
Are all unique records.
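To make the distinction concrete, here is a minimal sketch using VBA's built-in StrComp function. The procedure name is made up for this illustration; only StrComp and the vbTextCompare / vbBinaryCompare constants come from VBA itself.

```vba
Sub DemoCaseSensitivity()
    ' Hypothetical helper, for illustration only.
    ' Text comparison ignores case, so the two strings count as equal.
    Debug.Print StrComp("Case1", "case1", vbTextCompare)    ' prints 0

    ' Binary comparison looks at character codes, so "C" (67) and "c" (99) differ.
    Debug.Print StrComp("Case1", "case1", vbBinaryCompare)  ' prints -1
End Sub
```

A case-sensitive de-duplication needs the second kind of comparison, so that Case1, case1 and cASE1 each survive as separate records.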
You can use a Dictionary to check for binary uniqueness and variant arrays to speed things up. To use the Dictionary you will need to include a reference to the Microsoft Scripting Runtime library (Tools > References > Microsoft Scripting Runtime).
I've tested this with 100,000 rows; it takes 0.25 seconds on average on my laptop.
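Sub RemoveDuplicateRows()
    ' Assumes the used range of Sheet1 starts in row 1 and has a header row.
    Dim data As Range
    Set data = ThisWorkbook.Worksheets("Sheet1").UsedRange

    ' Read the whole range into a variant array in one go.
    Dim v As Variant, tags As Variant
    v = data.Value
    ReDim tags(1 To UBound(v, 1), 1 To 1)
    tags(1, 1) = 0 'keep the header

    Dim dict As Dictionary
    Set dict = New Dictionary
    dict.CompareMode = BinaryCompare 'case-sensitive keys

    ' Tag the first occurrence of each key with its row index; duplicates stay empty.
    Dim i As Long
    For i = LBound(v, 1) + 1 To UBound(v, 1) 'start below the header
        With dict
            If Not .Exists(v(i, 1)) Then 'v(i,1) comparing the values in the first column
                tags(i, 1) = i
                .Add Key:=v(i, 1), Item:=vbNullString
            End If
        End With
    Next i

    ' Write the tags into a helper column to the right of the data.
    Dim rngTags As Range
    Set rngTags = data.Columns(data.Columns.count + 1)
    rngTags.Value = tags

    ' Sorting by the tags keeps unique rows in their original order and
    ' moves the untagged (duplicate) rows to the bottom.
    Union(data, rngTags).Sort key1:=rngTags, Orientation:=xlTopToBottom, Header:=xlYes

    ' Last tagged row = last unique row.
    Dim count As Long
    count = rngTags.End(xlDown).Row
    rngTags.EntireColumn.Delete

    ' Delete everything below the last unique row. The + 1 also removes the
    ' (empty) row after the data, so the Resize never receives 0 rows.
    data.Resize(UBound(v, 1) - count + 1).Offset(count).EntireRow.Delete
End Sub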
Based on the brilliant answer from this question
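If adding the Microsoft Scripting Runtime reference is not an option, the same Dictionary can be created with late binding. The following is a minimal sketch under that assumption, not part of the original answer; the procedure name and the sample keys are made up for illustration. With late binding the Scripting library's BinaryCompare constant is not available, so VBA's own vbBinaryCompare (value 0) is used instead.

```vba
Sub DemoLateBoundDictionary()
    ' Hypothetical demo: late binding, so no reference to
    ' Microsoft Scripting Runtime is required.
    Dim dict As Object
    Set dict = CreateObject("Scripting.Dictionary")

    ' Case-sensitive keys. CompareMode must be set while the dictionary is empty.
    dict.CompareMode = vbBinaryCompare

    Dim sampleKeys As Variant, k As Variant
    sampleKeys = Array("Case1", "case1", "cASE1", "Case1")

    For Each k In sampleKeys
        ' Only the exact repeat of "Case1" is skipped.
        If Not dict.Exists(k) Then dict.Add k, vbNullString
    Next k

    Debug.Print dict.Count   ' prints 3
End Sub
```

In RemoveDuplicateRows this would mean declaring dict As Object, replacing New Dictionary with the CreateObject call, and swapping BinaryCompare for vbBinaryCompare. Late binding gives up IntelliSense and is usually marginally slower than the early-bound version.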