How to delete the entire row when case-sensitive duplicates are found in Excel (for 100k records or more)?


Problem description


This is a follow-up to the question "How to remove duplicates that are case SENSITIVE in Excel (for 100k records or more)?".

The code in that answer only manipulates the data in column A. I'd also like to delete the entire row of data when a case-sensitive duplicate is found.

Case-sensitive meaning:

1. Case1
2. case1
3. cASE1

are all unique records.
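To make the distinction concrete, here is a minimal sketch using VBA's built-in `StrComp` function. The procedure name `ShowCaseSensitiveComparison` is only an illustrative choice and is not part of the original question.

```vba
Sub ShowCaseSensitiveComparison()
    ' StrComp returns 0 when the two strings are considered equal.
    ' Binary comparison is case-sensitive, so the strings differ.
    Debug.Print StrComp("Case1", "case1", vbBinaryCompare) = 0   ' prints False
    ' Text comparison ignores case, so the strings match.
    Debug.Print StrComp("Case1", "case1", vbTextCompare) = 0     ' prints True
End Sub
```

Under the binary rule, all three records in the list above count as distinct, which is the behaviour the question asks for.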

Solution

You can use a Dictionary to check for binary (case-sensitive) uniqueness, and variant arrays to speed things up. To declare the Dictionary as the code does, you need to add a reference to the Microsoft Scripting Runtime library:

(Tools > References > Microsoft Scripting Runtime)
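If adding the reference is not practical, late binding is a possible alternative. The sketch below is an assumption on my part rather than something the answer uses: it creates the dictionary with `CreateObject` and sets `CompareMode` with VBA's own `vbBinaryCompare` constant (value 0), because the library's `BinaryCompare` constant is not available without the reference. The procedure name is illustrative. The Scripting Runtime is a Windows component, so this is not expected to work in Excel for Mac.

```vba
Sub LateBoundDictionaryExample()
    ' Late binding: no reference to Microsoft Scripting Runtime is needed.
    Dim dict As Object
    Set dict = CreateObject("Scripting.Dictionary")

    ' CompareMode must be set while the dictionary is still empty.
    dict.CompareMode = vbBinaryCompare

    dict.Add "Case1", vbNullString
    Debug.Print dict.Exists("Case1")   ' prints True
    Debug.Print dict.Exists("case1")   ' prints False: keys are case-sensitive
End Sub
```

Early binding, as used in the answer, gives IntelliSense and compile-time checking. Late binding trades those for not depending on the reference being set in every workbook.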

I've tested this with 100,000 rows; it takes 0.25 seconds on average on my laptop.

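    Sub RemoveDuplicateRows()
        Dim data As Range
        Set data = ThisWorkbook.Worksheets("Sheet1").UsedRange

        ' Read the whole range into an array once; cell-by-cell access is far slower.
        Dim v As Variant, tags As Variant
        v = data.Value
        ReDim tags(1 To UBound(v, 1), 1 To 1)

        ' BinaryCompare makes the keys case-sensitive ("Case1" <> "case1").
        Dim dict As Dictionary
        Set dict = New Dictionary
        dict.CompareMode = BinaryCompare

        ' Tag the first occurrence of each key with its row index; duplicates stay Empty.
        ' The header row is always a first occurrence, so it is tagged as well.
        Dim i As Long
        For i = LBound(v, 1) To UBound(v, 1)
            If Not dict.Exists(v(i, 1)) Then 'v(i, 1) is the value in the first column
                tags(i, 1) = i
                dict.Add Key:=v(i, 1), Item:=vbNullString
            End If
        Next i

        ' Write the tags into a helper column immediately to the right of the data.
        Dim rngTags As Range
        Set rngTags = data.Columns(data.Columns.Count + 1)
        rngTags.Value = tags

        ' An ascending sort keeps tagged rows in their original order
        ' and pushes the untagged (duplicate) rows to the bottom.
        Union(data, rngTags).Sort Key1:=rngTags, Orientation:=xlTopToBottom, Header:=xlYes

        ' Number of rows to keep (header included), counted from the top of the data
        ' so that the offset below also works when the data does not start in row 1.
        Dim keptRows As Long
        keptRows = rngTags.End(xlDown).Row - data.Row + 1

        rngTags.EntireColumn.Delete
        data.Resize(UBound(v, 1) - keptRows + 1).Offset(keptRows).EntireRow.Delete
    End Sub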
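To check the quoted timing on your own data, a small wrapper around the procedure above is enough. This is only a sketch: `TimeRemoveDuplicateRows` is a hypothetical name, and `Timer` (seconds since midnight) gives a rough figure rather than a precise benchmark. Because `RemoveDuplicateRows` deletes rows, run it on a copy of the workbook.

```vba
Sub TimeRemoveDuplicateRows()
    ' Record the start time, run the procedure, and print the elapsed seconds
    ' to the Immediate window (Ctrl+G in the VBA editor).
    Dim t As Double
    t = Timer
    RemoveDuplicateRows
    Debug.Print "Elapsed: " & Format(Timer - t, "0.00") & " s"
End Sub
```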
    

Based on the brilliant answer from this question.
