根据字符串内容对组下的每个字符串进行分类(1 次)?(Excel) [英] Categorize Each String under Group (1 time), based on String content? (Excel)

查看:17
本文介绍了根据字符串内容对组下的每个字符串进行分类(1 次)?(Excel)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

复杂的问题……我先解释一下,也许有更好的解决方案而不是使用迭代计算:

Complicated problem... let me explain first, maybe there is a better solution rather than using iterative calculations:

(工作簿链接)

图像显示示例(显示我正在使用的内容)

问题:

有 4,000 多个字符串,并希望将它们分类为预先确定的组(基于字符串的内容).

Have 4,000+ Strings and would like to categorize them into pre-determined Groups (based on the String's content).

  1. 每个字符串只能分配给一个组.(即55 加仑桶水龙头"将列在水龙头"列下,因为它包含水龙头"一词.)

  1. Each String should only be assigned to one Group. (ie. "55 gallon drum faucet" will be listed under "faucet" column, since it contains the word "faucet".)

一旦归入一个组,该字符串将不会归入任何其他组.(即55 加仑桶水龙头"一旦被归类在水龙头"下,就不会被归类到桶"下).

Once categorized into a Group, the string won't be categorized under any other Groups. (ie. "55 gallon drum faucet" won't be categorized under "drum" once it's already been categorized under "faucet").

每个字符串在哪个组下并不重要,只要它被分类即可.

It really doesn't matter which Group each String does under, as long as it's categorized.

注意:(我几乎找到了使用迭代计算的解决方案,但效果不佳).

Note: (I've almost found a solution using iterative calculation, but it doesn't quite work).

解决方案:

我解决问题的方式是:

  1. 使用公式计算字符串(A 列)在工作表中重复的次数:

  1. Count the number of times the String (Column A) was duplicated in the worksheet using the formula:

 Formula: =COUNTIF($E$2:$IA$10000,A3)

  • 这个公式在 C 列中.
  • 创建了一个公式,该公式将根据字符串是否包含组词(即水龙头"、啤酒"、加仑"、厨房"等)对组下的字符串进行分类...AND 之前没有使用过(即 C 列,其中包含上述公式).

    Created a formula that will categorize a String underneath a Group based on whether the String contains the Group word (ie. "faucet", "beer", "gallon", "kitchen", etc)... AND has not been used before (ie. Column C, which contains the formula from above).

      Formula: =IF(C3<1,IF(IFERROR(SEARCH("faucet",A3),0)>0,A3,""),"")
    

  • 为 C 列中的所有 4,000 个字符串以及每个单独的组"列向下拖动公式.

  • Drag formula down for all 4,000 strings in Column C, and for each individual "Group" column.

    这种方法的问题在于它会进行迭代计算:

    1. 将字符串归类到一个组下(但不会将 Times Dup'd 字段从 0 增加到 1)...

    1. 将Times Dup'd"字段从 0 增加到 1,但会阻止将字符串分类到组"列下.

    <小时>

    关于如何解决迭代计算问题的任何建议?(我知道它一直在计算,因为它是依赖的,所以必须用 1 个正确"的解决方案来解决……我想知道是否有任何方法可以创建某种块",因此只能计算一个方式...)


    Any suggestions on how to address the iterative calculation issue? (I know it keeps calculating back and forth since it's dependent, so will have to settle with 1 "right" solution... I'm wondering if there's any way to create some sort of 'block' so it can only be calculated one way...)

    任何帮助将不胜感激!

    推荐答案

    通过您的数据运行此过程.它在一对变体数组中执行所有处理.

    Run this procedure through your data. It performs all processing within a pair of variant arrays.

    Sub byGroup()
        Dim g As Long, s As Long, aSTRs As Variant, aGRPs As Variant
    
        appTGGL bTGGL:=False
    
        With Worksheets("Sheet1")
            aSTRs = .Range(.Cells(2, 1), .Cells(Rows.Count, 1).End(xlUp)).Value2
            With .Range(.Cells(1, 5), .Cells(Rows.Count, 1).End(xlUp).Offset(0, Application.Match("zzz", .Rows(1)) - 1))
                .Resize(.Rows.Count, .Columns.Count).Offset(1, 0).ClearContents
                aGRPs = .Cells.Value2
            End With
    
            For s = LBound(aSTRs, 1) To UBound(aSTRs, 1)
                For g = LBound(aGRPs, 2) To UBound(aGRPs, 2)
                    If CBool(InStr(1, aSTRs(s, 1), aGRPs(1, g), vbTextCompare)) Then
                        aGRPs(s + 1, g) = aSTRs(s, 1)
                        Exit For
                    End If
                Next g
            Next s
    
            .Cells(1, 5).Resize(UBound(aGRPs, 1), UBound(aGRPs, 2)) = aGRPs
    
        End With
    
        appTGGL
    End Sub
    
    Public Sub appTGGL(Optional bTGGL As Boolean = True)
        Debug.Print Timer
        Application.ScreenUpdating = bTGGL
        Application.EnableEvents = bTGGL
        Application.DisplayAlerts = bTGGL
        Application.Calculation = IIf(bTGGL, xlCalculationAutomatic, xlCalculationManual)
    End Sub
    

    经过的时间(不包括您的工作表公式重新计算应在 1-2 秒范围内.

    Elapsed time (excluding your worksheet formula recalculation should be in the 1-2 second range.

    匹配组的优先级从左到右.如果您认为 '55 加仑桶' 应归为 而不是 加仑,那么请确保桶在第 1 行中的加仑之前.

    Priority of the matching groups is left to right. If you think '55 gallon drum' should be grouped as drum and not gallon then make sure that drum comes before gallon in row 1.

    将启用宏的新工作簿另存为 Excel 二进制工作簿 (.XLSB) 可使工作簿文件大小减少大约一半.

    Saving the new macro-enabled workbook as an Excel Binary Workbook (.XLSB) reduces the workbook file size by roughly half.

    这篇关于根据字符串内容对组下的每个字符串进行分类(1 次)?(Excel)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆