我怎样才能找到号码的列表中最常见的组合？ [英] How can I find most frequent combinations of numbers in a list?

查看：103 发布时间：2015/11/30 21:04:28 algorithm

本文介绍了我怎样才能找到号码的列表中最常见的组合？的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

想象一下，你有数字（或字母）的列表，如

1177783777297461145777267337774652113777236237118777

我想找到号码的最常见的组合，在这个列表：

1位长的组合 - 这是在这个列表中的最频繁的号码

2位长的组合 - 可能是11

3位数字长的组合 - 可能是'777'等

是否有这样的任务，一些特殊algorythm？

更新好了，我自己我有codeD以下（JAVA）。貌似执行时间是成比例的数据量乘以图案尺寸：

 公共静态无效的主要（字串[] args）
{
    INT DATA_SIZE = 10000;
    INT []数据=新的INT [DATA_SIZE]
    的for（int i = 0; I＆LT; DATA_SIZE;我++）
    {
        数据[I] =（int）的（10 *的Math.random（））％10;
        System.out.print（数据[I]）;
    }

    INT []样式1 =新INT [] {1，2，3};
    INT [] PATTERN2 =新INT [] {7,7，7};
    INT [] pattern3 =新INT [] {7,7};

    的System.out.println（）;
    的System.out.println（匹配（数据，样式1））;
    的System.out.println（匹配（数据，式样2））;
    的System.out.println（匹配（数据，pattern3））;
}

静态INT比赛（INT []的数据，INT []模式）
{
    INT匹配= 0;
    INT I = 0;
    而（I＆LT; data.length）
    {
        比赛的isEqual =（数据，我，模式）？匹配+ 1：匹配;
        我++;
    }
    返回匹配;
}

静态的isEqual布尔（INT []一，诠释了startIndex，INT [] A2）
{
    若（a == A2）
    {
        返回true;
    }
    若（a == NULL || A2 == NULL）
    {
        返回false;
    }

    的for（int i = 0; I＆LT; a2.length;我++）
    {
        如果（一个[在startIndex + 1]！= A2 [I]）
        {
            返回false;
        }
    }

    返回true;
}

解决方案

要找到最多的长度至少为k的序列重复的长度为n的字符串，你可以建立一个后缀树（的 http://en.wikipedia.org/wiki/Suffix_tree ）的线性时间，然后找到具有节点大多数儿童描述长度为k（或更多）的序列，从根

总的来说，这是在输入的字符串的长度线性时间

有关小k，你最好用天真的算法：

 从收藏导入计数器

高清输入（S，K）：
    C =计数器（）
    对于i中的xrange（LEN（S） -  k）的：
        C [S [I：I + K] + = 1
    返回c.most_common（1）[0] [0]


对于k在的xrange（1，4）：
    打印输入（'1177783777297461145777267337774652113777236237118777'，K）

Imagine you have a list of numbers (or letters), such as

1177783777297461145777267337774652113777236237118777

I want to find the most frequent combinations of numbers in this list:

for 1-digit-long combinations - it is the most frequent number in this list

for 2-digit-long combinations - probably '11'

for 3-digits-long combinations - probably '777' etc

Is there some special algorythm for such a tasks?

UPDATE Well, by myself I have coded the following (Java). Looks like execution time is proportional to data size multiplied by pattern size:

public static void main(String[] args)
{
    int DATA_SIZE = 10000;
    int[] data = new int[DATA_SIZE];
    for (int i = 0; i < DATA_SIZE; i++)
    {
        data[i] = (int) (10 * Math.random()) % 10;
        System.out.print(data[i]);
    }

    int[] pattern1 = new int[]{1, 2, 3};
    int[] pattern2 = new int[]{7, 7, 7};
    int[] pattern3 = new int[]{7, 7};

    System.out.println();
    System.out.println(match(data, pattern1));
    System.out.println(match(data, pattern2));
    System.out.println(match(data, pattern3));
}

static int match(int[] data, int[] pattern)
{
    int matches = 0;
    int i = 0;
    while (i < data.length)
    {
        matches = isEqual(data, i, pattern) ? matches + 1 : matches;
        i++;
    }
    return matches;
}

static boolean isEqual(int[] a, int startIndex, int[] a2)
{
    if (a == a2)
    {
        return true;
    }
    if (a == null || a2 == null)
    {
        return false;
    }

    for (int i = 0; i < a2.length; i++)
    {
        if (a[startIndex + i] != a2[i])
        {
            return false;
        }
    }

    return true;
}

解决方案

To find the largest number of repeats of a sequence of length at least k in a string of length n, you can build a suffix tree (http://en.wikipedia.org/wiki/Suffix_tree) in linear time, then find the node that has the most children that describes sequences of length k (or more) from the root.

Overall, this is linear time in the length of the input string.

For small k, you're better off with a naive algorithm:

from collections import Counter

def input(s, k):
    c = Counter()
    for i in xrange(len(s) - k):
        c[s[i : i + k]] += 1
    return c.most_common(1)[0][0]


for k in xrange(1, 4):
    print input('1177783777297461145777267337774652113777236237118777', k)

这篇关于我怎样才能找到号码的列表中最常见的组合？的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

我怎样才能找到号码的列表中最常见的组合？ [英] How can I find most frequent combinations of numbers in a list?

问题描述

相关文章

C/C++最新文章

热门教程

热门工具

登录关闭

我怎样才能找到号码的列表中最常见的组合？ [英] How can I find most frequent combinations of numbers in a list?

问题描述

相关文章

C/C++最新文章

热门教程

热门工具

登录 关闭

登录关闭