How to get the number of occurrences of letters that appear together in C# WinForms


Problem description

How can I get the number of occurrences of each letter where identical letters appear together (consecutive runs) in a C# WinForms application?

Example input:

string s = "aaaabbbcca";



Output

a-4,b-3,c-2,a-1

Note:
1) Please don't provide solutions that use loops or iteration, because of performance issues.
2) I have to use GC.Collect() because the string in my real case has a length of 10,000,000,000, so I need to free memory as soon as the data has been processed.

My code so far is:

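// Assumes snewbuild is a StringBuilder holding the input (its declaration is not shown).
StringBuilder Output = new StringBuilder();

int Times = 0;
char NewChar = snewbuild[0];
char Lastchar = NewChar; // Need first char
while (snewbuild.Length > 0)
{
    NewChar = snewbuild[0];
    if (Lastchar == NewChar)
    {
        Times++;
    }
    else
    {
        // The run of Lastchar has ended: record it and start a new run.
        Output.Append(Lastchar + "-" + Times + ",");
        Times = 1;
        GC.Collect();
    }
    Lastchar = NewChar;
    // Drop the character just processed from the front of the buffer.
    snewbuild.Remove(0, 1);
}
// Record the final run.
Output.Append(Lastchar + "-" + Times);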





Any code or new ideas for this question would be appreciated. Thanks in advance.

Recommended answer

// Requires: using System.Text;
public static string TallyPhraseFrequencies(string input)
{
    // Guard against null or empty input, where input[0] would throw.
    if (string.IsNullOrEmpty(input))
        return string.Empty;

    StringBuilder output = new StringBuilder();
    int frequency = 1;
    char c = input[0];
    for (int i = 1; i < input.Length; i++)
    {
        if (input[i] != c)
        {
            // The current run ended: emit it and start counting the new character.
            output.Append(c).Append('-').Append(frequency).Append(',');
            c = input[i];
            frequency = 1;
        }
        else
            frequency++;
    }
    // Emit the final run; this also covers single-character input
    // and avoids a trailing comma.
    output.Append(c).Append('-').Append(frequency);
    return output.ToString();
}


.NET objects cannot be larger than 2 GB. A string object on a 64-bit system stores Unicode text at two bytes per character, so it would probably max out at a little over 1 GB.

StringBuilder has a maximum capacity equal to Int32.MaxValue: 2,147,483,647.

While I believe you could read a 10 GB text file (5 GB of two-byte Unicode character codes) and process it chunk by chunk to analyze the frequency of adjacent identical letters, I cannot believe you can have an in-memory string of that size.

There is also simply no way to perform frequency analysis without some form of loop or iteration.
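To illustrate that last point, here is a minimal sketch that has no explicit loop in the source but still iterates internally. It uses a regular expression with a backreference plus LINQ. The names RegexRunLength and Encode are made up for this illustration and do not come from the posts above. The regex engine still has to scan every character, so this should not be expected to beat a plain loop.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexRunLength
{
    // (.)\1* matches one character followed by any number of repeats of it.
    // RegexOptions.Singleline lets '.' match '\n' as well.
    private static readonly Regex Run =
        new Regex(@"(.)\1*", RegexOptions.Singleline);

    // Hypothetical helper: returns "char-count" pairs joined by commas.
    public static string Encode(string input)
    {
        if (string.IsNullOrEmpty(input))
            return string.Empty;

        // Each match is one run: its first character is the letter,
        // its length is the number of adjacent occurrences.
        return string.Join(",",
            Run.Matches(input)
               .Cast<Match>()
               .Select(m => m.Value[0] + "-" + m.Length));
    }
}
```

Calling RegexRunLength.Encode("aaaabbbcca") yields "a-4,b-3,c-2,a-1". The loop has only moved into the regex engine and the LINQ enumerator.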

Anyhow, here's one way you could go about this:
private string parseFrequencies(string data)
{
    StringBuilder sb1 = new StringBuilder();
    StringBuilder sb2 = new StringBuilder();

    sb1.Append(data);

    int count, pos;

    while (sb1.Length > 0)
    {
        char c = sb1[0];

        pos = 1;
        count = 1;

        while (pos < sb1.Length)
        {
            if (sb1[pos] == c)
            {
                count++;
                pos++;
            }
            else
            {
                break;
            }
        }

        sb2.Append(string.Format("{0}-{1},", c, count));

        sb1.Remove(0, count);
    }

    return sb2.ToString();
}

// sample test
string freq = parseFrequencies("aaaabbbcca"); // => "a-4,b-3,c-2,a-1,"


The required loops can be tidied up a bit, for example:
public static void RunLengthEncode(string s) {
  Console.WriteLine(s);
  int pos = 0;
  while (pos < s.Length) {
    char c = s[pos];
    int startPos = pos;
    for (; pos < s.Length && c == s[pos]; ++pos) ;
    int runLength = pos - startPos;
    // example output
    Console.Write("{0}-{1},", c, runLength);
  }
  Console.WriteLine();
}








Alan.
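The chunk-by-chunk idea mentioned earlier in the thread could be sketched roughly as follows. This assumes the data can be exposed as a TextReader (for example a StreamReader over a file) instead of one giant string. The class name StreamingRunLength, the method Encode, and the default buffer size are assumptions made for this illustration, not anything from the original posts.

```csharp
using System.IO;

public static class StreamingRunLength
{
    // Reads characters in fixed-size chunks and writes comma-separated
    // "char-count" pairs. Only one buffer is held in memory at a time.
    public static void Encode(TextReader reader, TextWriter writer,
                              int bufferSize = 64 * 1024)
    {
        char[] buffer = new char[bufferSize];
        bool haveRun = false;   // false until the first character is seen
        bool wroteAny = false;  // controls comma placement
        char current = '\0';
        long count = 0;         // long: a run in a huge input may exceed int.MaxValue
        int read;

        while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                char ch = buffer[i];
                if (haveRun && ch == current)
                {
                    count++;
                }
                else
                {
                    // A new run starts; flush the previous one if there was one.
                    if (haveRun)
                        WriteRun(writer, current, count, ref wroteAny);
                    current = ch;
                    count = 1;
                    haveRun = true;
                }
            }
            // current and count carry over, so runs that span
            // two chunks are still counted as one run.
        }

        if (haveRun)
            WriteRun(writer, current, count, ref wroteAny);
    }

    private static void WriteRun(TextWriter writer, char c, long count,
                                 ref bool wroteAny)
    {
        if (wroteAny)
            writer.Write(',');
        writer.Write(c);
        writer.Write('-');
        writer.Write(count);
        wroteAny = true;
    }
}
```

Passing a StringReader over "aaaabbbcca" and a StringWriter gives "a-4,b-3,c-2,a-1". For a large file, a StreamReader and StreamWriter could be passed instead. Memory use stays at one buffer regardless of input size, so there is no need to call GC.Collect() inside the loop.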

