Minimize LINQ string token counter


Question

Follow-up to an answer to an earlier question (http://stackoverflow.com/questions/4035563/extract-keywords-from-text-in-net/4035757#4035757).

Is there a way to further reduce this, avoiding the external String.Split call? The goal is an associative container of {token, count}.

string src = "for each character in the string, take the rest of the " +
    "string starting from that character " +
    "as a substring; count it if it starts with the target string";

string[] target = src.Split(new char[] { ' ' });

var results = target.GroupBy(t => new
{
    str = t,
    count = target.Count(sub => sub.Equals(t))
});

Solution

As written, it works to some extent but is terribly inefficient. The result is also an enumeration of groupings, not the (word, count) pairs you might be expecting.

That overload of GroupBy() takes a function that selects the key, so the anonymous object, including its target.Count(...) call, is evaluated for every item in the collection. Without going the route of using regular expressions to ignore punctuation, it should be written like so:

string src = "for each character in the string, take the rest of the " +
             "string starting from that character " +
             "as a substring; count it if it starts with the target string";

var results = src.Split()               // default split by whitespace
                 .GroupBy(str => str)   // group words by the value
                 .Select(g => new
                              {
                                  str = g.Key,      // the value
                                  count = g.Count() // the count of that value
                              });

// sort the results by the words that were counted
var sortedResults = results.OrderByDescending(p => p.str);
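
If the goal is a true associative container built without String.Split, one possible variation is sketched below. It is not part of the original answer. It takes the regular-expression route mentioned above, assuming a token is any run of word characters (\w+), and materializes the groups with ToDictionary. The name counts is illustrative, and the top-level statements assume C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

string src = "for each character in the string, take the rest of the " +
             "string starting from that character " +
             "as a substring; count it if it starts with the target string";

// Assumption: a token is a run of word characters, so "string," and
// "substring;" are counted as "string" and "substring".
Dictionary<string, int> counts = Regex.Matches(src, @"\w+")
    .Cast<Match>()                   // MatchCollection -> IEnumerable<Match>
    .GroupBy(m => m.Value)           // group identical tokens
    .ToDictionary(g => g.Key,        // token
                  g => g.Count());   // number of occurrences

// Print the tokens, most frequent first.
foreach (var pair in counts.OrderByDescending(p => p.Value))
    Console.WriteLine($"{pair.Key}: {pair.Value}");
```

With this tokenization "the" maps to 4 and "string" to 3, because the trailing comma no longer makes "string," a separate token. Matching is case-sensitive here; passing StringComparer.OrdinalIgnoreCase to GroupBy would be one way to change that.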
