Extract keywords from text in .NET


Question

I need to count how many times each keyword occurs in a string and sort the results by count, highest first. What is the fastest algorithm available in .NET code for this purpose?

Recommended answer

The code below groups the unique tokens and counts each one:

// Split the source text on spaces, then count each distinct token.
string[] target = src.Split(new char[] { ' ' });

var results = target
    .GroupBy(t => t)
    .Select(g => new { str = g.Key, count = g.Count() })
    .OrderByDescending(x => x.count);
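Since the question asks about speed, one common alternative is a single pass over the tokens with a Dictionary, which is roughly linear in the number of tokens. The following is only a minimal sketch: the names WordCounter and CountWords, the separator list, and the case-insensitive comparison are assumptions made for illustration, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WordCounter
{
    // Hypothetical helper: counts every token in one pass, then sorts by count.
    public static List<KeyValuePair<string, int>> CountWords(string src)
    {
        // Assumed separator set; adjust to whatever counts as a word boundary.
        char[] separators = { ' ', '\t', '\r', '\n', ',', ';', '.', ':', '!', '?' };
        var counts = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);

        foreach (string token in src.Split(separators, StringSplitOptions.RemoveEmptyEntries))
        {
            // 'current' is 0 when the token has not been seen yet.
            counts.TryGetValue(token, out int current);
            counts[token] = current + 1;
        }

        // Highest count first; ties are broken alphabetically for a stable order.
        return counts
            .OrderByDescending(pair => pair.Value)
            .ThenBy(pair => pair.Key, StringComparer.OrdinalIgnoreCase)
            .ToList();
    }
}
```

Calling WordCounter.CountWords(src) returns the key/value pairs already sorted, so the first element is the most frequent token. Whether this is actually the fastest option depends on the input size and would need to be measured.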

This is finally starting to make more sense to me...

The code below pairs each target substring with its count:

string src = "for each character in the string, take the rest of the " +
    "string starting from that character " +
    "as a substring; count it if it starts with the target string";
string[] target = {"string", "the", "in"};

var results = target.Select((t, index) => new {str = t, 
    count = src.Select((c, i) => src.Substring(i)).
    Count(sub => sub.StartsWith(t))});

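The results are now as follows. These are substring matches, so "in" is also counted inside "string" and "starting":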

+       [0] { str = "string", count = 4 }   <Anonymous Type>
+       [1] { str = "the", count = 4 }  <Anonymous Type>
+       [2] { str = "in", count = 6 }   <Anonymous Type>
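The approach above allocates a new substring for every character position in src. A sketch of the same count using String.IndexOf, which avoids those allocations, might look like the following. The names SubstringCounter and CountOccurrences are hypothetical, and the ordinal comparison is an assumption; the original code uses the culture-sensitive StartsWith overload, which gives the same counts for this plain ASCII sample.

```csharp
using System;

public static class SubstringCounter
{
    // Hypothetical helper: counts occurrences of 'target' in 'src',
    // including overlapping ones, without creating intermediate substrings.
    public static int CountOccurrences(string src, string target)
    {
        if (string.IsNullOrEmpty(src) || string.IsNullOrEmpty(target))
        {
            return 0;
        }

        int count = 0;
        int index = src.IndexOf(target, StringComparison.Ordinal);
        while (index >= 0)
        {
            count++;
            // Move forward by one character so overlapping matches are still found.
            index = src.IndexOf(target, index + 1, StringComparison.Ordinal);
        }
        return count;
    }
}
```

For the sample src and targets above, this sketch gives the same counts as the debugger output: 4 for "string", 4 for "the", and 6 for "in".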

The original code follows:

string src = "for each character in the string, take the rest of the " +
    "string starting from that character " +
    "as a substring; count it if it starts with the target string";
string[] target = {"string", "the", "in"};

var results = target.Select(t => src.Select((c, i) => src.Substring(i)).
    Count(sub => sub.StartsWith(t))).OrderByDescending(t => t);

With sincere thanks to this previous answer.

Results from the debugger (extra logic is needed to include each matching string with its count):

-       results {System.Linq.OrderedEnumerable<int,int>}    
-       Results View    Expanding the Results View will enumerate the IEnumerable   
        [0] 6   int
        [1] 4   int
        [2] 4   int
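The sorted output above contains only the counts, so it is no longer clear which count belongs to which keyword. One way to add the missing logic is to carry the keyword alongside its count before sorting. This is a minimal sketch under assumptions: KeywordRanking and Rank are hypothetical names, value tuples require C# 7 or later, and the comparison is ordinal rather than culture-sensitive.

```csharp
using System;
using System.Linq;

public static class KeywordRanking
{
    // Hypothetical helper: pairs each keyword with its substring count,
    // then sorts the pairs by count, highest first.
    public static (string Keyword, int Count)[] Rank(string src, string[] targets)
    {
        return targets
            .Select(t => (
                Keyword: t,
                // Same idea as the original: test every suffix of src for the prefix t.
                Count: Enumerable.Range(0, src.Length)
                    .Count(i => src.Substring(i).StartsWith(t, StringComparison.Ordinal))))
            .OrderByDescending(pair => pair.Count)
            .ToArray();
    }
}
```

Because OrderByDescending is a stable sort, keywords with equal counts keep the order they had in targets. For the sample input this yields "in" (6), then "string" (4), then "the" (4).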
