Counting Word Frequency (most significant words) in a String, excluding keywords


Problem description

I would like to count the frequency of words (excluding some keywords) in a string and sort them in descending order of frequency. How can I do it?

In the following string...

This is stackoverflow. I repeat stackoverflow.

Where the excluding keywords are

ExKeywords() ={"i","is"}

the output should be like

stackoverflow  
repeat         
this           

P.S. NO! I am not re-designing Google! :)

Solution

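using System;
using System.Linq;
using System.Text.RegularExpressions;

string input = "This is stackoverflow. I repeat stackoverflow.";
string[] keywords = new[] {"i", "is"};
Regex regex = new Regex("\\w+");  // one or more word characters

foreach (var group in regex.Matches(input)
    .OfType<Match>()
    .Select(c => c.Value.ToLowerInvariant())  // normalize case before comparing
    .Where(c => !keywords.Contains(c))        // drop the excluded keywords
    .GroupBy(c => c)                          // one group per distinct word
    .OrderByDescending(c => c.Count())        // most frequent first
    .ThenBy(c => c.Key))                      // ties in alphabetical order
{
    Console.WriteLine(group.Key);
}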
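
The answer above prints only the words. If the counts themselves are also needed, the same pipeline can be wrapped in a helper that returns word/count pairs. The following is a minimal sketch, not part of the original answer: the `WordFrequency` class and its `Count` method are hypothetical names, and the case-insensitive `HashSet<string>` for the excluded keywords is an assumption made here for faster lookups.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordFrequency
{
    // Hypothetical helper: returns each non-excluded word with its count,
    // most frequent first, ties broken by ordinal (alphabetical) order.
    public static List<KeyValuePair<string, int>> Count(string input, IEnumerable<string> excluded)
    {
        // Assumption: keywords are matched case-insensitively.
        var skip = new HashSet<string>(excluded, StringComparer.OrdinalIgnoreCase);

        return Regex.Matches(input, @"\w+")
            .Cast<Match>()
            .Select(m => m.Value.ToLowerInvariant())   // normalize case
            .Where(w => !skip.Contains(w))             // drop excluded keywords
            .GroupBy(w => w)                           // one group per distinct word
            .Select(g => new KeyValuePair<string, int>(g.Key, g.Count()))
            .OrderByDescending(p => p.Value)           // highest count first
            .ThenBy(p => p.Key, StringComparer.Ordinal)
            .ToList();
    }
}
```

Calling `WordFrequency.Count("This is stackoverflow. I repeat stackoverflow.", new[] { "i", "is" })` should yield `stackoverflow` with a count of 2, followed by `repeat` and `this` with a count of 1 each, which matches the order requested in the question.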
