How to get word frequencies efficiently with Ruby?

Question

Sample input:

"I was 09809 home -- Yes! yes!  You was"

Expected output:

{ 'yes' => 2, 'was' => 2, 'i' => 1, 'home' => 1, 'you' => 1 }

My code doesn't work:

def get_words_f(myStr)
    myStr=myStr.downcase.scan(/\w/).to_s;
    h = Hash.new(0)
    myStr.split.each do |w|
       h[w] += 1 
    end
    return h.to_a;
end

print get_words_f('I was 09809 home -- Yes! yes!  You was');

Recommended answer

This works, but I'm fairly new to Ruby too, so there may be a better solution.

def count_words(string)
  words = string.split(' ')   # split on runs of whitespace
  frequency = Hash.new(0)     # missing keys default to a count of 0
  words.each { |word| frequency[word.downcase] += 1 }
  frequency
end

Instead of .split(' '), you could also use .scan(/\w+/). However, .scan(/\w+/) would split "aren't" into "aren" and "t", while .split(' ') keeps it whole. On the other hand, .split(' ') leaves punctuation attached to the words (for example "yes!"), whereas .scan(/\w+/) drops it.
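To make the difference concrete, here is a small sketch that runs both tokenizers on the same sentence. The helper names tokens_by_split and tokens_by_scan and the sample sentence are made up for illustration; they are not part of the original answer.

```ruby
# Hypothetical helpers, named here only for illustration.

# Splits on whitespace; punctuation stays attached to each token.
def tokens_by_split(string)
  string.downcase.split(' ')
end

# Extracts runs of word characters (letters, digits, underscore).
def tokens_by_scan(string)
  string.downcase.scan(/\w+/)
end

sample = "They aren't home -- Yes!"
p tokens_by_split(sample)  # ["they", "aren't", "home", "--", "yes!"]
p tokens_by_scan(sample)   # ["they", "aren", "t", "home", "yes"]
```

Which one fits depends on whether contractions or clean punctuation-free words matter more for the input at hand.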

Output of the sample code:

print count_words('I was 09809 home -- Yes! yes!  You was');

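# With .split(' ') as written above, punctuation stays attached:
# {"i"=>1, "was"=>2, "09809"=>1, "home"=>1, "--"=>1, "yes!"=>2, "you"=>1}
# With .scan(/\w+/) in place of .split(' '):
# {"i"=>1, "was"=>2, "09809"=>1, "home"=>1, "yes"=>2, "you"=>1}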
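Neither variant quite matches the output asked for in the question, which leaves out "09809" and lists the most frequent words first. One possible way to get closer is sketched below. It assumes Ruby 2.7 or later (for Enumerable#tally), assumes that only ASCII letters count as words, and breaks ties alphabetically, which is a choice made here rather than something the question specifies. The name word_frequencies is hypothetical.

```ruby
# Hypothetical helper: counts alphabetic words only, most frequent first.
# Assumes Ruby 2.7+ for Enumerable#tally.
def word_frequencies(string)
  string.downcase
        .scan(/[a-z]+(?:'[a-z]+)*/)  # letters, with optional inner apostrophes; digits are skipped
        .tally                       # builds { word => count }
        .sort_by { |word, count| [-count, word] }  # by count descending, then alphabetically
        .to_h
end

p word_frequencies('I was 09809 home -- Yes! yes!  You was')
```

For the sample input this gives a count of 2 for "was" and "yes" and 1 for "home", "i", and "you". Because the pattern allows an apostrophe between letters, "aren't" stays a single word.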
