Converting word frequency to a graphical histogram in Python

Question

This is what I have right now, thanks to Pavel Anossov. I am trying to convert the word frequencies it outputs into bars of asterisks.

from collections import Counter

def candidateWord():
    # Read the whole file into memory.
    with open("sample.txt", 'r') as f:
        text = f.read()
    # Lowercase, split on whitespace, and strip punctuation and digits from both ends of each word.
    words = [w.strip('!,.?1234567890-=@#$%^&*()_+') for w in text.lower().split()]
    counter = Counter(words)

    # Print one "word count" line per word, most frequent first.
    print("\n".join("{} {}".format(*p) for p in counter.most_common()))

candidateWord()

This is my current output:

how 3
i 2
am 2
are 2
you 2
good 1
hbjkdfd 1

The formula I want to use: if the most frequent word occurs M times and the current word occurs N times, the number of asterisks printed for the current word is:

(50 * N) / M
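
As a quick check of the formula against the sample output above, here is a minimal sketch. It is not part of the original post: `bar_length` and `print_histogram` are hypothetical helper names, and rounding down with integer division is an assumption, since the post does not say how fractional results should be handled.

```python
from collections import Counter

MAX_STARS = 50  # length of the longest bar, taken from the formula in the post


def bar_length(n, m, max_stars=MAX_STARS):
    """Return (max_stars * n) / m, rounded down (the rounding is an assumption)."""
    return (max_stars * n) // m


def print_histogram(counter):
    """Print one 'word ****' line per word, most frequent first."""
    if not counter:
        return  # nothing to print for an empty counter
    ranked = counter.most_common()
    m = ranked[0][1]  # M: count of the most frequent word
    width = max(len(word) for word, _ in ranked)  # pad labels to equal width
    for word, n in ranked:
        print("{:<{w}} {}".format(word, "*" * bar_length(n, m), w=width))


# Counts copied from the sample output in the question.
sample = Counter({"how": 3, "i": 2, "am": 2, "are": 2, "you": 2,
                  "good": 1, "hbjkdfd": 1})
print_histogram(sample)
```

With M = 3, the word "how" gets 50 asterisks, the words that occur twice get 33, and the words that occur once get 16.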

Recommended answer

Code:

from collections import Counter

def candidateWord():
    with open("sample.txt", 'r') as f:
        text = f.read()
    words = [w.strip('!,.?1234567890-=@#$%^&*()_+') for w in text.lower().split()]
    counter = Counter(words)

    # I added the code below...
    columns = 80        # total width of each output line
    n_occurrences = 10  # plot only the most common words
    to_plot = counter.most_common(n_occurrences)
    if not to_plot:
        return  # nothing to plot for an empty file
    labels, values = zip(*to_plot)
    label_width = max(map(len, labels))
    data_width = columns - label_width - 1  # room left after the label and '|'
    plot_format = '{:%d}|{:%d}' % (label_width, data_width)
    max_value = float(max(values))
    for label, value in zip(labels, values):
        # Scale so the most frequent word fills the whole data area.
        v = int(value / max_value * data_width)
        print(plot_format.format(label, '*' * v))

candidateWord()

Output:

the |***************************************************************************
and |**********************************************                             
of  |******************************************                                 
to  |***************************                                                
a   |************************                                                   
in  |********************                                                       
that|******************                                                         
i   |****************                                                           
was |*************                                                              
it  |**********                                                                 
