N-gram frequency in Python with NLTK

Views: 69 Published: 2021/6/7 20:44:30 Tags: python pandas nltk tf-idf countvectorizer

Question

I want to write a function that returns the frequency of each n-gram in a given text. Please help. I wrote this code to count the frequency of 2-grams:

Code:

from nltk import FreqDist
from nltk.util import ngrams

def compute_freq():
    textfile = "please write a function"
    bigramfdist = FreqDist()
    # Iterate over lines; looping over the string directly would yield single characters.
    for line in textfile.splitlines():
        if len(line) > 1:
            tokens = line.strip().split(' ')
            bigrams = ngrams(tokens, 2)
            bigramfdist.update(bigrams)
    return bigramfdist

bigramfdist = compute_freq()

Recommended answer

The question doesn't include an expected output, so I assume this is what you need.

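import nltk

def compute_freq(sentence, n_value=2):
    # Split the sentence into word and punctuation tokens
    # (word_tokenize needs NLTK's tokenizer data to be downloaded first).
    tokens = nltk.word_tokenize(sentence)
    # Build every run of n_value consecutive tokens.
    ngrams = nltk.ngrams(tokens, n_value)
    # Count how often each n-gram occurs.
    ngram_fdist = nltk.FreqDist(ngrams)
    return ngram_fdist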

By default, this function returns the frequency distribution of bigrams. For example:

text = "This is an example sentence."
freq_dist = compute_freq(text)

Now freq_dist looks like this:

FreqDist({('is', 'an'): 1, ('example', 'sentence'): 1, ('an', 'example'): 1, ('This', 'is'): 1, ('sentence', '.'): 1})

From here you can print the keys and values like so:

for k,v in freq_dist.items():
    print(k, v) 

('is', 'an') 1
('example', 'sentence') 1
('an', 'example') 1
('This', 'is') 1
('sentence', '.') 1

For anything other than bigrams, just change the n_value argument when calling the function. For example:

freq_dist = compute_freq(text, n_value=3) #will give you trigram distribution

('example', 'sentence', '.') 1
('an', 'example', 'sentence') 1
('This', 'is', 'an') 1
('is', 'an', 'example') 1
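
To make clear what the function above is counting, here is a minimal sketch of the same idea using only the standard library. It is not part of the original answer: the name ngram_counts is hypothetical, and it assumes plain whitespace tokenization, so punctuation stays attached to words (unlike nltk.word_tokenize, which splits "sentence." into two tokens).

```python
from collections import Counter

def ngram_counts(text, n=2):
    # Assumption: whitespace tokenization; punctuation stays attached to words.
    tokens = text.split()
    # Zipping n staggered copies of the token list yields each window
    # of n consecutive tokens as a tuple.
    windows = zip(*(tokens[i:] for i in range(n)))
    # Counter maps each n-gram tuple to the number of times it occurs.
    return Counter(windows)

counts = ngram_counts("This is an example sentence.")
for gram, count in counts.items():
    print(gram, count)
```

Because the result is a Counter, counts.most_common(k) returns the k most frequent n-grams, which may be useful on longer texts where counts are not all 1.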
