Determining the number of occurrences of each word in a cell array


Question

I have a huge vector of words, and I want a vector containing only the unique words, together with the frequency of each word. I've already tried hist and histc, but they only work on numeric values. I know about the function tabulate, but it wraps the words in quotes (e.g. this becomes 'this'). If you have any idea how to do this in MATLAB, that would be great. Thanks.

Recommended answer

You were on the right track! Just use unique first to prepare the numeric input for hist. The trick is that the third output of unique gives, for every word in the input, the index of that word in the list of unique words. Those indices are numeric, so they can be passed straight to hist, and you get the counts without an explicit for loop:

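words = {'abba' 'bed' 'carrot' 'damage' 'bed'};
% occurrences(k) is the index of words{k} within unique_words
[unique_words, ~, occurrences] = unique(words);
% count how often each index appears, using one bin per unique word
unique_counts = hist(occurrences, 1:max(occurrences));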

This yields:

>> unique_words 
    'abba'    'bed'    'carrot'    'damage'

>> unique_counts
     1     2     1     1
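
The same counting step can also be sketched with accumarray, which may be useful because the MATLAB documentation for newer releases marks hist as not recommended. This variant is not part of the original answer. It assumes the same example cell array, and the variable name idx and the tab-separated output format are chosen here purely for illustration. The loop at the end prints each word without surrounding quotes, which is the issue the question raised about tabulate.

```matlab
% Sketch: count word occurrences with accumarray instead of hist.
words = {'abba' 'bed' 'carrot' 'damage' 'bed'};

% idx(k) is the position of words{k} within unique_words (sorted order)
[unique_words, ~, idx] = unique(words);

% Add up a 1 for every occurrence of each index.
% idx(:) forces a column vector, which accumarray expects as subscripts.
unique_counts = accumarray(idx(:), 1).';

% Print each word next to its count, without quotes (illustrative format)
for k = 1:numel(unique_words)
    fprintf('%s\t%d\n', unique_words{k}, unique_counts(k));
end
```

For the example input, unique_counts is again the row vector 1 2 1 1, in the same order as unique_words.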
