Determining the number of occurrences of each word in a cell array
Question
I have a huge vector of words, and I want a vector containing only the unique words, together with the frequency of each word. I've already tried hist and histc, but they only work on numeric values.
I know about the function tabulate, but it wraps each word in single quotes (e.g. this becomes 'this').
If you have any idea how to do this in MATLAB, that would be great. Thanks.
Recommended answer
You were on the right track! Just use unique first to prepare the numeric input for hist. The trick is that the word occurrence indices returned as the third output of unique can be used as input for the hist function, so you can get the counts without an explicit for loop:
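words = {'abba' 'bed' 'carrot' 'damage' 'bed'};
% occurrences(k) is the index of words{k} within unique_words
[unique_words, ~, occurrences] = unique(words);
% count how many times each index 1..max(occurrences) appears
unique_counts = hist(occurrences, 1:max(occurrences));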
This yields:
>> unique_words
'abba' 'bed' 'carrot' 'damage'
>> unique_counts
1 2 1 1
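As a variation on the same idea, the indices from unique can also be tallied with accumarray instead of hist. This is only a sketch of an alternative, not part of the original answer; the variable name idx and the printing loop are illustrative choices.

```matlab
% Same example data as in the answer above.
words = {'abba' 'bed' 'carrot' 'damage' 'bed'};

% idx(k) is the position of words{k} within unique_words.
[unique_words, ~, idx] = unique(words);

% accumarray adds up a 1 for every occurrence of each index,
% giving one count per unique word (transposed to a row vector).
unique_counts = accumarray(idx(:), 1).';

% Print one "word: count" line per unique word.
for k = 1:numel(unique_words)
    fprintf('%s: %d\n', unique_words{k}, unique_counts(k));
end
```

For the sample input this should give the same counts as the hist version, and printing with fprintf avoids the quoted cell-array display that the question found distracting in tabulate.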