How to scan the value of each word from a table, calculate the result, and build a VSM (Vector Space Model) from it

Views: 116  Published: 2020/5/6 15:33:40  Tags: matlab, text-mining


Problem description

Say I have a table that contains the probability of each word from another table. The table has two classes, actual and non_actual. I will name it master_table:

actual = [0.5;0.4;0.6;0.75;0.23;0.96;0.532];
% for every word the two probabilities sum to 1: actual + non_actual = 1
non_actual = [0.5;0.6;0.4;0.25;0.77;0.04;0.468];
words = {'finn';'jake';'iceking';'marceline';'shelby';'bmo';'naptr'};
master_table = table(actual,non_actual,...
    'RowNames',words)

I also have a table that contains sentences. I will name it T2:

sentence = {'finn marceline naptr';'jake finn simon marceline haha';'jake finn finn jake iceking';'bmo shelby shelby finn naptr';'naptr naptr jake finn bmo shelby'}
T2 = table('RowNames',sentence)

How can I produce a result like the following? Words that do not appear in master_table, such as "simon" and "haha", take the value 1, so they do not affect the probability calculation that determines the class.

The actual and non_actual columns hold the product of the per-word probabilities for each sentence. The class column compares the two values: if actual > non_actual, the class should be "actual".

sentence                            actual                        non_actual                    class
finn marceline naptr                0.5 * 0.75 * 0.532            0.5 * 0.25 * 0.468
jake finn simon marceline haha      0.4 * 0.5 * 1 * 0.75 * 1      0.6 * 0.5 * 1 * 0.25 * 1
jake finn finn jake iceking
bmo shelby shelby finn naptr
naptr naptr jake finn bmo shelby
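
As a worked check of the arithmetic for one sentence, here is a minimal sketch that uses plain vectors instead of the table. The names `tokens`, `found`, `idx`, `p_actual`, `p_non_actual` and `label` are illustrative and not part of the original question. It relies on marceline's master_table values (0.75 for actual, 0.25 for non_actual) and treats unknown words as a factor of 1 by leaving them out of the product.

```matlab
% Word probabilities copied from master_table
words      = {'finn';'jake';'iceking';'marceline';'shelby';'bmo';'naptr'};
actual     = [0.5;0.4;0.6;0.75;0.23;0.96;0.532];
non_actual = [0.5;0.6;0.4;0.25;0.77;0.04;0.468];

sentence = 'jake finn simon marceline haha';
tokens = strsplit(sentence);            % split on whitespace

% found(k) is true when tokens{k} is a known word; idx(k) is its row in words
[found, idx] = ismember(tokens, words);

% Unknown words ('simon', 'haha') count as 1, so only known words are multiplied
p_actual     = prod(actual(idx(found)));
p_non_actual = prod(non_actual(idx(found)));

% Pick the class with the larger product
if p_actual > p_non_actual
    label = 'actual';
else
    label = 'non_actual';
end
fprintf('%s: actual=%.4f, non_actual=%.4f, class=%s\n', ...
    sentence, p_actual, p_non_actual, label);
```

For this sentence the products are 0.4 * 0.5 * 0.75 = 0.15 and 0.6 * 0.5 * 0.25 = 0.075, so the class comes out as "actual".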

And how can I build the VSM (vector space model) from the same data?

                                                          WORDS (sorted alphabetically)
                                    | bmo | finn | haha | iceking | jake | marceline | naptr | shelby | simon |
finn marceline naptr                   0     1      0       0        0        1          1       0        0
jake finn simon marceline haha         0     1      1       0        1        1          0       0        1
jake finn finn jake iceking            0     2      0       1        2        0          0       0        0
bmo shelby shelby finn naptr           1     1      0       0        0        0          1       2        0
naptr naptr jake finn bmo shelby       1     1      0       0        1        0          2       1        0
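
One possible way to build such a term-count matrix is sketched below. It assumes the VSM is meant to hold raw word counts per sentence (as the row with finn = 2 and jake = 2 suggests), and the names `tokens`, `vocab` and `vsm` are illustrative rather than taken from the original post.

```matlab
sentence = {'finn marceline naptr'; ...
            'jake finn simon marceline haha'; ...
            'jake finn finn jake iceking'; ...
            'bmo shelby shelby finn naptr'; ...
            'naptr naptr jake finn bmo shelby'};

% Split every sentence into a cell array of words
tokens = cellfun(@strsplit, sentence, 'UniformOutput', false);

% Vocabulary: all distinct words; unique() returns them in sorted order
vocab = unique([tokens{:}]);

% vsm(i,j) = number of times vocab{j} occurs in sentence i
vsm = zeros(numel(sentence), numel(vocab));
for i = 1:numel(sentence)
    [~, idx] = ismember(tokens{i}, vocab);   % column of each word in vocab
    vsm(i,:) = accumarray(idx(:), 1, [numel(vocab) 1])';
end

disp(vocab);
disp(vsm);
```

If a labelled table is preferred over a bare matrix, `array2table(vsm,'RowNames',sentence,'VariableNames',vocab)` should wrap the result with the sentences as row names and the words as column names.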

Recommended answer

This is a bit loopy as well, but I feel like performance is not an issue here. I would create a bigger table first and then change the values in a loop:

T2 = table(ones(height(T2),1),ones(height(T2),1),repmat({''},height(T2),1),'RowNames',sentence,'VariableNames',{'actual' 'non_actual' 'outcome'});

for i=1:height(T2)
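    % split the row name (the sentence) into its words
    A=strsplit(T2.Properties.RowNames{i});
    actual=1; % 1 is neutral for multiplication
    non_actual=1;
    for j=1:length(A)
       % words missing from master_table (e.g. 'simon', 'haha') are skipped,
       % which is the same as multiplying by 1
       if ismember(A{j},master_table.Properties.RowNames)
           actual = actual *  master_table{A{j},1};
           non_actual = non_actual *  master_table{A{j},2};
       end
    end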
    %if you need those
    T2.actual(i)=actual;
    T2.non_actual(i)=non_actual;

    if actual > non_actual
        T2.outcome(i)={'actual'};
    else
        T2.outcome(i)={'non_actual'};
    end;
end;
