MATLAB: which letter does each bar in the histogram correspond to?

Question

I have 400 files, each containing about 500,000 characters, and those characters are drawn from only about 20 distinct letters. I want to make a histogram showing the 10 most frequently used letters (x-axis) and the number of times each one is used (y-axis). The code below is incomplete: I cannot tell which letter each bar corresponds to. What should I add? You may change the whole code, but I would prefer to keep this version. Please provide the complete code so that I can copy it directly into a script and run it.

    i = 1;
    z = zeros(1, 10);
    for i=1:400
        j = num2str(i);
        file_name = strcat('part',j,'txt');
        file_id = fopen(file_name);
        part = fread(file_id, inf, 'uchar');
        h = hist(part,10);
        z = z + h;
        fclose(file_id);
    end

Recommended answer

First of all, your use of hist is wrong. hist(data,10) divides the range of the data into 10 equal-width bins, so a single bin can cover more than one character in your files.
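
To see the effect, here is a minimal sketch on made-up data (the 20-letter sample string is an assumption chosen for illustration, not taken from the asker's files). It only uses hist with a bin count, as in the question:

```matlab
% Hypothetical sample: the character codes of the 20 letters 'a'..'t' (97..116)
data = double('abcdefghijklmnopqrst');

% With a scalar second argument, hist splits [min(data), max(data)]
% into that many equal-width bins
[counts, centers] = hist(data, 10);

disp(counts);    % every bin holds two different letters
disp(centers);   % the bin centers are not integer character codes
```

Because each bar here mixes two letters, no label could say which single letter a bar stands for.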

One way to solve this is to call hist with predefined bins, one per character code, like this:

bins = 1:255;                        % one bin per possible character code
histSum = zeros(numel(bins),1);

for file = 1:10
    data = randi([0 25],100) + 'a';  % generate random data: letters between 'a' and 'z'
    data = data(:);                  % make it a column vector

    counts = hist(data,bins);        % occurrences of each character code
    histSum = histSum + counts(:);   % accumulate as a column
end

Note that you have to define your bins to accommodate all possible character values, hence the range from 1 to 255. Each bin then corresponds to exactly one character code.
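
The answer stops before the labeling step the question asks about. One possible way to finish, sketched here under assumptions: histSum is rebuilt from a short made-up string so the snippet runs on its own, and the names sample, nTop, topCounts, topCodes and topLetters are hypothetical. With real data, histSum would come from the loop above.

```matlab
% Hypothetical stand-in for the accumulated counts:
% histSum(k) is how many times character code k occurred.
bins = 1:255;
sample = double('aaaabbbccdtttttt');           % made-up text, illustration only
histSum = accumarray(sample(:), 1, [numel(bins) 1]);

% Sort the counts in descending order and keep the 10 most frequent codes
[sortedCounts, order] = sort(histSum, 'descend');
nTop = min(10, nnz(histSum));                  % fewer than 10 if fewer letters occur
topCounts = sortedCounts(1:nTop);
topCodes = bins(order(1:nTop));

% Convert each character code back to its letter for the tick labels
topLetters = num2cell(char(topCodes(:)));

bar(topCounts);
set(gca, 'XTick', 1:nTop, 'XTickLabel', topLetters);
xlabel('Letter');
ylabel('Count');
```

Since bins starts at 1, the index of each entry in histSum equals its character code, which is what lets char map the sorted positions back to letters.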
