Find only relevant points in MATLAB

Problem description

I have a MATLAB function that finds characteristic points in a sample. Unfortunately, it only works about 90% of the time. When I know where in the sample to look, however, I can increase this to almost 100%. So I would like to know whether MATLAB has a function that would let me find the range where most of my results lie, so that I can then recalculate my characteristic points. I have a vector that stores all the results, and the correct results should lie within a range spanning 3% of the interval from -24.000 to 24.000, whereas wrong results are always lower than the correct range. Unfortunately, my background in statistics is very rusty, so I am not sure what this is called. Can somebody give me a hint as to what I should be looking for? Is there a function built into MATLAB that would give me the smallest possible range in which, e.g., 90% of the results lie?

I am sorry if I didn't make my question clear. Everything in my vector can only range between -24.000 and 24.000. About 90% of my results will be in a range that spans approximately 1.44 ([24-(-24)]*3% = 1.44). These are very likely to be the correct results. The remaining 10% are outside that range and always lower (which is why I am not sure taking the mean value is a good idea). These 10% are false and result from blips in my input data. To find the remaining 10%, I want to repeat my calculations, but this time check only the small range. So my goal is to identify where my correct range lies, delete the values I found outside that range, and then recalculate my values, not over the range between -24.000 and 24.000 but over the small range where I already found 90% of my values.

Recommended answer

The relevant points you're looking for are the percentiles:

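% generate sample data: 900 inliers plus 100 outliers spread on both sides
data = [randn(900,1); randn(50,1)*3 + 5; randn(50,1)*3 - 5];
subplot(121), hist(data)       % histogram of the sample
subplot(122), boxplot(data)    % box plot (Statistics Toolbox)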

% find 5th, 95th percentiles (range that contains 90% of the data)
limits = prctile(data, [5 95])

% find data in that range
reducedData = data(limits(1) < data & data < limits(2));
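
Note that the 5th and 95th percentiles give the central 90% of the data, which is not necessarily the smallest range the question asks about. A minimal sketch of one way to get the shortest interval covering a given fraction: sort the values and slide a window of k consecutive samples over them, keeping the narrowest window. The sample data, the seed, and the names `frac`, `shortest` and `inRange` are assumptions made for illustration; only base MATLAB functions are used.

```matlab
% Sketch: shortest interval that contains a given fraction of the samples.
rng(0);                                    % reproducible sample data (assumption)
% hypothetical data: 90% correct values near 20, 10% wrong values that are lower
data = [20 + 0.2*randn(900,1); -24 + 40*rand(100,1)];

frac   = 0.9;                              % fraction of samples the range must cover
sorted = sort(data);
n      = numel(sorted);
k      = ceil(frac * n);                   % number of samples inside the window

% width of every window of k consecutive sorted samples
widths = sorted(k:n) - sorted(1:n-k+1);
[~, i] = min(widths);                      % index of the narrowest window

shortest = [sorted(i), sorted(i+k-1)]      % lower and upper limit of that window
inRange  = data >= shortest(1) & data <= shortest(2);
reducedData = data(inRange);
```

Because the wrong results in the question always lie below the correct range, the narrowest window should settle on the cluster of correct values rather than being centred on the whole sample.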

Other approaches exist to detect outliers, such as the IQR outlier test and the three standard deviation rule, among many others:

%% three standard deviation rule
z = 3;
bounds = z * std(data)
reducedData = data( abs(data-mean(data)) < bounds );

%% IQR outlier test
Q = prctile(data, [25 75]);
IQ = Q(2)-Q(1);
%a = 1.5;   % mild outlier
a = 3.0;    % extreme outlier
bounds = [Q(1)-a*IQ , Q(2)+a*IQ]
reducedData = data(bounds(1) < data & data < bounds(2));
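
The question also doubts whether the mean is a good centre when all wrong results are lower. A hedged sketch of a median-based alternative: as long as more than half of the results are correct, the median falls inside the correct range, and since that range spans 1.44, every correct value lies within 1.44 of the median. The sample data and the names `span`, `center`, `keep` and `searchRange` are assumptions for illustration, not part of the original answer.

```matlab
% Sketch: median-centred window, using the known span of the correct range.
rng(1);                                    % reproducible sample data (assumption)
% hypothetical data: correct values span less than 1.44, wrong values are lower
data = [20 + 1.4*(rand(900,1) - 0.5); -24 + 40*rand(100,1)];

span   = (24 - (-24)) * 0.03;              % 1.44, the span given in the question
center = median(data);                     % not pulled down by the low 10%
keep   = abs(data - center) <= span;       % correct values are within one span of the median

reducedData = data(keep);
searchRange = [min(reducedData), max(reducedData)]   % narrowed range for the recalculation
```

The window here is deliberately generous (one full span on each side of the median), so it can keep a wrong value that happens to fall just below the correct range.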


BTW if you want to get the z value (|X|<z) that corresponds to 90% area under the curve, use:

area = 0.9;                 % two-tailed probability
z = norminv(1-(1-area)/2)
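
`norminv` is part of the Statistics Toolbox. If it is not available, the same z value can be obtained from base MATLAB, since for a standard normal variable P(|X|<z) = erf(z/sqrt(2)). This is a small sketch of that identity, not part of the original answer:

```matlab
area = 0.9;                    % two-tailed probability
z = sqrt(2) * erfinv(area)     % about 1.6449, matching norminv(1-(1-area)/2)
```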
