Problem using the find function in MATLAB


Problem description


I have two arrays of data that I'm trying to amalgamate. One contains actual latencies from an experiment in the first column (e.g. 0.345, 0.455... never more than 3 decimal places), along with other data from that experiment. The other contains what is effectively a 'look up' list of latencies ranging from 0.001 to 0.500 in 0.001 increments, along with other pieces of data. Both data sets are X-by-Y doubles.

What I'm trying to do is something like...

for i = 1:length(actual_latency) 
   row = find(predicted_data(:,1) == actual_latency(i))
   full_set(i,1:4) = [actual_latency(i) other_info(i) predicted_info(row,2) ...
                      predicted_info(row,3)];
end

...in order to find the relevant row in predicted_data where the look up latency corresponds to the actual latency. I then use this to create an amalgamated data set, full_set.

I figured this would be really simple, but the find function keeps failing by throwing up an empty matrix when looking for an actual latency that I know is in predicted_data(:,1) (as I've double-checked during debugging).

Moreover, if I replace find with a for loop to do the same job, I get a similar error. It doesn't appear to be systematic - using different participant data sets throws it up in different places.

Furthermore, during debugging mode, if I use find to try and find a hard-coded value of actual_latency, it doesn't always work. Sometimes yes, sometimes no.

I'm really scratching my head over this, so if anyone has any ideas about what might be going on, I'd be really grateful.

Solution

You are likely running into a problem with floating point comparisons when you do the following:

predicted_data(:,1) == actual_latency(i)

Even though your numbers appear to only have three decimal places of precision, they may still differ by very small amounts that are not being displayed, thus giving you an empty matrix since FIND can't get an exact match.
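One way to see such hidden differences is to print the values with more digits than the default display shows. The snippet below is only a small illustration using the well-known case of 0.3 versus 3*0.1; the variable names are made up for the example and are not from the question.

```matlab
% Two values that look identical at the default display precision.
a = 0.3;
b = 3 * 0.1;

% Print 20 digits after the decimal point to expose the difference.
fprintf('a = %.20f\n', a);
fprintf('b = %.20f\n', b);

% Exact comparison fails, although the gap is far below 3 decimal places.
is_equal = (a == b);      % logical 0 (false)
gap      = abs(a - b);    % nonzero, on the order of 1e-17
fprintf('a == b: %d, |a - b| = %g\n', is_equal, gap);
```

Typing `format long` at the command prompt has a similar effect for values shown in the Command Window.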

One feature of floating point numbers is that certain numbers can't be exactly represented, since they can't be written as a finite sum of powers of 2. This is the case for the numbers 0.1 and 0.001. If you repeatedly add or multiply one of these numbers you can see some unexpected behavior. Amro pointed out one example in his comment: 0.3 is not exactly equal to 3*0.1. This can also be illustrated by creating your look-up list of latencies in two different ways. You can use the normal colon syntax:

vec1 = 0.001:0.001:0.5;

Or you can use LINSPACE:

vec2 = linspace(0.001,0.5,500);

You'd think these two vectors would be equal to one another, but think again:

>> isequal(vec1,vec2)

ans =

     0  %# FALSE!

This is because the two methods create the vectors by performing successive additions or multiplications of 0.001 in different ways, giving ever so slightly different values for some entries in the vector. You can take a look at this technical solution for more details.
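To get a feel for how small these discrepancies are, you could count the mismatching entries and look at the largest gap. This is a rough sketch; the exact counts may depend on the MATLAB version, so no expected output is given.

```matlab
% Build the same look-up list in the two ways discussed above.
vec1 = 0.001:0.001:0.5;
vec2 = linspace(0.001, 0.5, 500);

% Element-wise exact comparison: true where the two entries differ.
mismatch = (vec1 ~= vec2);
max_gap  = max(abs(vec1 - vec2));

fprintf('%d of %d entries differ exactly\n', nnz(mismatch), numel(vec1));
fprintf('largest absolute difference: %g\n', max_gap);
```

The largest difference should be many orders of magnitude smaller than the 0.001 step, which is why the values look identical when displayed.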

When comparing floating point numbers, you should therefore do your comparisons using some tolerance. For example, this finds the indices of entries in the look-up list that are within 0.0001 of your actual latency:

tolerance = 0.0001;
for i = 1:length(actual_latency)
  row = find(abs(predicted_data(:,1) - actual_latency(i)) < tolerance);
  ...
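For reference, here is one way the whole loop might look with the tolerance test in place. It is a minimal sketch under assumptions: the data are invented, the look-up values are assumed to sit in columns 2 and 3 of predicted_data (the question also mentions a predicted_info variable), and the handling of missing or duplicate matches is just one possible choice.

```matlab
% Hypothetical data: column 1 holds the latency, other columns are placeholders.
predicted_data = [(1:500)' * 0.001, rand(500, 2)];   % assumed look-up table layout
actual_latency = [0.345; 0.455; 0.003];              % made-up measured latencies
other_info     = [1; 2; 3];                          % made-up experiment data

tolerance = 0.0001;                        % well below the 0.001 step size
full_set  = nan(numel(actual_latency), 4); % preallocate; NaN marks "no match"

for i = 1:numel(actual_latency)
    % Rows whose look-up latency lies within the tolerance of the actual one.
    row = find(abs(predicted_data(:,1) - actual_latency(i)) < tolerance);
    if isscalar(row)
        full_set(i,:) = [actual_latency(i), other_info(i), predicted_data(row, 2:3)];
    else
        % Zero or several matches: report it instead of failing on assignment.
        warning('Latency %g matched %d rows; row %d left as NaN.', ...
                actual_latency(i), numel(row), i);
    end
end
```

Choosing a tolerance smaller than half the step size should ensure that at most one look-up entry can match each latency.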

The topic of floating point comparison is also covered in this related question.
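Since the latencies here are all multiples of 0.001, a different technique from the tolerance test is to convert both lists to whole milliseconds and match on those integers. This is offered only as a sketch of that alternative, with made-up data and hypothetical variable names (key_lookup, key_actual); it is not part of the original answer.

```matlab
% Hypothetical data with the same assumed layout as the look-up list.
predicted_data = [(1:500)' * 0.001, rand(500, 2)];
actual_latency = [0.345; 0.455; 0.003];

% Round to whole milliseconds so both sides become exact integer values.
key_lookup = round(predicted_data(:,1) * 1000);
key_actual = round(actual_latency * 1000);

% ismember returns a match flag and the matching row index for each latency.
[found, row] = ismember(key_actual, key_lookup);

% Pull the look-up columns for the latencies that were found.
matched = predicted_data(row(found), 2:3);
```

This avoids the loop entirely, but it only makes sense when the values are known to lie on a fixed grid such as 0.001 steps.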
