Comparing and finding the error between two arrays of different sizes using numpy

Question

I have csv files that all vary in length. I also have a truth file sampled 10 times per second, whereas the measured data is recorded once per second, on second boundaries. I am trying to match these second boundaries so that I can compare the error when automating tests. Below is an example of what my csv files look like.

Truth file

0,   1
0.1, 2
0.2, 3
.
.
.
x, n

Measured file

0, 1.01
1, 9.99
3, 30.05
.
.
.
x, n

I pull in the dataset for the truth file on each data pull for my measured results, and I am trying to do a quick comparison to see whether the data value at a given time in the measured file is within a margin of error of the value at the same time in the truth file. How can I search an array for a matching value without using a for loop to scan the whole array every time I sample for data changes?

Recommended answer

Update:

Assumptions:

  • data and truth are given in four arrays: tr_t (truth times), tr_v (truth values), da_t (data times) and da_v (data values)
  • the truth data are complete and sampled at 10 Hz, in other words tr_t = np.arange(N) / 10

Under these assumptions, the index of the truth record matching a given data sample (da_t[i], da_v[i]) is ind = int(np.round(da_t[i] * 10)). If data times can't be relied upon to fall exactly on tenths of a second, np.isclose(da_t[i], tr_t[ind], reltol, abstol) can be used to filter out insufficient matches, where reltol and abstol are the relative and absolute tolerances you choose (the rtol and atol parameters of np.isclose). Values are compared in the same manner.

In vectorized form:

import numpy as np

inds = np.round(10 * da_t).astype(int)                   # truth index for each data sample
mask = (np.isclose(da_v, tr_v[inds], reltol, abstol)     # required: values agree
        & np.isclose(da_t, tr_t[inds], reltol, abstol))  # optional: times agree
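
As a rough end-to-end illustration of the vectorized lookup, here is a minimal sketch on synthetic data. The array contents and the tolerances (rtol = 0, atol = 0.1) are assumptions made up for this example; only the names tr_t, tr_v, da_t and da_v come from the answer. The errors array is an extra convenience, not part of the original answer.

```python
import numpy as np

# Synthetic truth data sampled at 10 Hz (made-up values, for illustration only).
tr_t = np.arange(50) / 10            # 0.0, 0.1, ..., 4.9
tr_v = 10 * tr_t                     # truth value at each time

# Synthetic measurements on whole-second boundaries; the last one is deliberately off.
da_t = np.array([0.0, 1.0, 3.0, 4.0])
da_v = np.array([0.01, 9.99, 30.05, 45.0])

rtol, atol = 0.0, 0.1                # assumed tolerances: absolute error up to 0.1

# Map each measurement time to the index of the matching truth sample.
inds = np.round(10 * da_t).astype(int)

# Signed error of each measurement against its truth value.
errors = da_v - tr_v[inds]

# True where both the value and the time agree within tolerance.
mask = (np.isclose(da_v, tr_v[inds], rtol=rtol, atol=atol)
        & np.isclose(da_t, tr_t[inds], rtol=rtol, atol=atol))

print(inds)
print(errors)
print(mask)
```

Only the last measurement falls outside the tolerance here, so mask flags it as False while the first three are True. No Python-level loop is involved, so the cost per data pull is a handful of array operations.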

Finding the indices of the nearest truth samples if tr_t is irregular but still in ascending order:

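inds = np.searchsorted(tr_t, da_t)                 # first index with tr_t[inds] >= da_t
inds = np.clip(inds, 1, len(tr_t) - 1)             # keep both inds - 1 and inds in range
inds -= da_t - tr_t[inds - 1] < tr_t[inds] - da_t  # step back where the left neighbour is closer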
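
The nearest-sample lookup for irregular truth times can be sketched as a small self-contained function. The helper name nearest_indices and the sample arrays are hypothetical, introduced only for this example. The sketch assumes tr_t is sorted in ascending order and has at least two entries, and it resolves exact ties in favour of the later truth sample.

```python
import numpy as np

def nearest_indices(tr_t, da_t):
    """For each entry of da_t, return the index of the closest entry in ascending tr_t."""
    tr_t = np.asarray(tr_t, dtype=float)
    da_t = np.asarray(da_t, dtype=float)
    # Index of the first truth time >= da_t, clipped so inds - 1 and inds are both valid.
    inds = np.clip(np.searchsorted(tr_t, da_t), 1, len(tr_t) - 1)
    # True where the left neighbour is strictly closer than the right one.
    left_closer = (da_t - tr_t[inds - 1]) < (tr_t[inds] - da_t)
    return inds - left_closer

# Made-up, irregularly spaced truth times and a few measurement times.
tr_t = np.array([0.0, 0.09, 0.21, 0.3, 1.02, 2.0])
da_t = np.array([0.0, 0.1, 1.0, 1.6, 5.0])

inds = nearest_indices(tr_t, da_t)
print(inds)  # [0 1 4 5 5]
```

Measurement times beyond the end of tr_t (5.0 above) are mapped to the last truth sample, so a follow-up np.isclose check on the times is still useful for rejecting matches that are too far apart.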
