Comparing and finding the error between two arrays of different sizes using numpy
Question
I have CSV files that all vary in length. My truth file is sampled 10 times per second, whereas the data being recorded is sampled once per second, on second boundaries. I am trying to match these second boundaries so I can compute the error for automated tests. Below is an example of what my CSV files look like.
Truth file
0, 1
0.1, 2
0.2, 3
.
.
.
x, n
Measured file
0, 1.01
1, 9.99
3, 30.05
.
.
.
x, n
I pull in the truth dataset on each data pull for my measured results, and I am trying to do a quick comparison to see whether the value recorded at a given time in the measured file is within a margin of error of the value at the same time in the truth file. How can I search an array for a matching value without using a for loop over the whole array every time I sample new data?
Recommended answer
Update:
Assumptions:
- data and truth are given in four arrays: tr_t (truth times), tr_v (truth values), da_t (data times) and da_v (data values)
- the truth data are complete and sampled at 10 Hz, in other words tr_t = np.arange(N) / 10
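A minimal sketch of how the four arrays might be obtained, assuming each CSV has two comma-separated columns (time, value) and no header row. The in-memory strings below are made-up stand-ins for the real files; np.loadtxt also accepts a file path in their place.

```python
import io
import numpy as np

# Hypothetical stand-ins for the two files (columns: time, value).
truth_csv = io.StringIO("0,1\n0.1,2\n0.2,3\n0.3,4\n")
measured_csv = io.StringIO("0,1.01\n0.2,3.02\n")

# unpack=True returns one array per column.
tr_t, tr_v = np.loadtxt(truth_csv, delimiter=",", unpack=True)
da_t, da_v = np.loadtxt(measured_csv, delimiter=",", unpack=True)

# Check the "complete, 10 Hz" assumption before relying on index arithmetic.
truth_is_regular = np.allclose(tr_t, np.arange(len(tr_t)) / 10)
```

If truth_is_regular comes out False, the index arithmetic tr_t = np.arange(N) / 10 does not hold and a search-based lookup is the safer choice.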
Under these assumptions, the index of the truth record matching a given data sample (da_t[i], da_v[i]) is
ind = int(np.round(da_t[i] * 10))
If the data times can't be relied upon to fall exactly on tenths of seconds, np.isclose(da_t[i], tr_t[ind], reltol, abstol) can be used to filter out insufficient matches; reltol and abstol are passed positionally as np.isclose's rtol and atol. Values are compared in the same manner.
In vectorized form:
inds = np.round(10 * da_t).astype(int)
# Parentheses replace the backslash continuation, which cannot be followed by a comment.
mask = (np.isclose(da_v, tr_v[inds], reltol, abstol)     # required
        & np.isclose(da_t, tr_t[inds], reltol, abstol))  # optional
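The vectorized comparison can be tried end to end on synthetic data. In this sketch the truth signal, the measurements and the tolerance values are all invented for illustration, and the signed error is computed alongside the mask.

```python
import numpy as np

# Synthetic truth at 10 Hz (made-up signal: value = 10 * t + 1).
N = 50
tr_t = np.arange(N) / 10
tr_v = 10 * tr_t + 1

# Hypothetical measurements on whole seconds; the last one is deliberately off.
da_t = np.array([0.0, 1.0, 3.0, 4.0])
da_v = np.array([1.01, 10.99, 31.05, 45.0])

reltol, abstol = 0.0, 0.1  # illustrative tolerances (np.isclose's rtol and atol)

inds = np.round(10 * da_t).astype(int)  # matching truth index per measurement
error = da_v - tr_v[inds]               # signed error of each measurement
mask = (np.isclose(da_v, tr_v[inds], rtol=reltol, atol=abstol)
        & np.isclose(da_t, tr_t[inds], rtol=reltol, atol=abstol))
```

Here mask marks the first three measurements as within tolerance and the last as out of tolerance. Measurements timed past the end of the truth data would have to be dropped first, because tr_v[inds] raises an IndexError for out-of-range indices.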
Finding the indices if tr_t is irregular but still in ascending order:
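inds = np.searchsorted(tr_t, da_t)      # first index with tr_t[inds] >= da_t
inds = np.clip(inds, 1, len(tr_t) - 1)  # keep both neighbours in range
inds -= da_t - tr_t[inds - 1] < tr_t[inds] - da_t  # step back when the left neighbour is closer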
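One way to spell out a nearest-sample lookup on an irregular, ascending tr_t is sketched below. The helper name nearest_indices and the sample values are made up for illustration. np.searchsorted returns the insertion index (with the default side, the first index i where tr_t[i] >= t), so the two candidate neighbours are i - 1 and i. The sketch assumes tr_t has at least two entries.

```python
import numpy as np

def nearest_indices(tr_t, da_t):
    """Index of the entry of ascending tr_t closest to each entry of da_t."""
    right = np.searchsorted(tr_t, da_t)
    # Clip so that both right and right - 1 are valid indices.
    right = np.clip(right, 1, len(tr_t) - 1)
    left = right - 1
    left_is_closer = da_t - tr_t[left] < tr_t[right] - da_t
    return np.where(left_is_closer, left, right)

tr_t = np.array([0.0, 0.1, 0.25, 0.4, 1.0])  # irregular truth times
da_t = np.array([0.0, 0.11, 0.3, 0.9, 2.0])  # last value lies past the truth data
inds = nearest_indices(tr_t, da_t)
```

Data times outside the truth range are mapped to the first or last truth sample, so a follow-up np.isclose check on the times is still needed to reject matches that are too far away.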