Peak-finding algorithm for Python/SciPy

Question

I can write something myself by finding zero-crossings of the first derivative or something, but it seems like a common-enough function to be included in standard libraries. Anyone know of one?

My particular application is a 2D array, but usually it would be used for finding peaks in FFTs, etc.

Specifically, in these kinds of problems, there are multiple strong peaks, and then lots of smaller "peaks" that are just caused by noise that should be ignored. These are just examples; not my actual data:

1-D peaks: [image]

2-D peaks: [image]

The peak-finding algorithm would find the location of these peaks (not just their values), and ideally would find the true inter-sample peak, not just the index with maximum value, probably using quadratic interpolation or something.
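
As a minimal sketch of the quadratic interpolation mentioned above (not a SciPy function; `refine_peak` is a hypothetical helper name), one can fit a parabola through the maximum sample and its two neighbours and take the vertex as the inter-sample peak. It assumes the peak index is not at either end of the array.

```python
import numpy as np

def refine_peak(y, i):
    """Fit a parabola through samples i-1, i, i+1 and return (position, height) of its vertex.

    Hypothetical helper for illustration; assumes 0 < i < len(y) - 1.
    """
    a, b, c = y[i - 1], y[i], y[i + 1]
    denom = a - 2 * b + c
    if denom == 0:                      # three collinear samples: nothing to refine
        return float(i), float(b)
    offset = 0.5 * (a - c) / denom      # within [-0.5, 0.5] when y[i] is a local maximum
    height = b - 0.25 * (a - c) * offset
    return i + offset, height

# Samples of a parabola whose true maximum lies between samples, at t = 3.3
t = np.arange(7)
y = 5.0 - (t - 3.3) ** 2
i = int(np.argmax(y))                   # coarse peak: index of the largest sample
pos, height = refine_peak(y, i)
print(i, pos, height)
```

For real data the parabola is only an approximation of the peak shape, so the refined position is an estimate rather than an exact value.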

Typically you only care about a few strong peaks, so they'd either be chosen because they're above a certain threshold, or because they're the first n peaks of an ordered list, ranked by amplitude.
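
The two selection rules described here could look roughly like the following sketch, which uses `scipy.signal.find_peaks` on a made-up signal. The threshold of 1.0 and n = 2 are illustrative assumptions, not values from the question.

```python
import numpy as np
from scipy.signal import find_peaks

# Made-up signal: three strong peaks plus three small "noise" peaks
y = np.zeros(50)
y[[10, 25, 40]] = [3.0, 5.0, 4.0]
y[[5, 18, 33]] = [0.3, 0.2, 0.4]

# Rule 1: keep only peaks whose amplitude is at least a threshold
strong, _ = find_peaks(y, height=1.0)

# Rule 2: find every local maximum, then keep the n largest by amplitude
n = 2
all_peaks, _ = find_peaks(y)
order = np.argsort(y[all_peaks])[::-1][:n]   # indices of the n tallest peaks
top_n = all_peaks[order]

print(strong)   # indices of peaks above the threshold
print(top_n)    # indices of the n tallest peaks, tallest first
```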

As I said, I know how to write something like this myself. I'm just asking if there's a pre-existing function or package that's known to work well.

Update:

I translated a MATLAB script and it works decently for the 1-D case, but could be better.

Update to the update:

sixtenbe created a better version for the 1-D case.

Recommended answer

The function scipy.signal.find_peaks, as its name suggests, is useful for this. But to get a good peak extraction it's important to understand its parameters width, threshold, distance and, above all, prominence.

According to my tests and the documentation, the concept of prominence is "the useful concept" to keep the good peaks, and discard the noisy peaks.

What is (topographic) prominence? It is "the minimum height necessary to descend to get from the summit to any higher terrain", as can be seen here: [image]

The idea is:

The higher the prominence, the more "important" the peak is.
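
To make the definition concrete, here is a small sketch on a hand-made profile (the numbers are invented for illustration) using `scipy.signal.peak_prominences`. A bump sitting on the shoulder of a taller summit is high in absolute terms but has low prominence, while a lower, separate summit has high prominence.

```python
import numpy as np
from scipy.signal import find_peaks, peak_prominences

# Hand-made profile: tall summit (5), shoulder bump (4.5), separate lower summit (3)
y = np.array([0, 1, 5, 4, 4.5, 2, 0, 3, 0], dtype=float)

peaks, _ = find_peaks(y)                      # every local maximum
prominences = peak_prominences(y, peaks)[0]   # one prominence value per peak

for idx, prom in zip(peaks, prominences):
    print(f"index {idx}: height {y[idx]}, prominence {prom}")

# Filtering on prominence drops the shoulder bump but keeps the lower summit
kept, _ = find_peaks(y, prominence=1)
print(kept)
```

The bump at height 4.5 only needs a descent of 0.5 (down to 4) before higher terrain is reachable, so it is discarded, whereas the summit at height 3 requires a descent all the way to 0.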

Tests:

I used a (noisy) frequency-varying sinusoid on purpose because it shows many difficulties. We can see that the width parameter is not very useful here because if you set a minimum width too high, then it won't be able to track very close peaks in the high frequency part. If you set width too low, you would have many unwanted peaks in the left part of the signal. Same problem with distance. threshold only compares with the direct neighbours, which is not useful here. prominence is the one that gives the best solution. Note that you can combine many of these parameters!
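
As a hedged illustration of combining parameters, the sketch below applies prominence and distance together on the same kind of swept sinusoid. The fixed random seed and `distance=5` are assumptions added here, not values from the answer.

```python
import numpy as np
from scipy.signal import find_peaks

rng = np.random.default_rng(0)   # fixed seed (an addition here) so the run is repeatable
n = 1000
x = np.sin(2*np.pi*(2**np.linspace(2, 10, n))*np.arange(n)/48000) + rng.normal(0, 0.15, n)

# Prominence alone, then prominence plus a minimum spacing between peaks
prom_only, _ = find_peaks(x, prominence=1)
combined, props = find_peaks(x, prominence=1, distance=5)

print(len(prom_only), len(combined))
print(props["prominences"][:5])   # properties are returned for each condition requested
```

A peak is kept only if it satisfies every condition passed to find_peaks, so adding a condition can only shrink the result, never grow it.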

Code:

import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import find_peaks

# Noisy sinusoid whose frequency increases from left to right
x = np.sin(2*np.pi*(2**np.linspace(2,10,1000))*np.arange(1000)/48000) + np.random.normal(0, 1, 1000) * 0.15

# Detect peaks four ways, using one find_peaks parameter at a time
peaks, _ = find_peaks(x, distance=20)        # At least 20 samples between neighbouring peaks
peaks2, _ = find_peaks(x, prominence=1)      # BEST!
peaks3, _ = find_peaks(x, width=20)          # At least 20 samples wide
peaks4, _ = find_peaks(x, threshold=0.4)     # Required vertical distance to its direct neighbouring samples, pretty useless

# One subplot per parameter, with the detected peaks marked on the signal
plt.subplot(2, 2, 1)
plt.plot(peaks, x[peaks], "xr"); plt.plot(x); plt.legend(['distance'])
plt.subplot(2, 2, 2)
plt.plot(peaks2, x[peaks2], "ob"); plt.plot(x); plt.legend(['prominence'])
plt.subplot(2, 2, 3)
plt.plot(peaks3, x[peaks3], "vg"); plt.plot(x); plt.legend(['width'])
plt.subplot(2, 2, 4)
plt.plot(peaks4, x[peaks4], "xk"); plt.plot(x); plt.legend(['threshold'])
plt.show()
