Peak-finding algorithm for Python/SciPy


Question

I can write something myself by finding zero-crossings of the first derivative or something, but it seems like a common-enough function to be included in standard libraries. Anyone know of one?
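For reference, the do-it-yourself approach mentioned above can be sketched in a few lines. This is a minimal illustration, not a library routine: the helper name peaks_by_derivative is made up for this example, and it only detects strict local maxima (flat-topped peaks are missed and noise is not filtered).

```python
import numpy as np

def peaks_by_derivative(y):
    """Return indices where the first difference changes sign from + to -.

    Hypothetical helper for illustration: strict local maxima only,
    no noise handling, plateaus are not detected.
    """
    dy = np.diff(y)
    # dy[i-1] > 0: the signal rises into sample i; dy[i] < 0: it falls after it
    return np.where((dy[:-1] > 0) & (dy[1:] < 0))[0] + 1

y = np.array([0.0, 1.0, 0.0, 2.0, 3.0, 2.0, 0.0])
idx = peaks_by_derivative(y)
print(idx)
```

On noisy data this returns every small wiggle as a peak, which is exactly the problem described in the question.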

My particular application is a 2D array, but usually it would be used for finding peaks in FFTs, etc.
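For the 2D case, one common approach (not something from the answer below, which covers 1-D signals) is to compare each element against the maximum of its neighbourhood using scipy.ndimage.maximum_filter. The sketch below assumes a 3x3 neighbourhood and a simple height cutoff; the function name local_maxima_2d and both parameters are illustrative choices.

```python
import numpy as np
from scipy.ndimage import maximum_filter

def local_maxima_2d(a, size=3, min_height=None):
    """Return (row, col) coordinates of local maxima in a 2D array.

    Hypothetical helper: an element counts as a peak if it equals the
    maximum of its size x size neighbourhood and, optionally, is at
    least min_height. Flat regions pass the first test, so a height
    cutoff is usually needed to discard the background.
    """
    is_max = maximum_filter(a, size=size, mode="nearest") == a
    if min_height is not None:
        is_max &= (a >= min_height)
    return np.argwhere(is_max)

a = np.zeros((7, 7))
a[2, 3] = 5.0
a[5, 1] = 3.0
coords = local_maxima_2d(a, size=3, min_height=1.0)
print(coords)
```

A larger size suppresses peaks that sit close to a stronger neighbour, which plays a role similar to the distance parameter in the 1-D case.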

Specifically, in these kinds of problems, there are multiple strong peaks, and then lots of smaller "peaks" that are just caused by noise that should be ignored. These are just examples; not my actual data:

1-dimensional peaks:

2-dimensional peaks:

The peak-finding algorithm would find the location of these peaks (not just their values), and ideally would find the true inter-sample peak, not just the index with maximum value, probably using quadratic interpolation or something.
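The quadratic interpolation mentioned here can be sketched as follows: fit a parabola through the maximum sample and its two neighbours, then take the vertex. The function name parabolic_peak is an assumption for this example, and it expects i to be an interior index.

```python
import numpy as np

def parabolic_peak(y, i):
    """Refine a peak at interior index i by fitting a parabola through
    samples i-1, i, i+1. Returns (x_vertex, y_vertex).

    Hypothetical helper for illustration.
    """
    a, b, c = y[i - 1], y[i], y[i + 1]
    denom = a - 2.0 * b + c
    if denom == 0:
        # The three points are collinear: no curvature to interpolate
        return float(i), float(b)
    offset = 0.5 * (a - c) / denom          # lies in [-0.5, 0.5] for a true local max
    return i + offset, b - 0.25 * (a - c) * offset

# Samples of a parabola whose true maximum is at x = 2.3 with height 5
xs = np.arange(6)
y = 5.0 - (xs - 2.3) ** 2
i = int(np.argmax(y))
xv, yv = parabolic_peak(y, i)
print(i, xv, yv)
```

For an exactly parabolic peak the refinement is exact; for FFT magnitude peaks it is only an approximation, and its accuracy depends on the window used.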

Typically you only care about a few strong peaks, so they'd either be chosen because they're above a certain threshold, or because they're the first n peaks of an ordered list, ranked by amplitude.
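Both selection rules (above a threshold, or the first n by amplitude) can be sketched on top of scipy.signal.find_peaks, which accepts a height argument. The wrapper strongest_peaks and its parameter names are made up for this example.

```python
import numpy as np
from scipy.signal import find_peaks

def strongest_peaks(y, n=None, min_height=None):
    """Return peak indices sorted by descending amplitude.

    Hypothetical wrapper: min_height is passed to find_peaks as the
    minimum peak height, and n (if given) keeps only the n tallest peaks.
    """
    idx, _ = find_peaks(y, height=min_height)
    order = np.argsort(y[idx])[::-1]   # tallest first
    idx = idx[order]
    return idx if n is None else idx[:n]

y = np.array([0, 1, 0, 5, 0, 3, 0, 0.5, 0], dtype=float)
top2 = strongest_peaks(y, n=2)
above = strongest_peaks(y, min_height=2.0)
print(top2, above)
```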

As I said, I know how to write something like this myself. I'm just asking if there's a pre-existing function or package that's known to work well.

Update:

I translated a MATLAB script and it works decently for the 1-D case, but could be better.

Updated update:

sixtenbe created a better version for the 1-D case.

Solution

The function scipy.signal.find_peaks, as its name suggests, is useful for this. To get a good peak extraction, though, it is important to understand its parameters width, threshold, distance and, above all, prominence.

According to my tests and the documentation, prominence is the most useful concept for keeping the good peaks and discarding the noisy ones.

What is (topographic) prominence? It is "the minimum height necessary to descend to get from the summit to any higher terrain".

The idea is:

The higher the prominence, the more "important" the peak is.
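To make the definition concrete, here is a small sketch on a hand-made signal, using scipy.signal.peak_prominences to report the prominence of every local maximum. The sample values are invented for this illustration.

```python
import numpy as np
from scipy.signal import find_peaks, peak_prominences

# Four local maxima at indices 1, 3, 5 and 7
y = np.array([0, 2, 1, 5, 0, 1, 0.5, 3, 0], dtype=float)

peaks, _ = find_peaks(y)
prominences = peak_prominences(y, peaks)[0]
for p, prom in zip(peaks, prominences):
    print(f"index {p}: height {y[p]}, prominence {prom}")

# Keep only the peaks that stand out by at least 2 from their surroundings
strong, _ = find_peaks(y, prominence=2)
print(strong)
```

The peak at index 1 has height 2 but prominence only 1, because you descend just to 1 before reaching the higher peak at index 3; that is why it is dropped by prominence=2 even though a plain height threshold might keep it.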

Test:

I used a (noisy) frequency-varying sinusoid on purpose because it shows many of the difficulties. The width parameter is not very useful here: if you set the minimum width too high, it cannot track the closely spaced peaks in the high-frequency part, and if you set it too low, you get many unwanted peaks in the left part of the signal. distance has the same problem. threshold only compares a sample with its direct neighbours, which is not useful here. prominence is the one that gives the best solution. Note that you can combine many of these parameters!

Code:

import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import find_peaks

# Noisy sinusoid whose frequency increases from left to right
x = np.sin(2*np.pi*(2**np.linspace(2,10,1000))*np.arange(1000)/48000) + np.random.normal(0, 1, 1000) * 0.15

peaks, _ = find_peaks(x, distance=20)        # minimum horizontal spacing between peaks, in samples
peaks2, _ = find_peaks(x, prominence=1)      # BEST!
peaks3, _ = find_peaks(x, width=20)          # minimum peak width, in samples
peaks4, _ = find_peaks(x, threshold=0.4)     # Required vertical distance to its direct neighbouring samples, pretty useless

# One subplot per parameter: detected peaks drawn as markers over the signal
panels = [(peaks, "xr", "distance"), (peaks2, "ob", "prominence"),
          (peaks3, "vg", "width"), (peaks4, "xk", "threshold")]
for i, (p, style, label) in enumerate(panels, start=1):
    plt.subplot(2, 2, i)
    plt.plot(p, x[p], style); plt.plot(x); plt.legend([label])
plt.show()
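As noted above, the parameters can be combined. The following sketch uses a simpler fixed-frequency test signal (an assumption made for this example, with a seeded generator so the run is repeatable) and asks for both a minimum prominence and a minimum spacing. The second return value of find_peaks is a dictionary that then contains the computed prominences.

```python
import numpy as np
from scipy.signal import find_peaks

rng = np.random.default_rng(0)
t = np.arange(1000)
# Ten periods of a cosine-shaped wave starting at a trough, plus noise
x = -np.cos(2 * np.pi * t / 100) + 0.15 * rng.normal(size=t.size)

# Require both a minimum prominence and a minimum spacing of 20 samples
peaks, props = find_peaks(x, prominence=1, distance=20)

print(len(peaks))
print(props["prominences"].round(2))
```

Each crest contributes one peak with a prominence of roughly 2 (the crest-to-trough height), while the noise wiggles on top of each crest have prominences on the order of the noise level and are rejected.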
