在质谱图中找到峰的位置 [英] find peaks location in a spectrum numpy

查看:355
本文介绍了在质谱图中找到峰的位置的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个TOF光谱,我想使用python(numpy)实现一种算法,该算法查找光谱的所有最大值并返回相应的x值.
我在网上查询了一下,发现下面报告了该算法.

I have a TOF spectrum and I would like to implement an algorithm using python (numpy) that finds all the maxima of the spectrum and returns the corresponding x values.
I have looked up online and I found the algorithm reported below.

这里的假设是,在最大值附近之前的值与最大值之间的差大于数字DELTA.问题是我的光谱由均匀分布的点组成,甚至接近最大值,因此永远不会超过DELTA,并且peakdet函数将返回一个空数组.

The assumption here is that near the maximum the difference between the value before and the value at the maximum is bigger than a number DELTA. The problem is that my spectrum is composed of points equally distributed, even near the maximum, so that DELTA is never exceeded and the function peakdet returns an empty array.

您知道如何解决此问题吗?我真的很感谢注释,以便更好地理解代码,因为我是python的新手.

Do you have any idea how to overcome this problem? I would really appreciate comments to understand better the code since I am quite new in python.

谢谢!

import sys
from numpy import NaN, Inf, arange, isscalar, asarray, array

def peakdet(v, delta, x = None): 
   maxtab = []
   mintab = []

   if x is None:
      x = arange(len(v))
      v = asarray(v)

   if len(v) != len(x):
      sys.exit('Input vectors v and x must have same length')
   if not isscalar(delta):
      sys.exit('Input argument delta must be a scalar')
   if delta <= 0:
      sys.exit('Input argument delta must be positive')

   mn, mx = Inf, -Inf
   mnpos, mxpos = NaN, NaN

   lookformax = True

   for i in arange(len(v)):
      this = v[i]
      if this > mx:
         mx = this
         mxpos = x[i]
      if this < mn:
         mn = this
         mnpos = x[i]

      if lookformax:
         if this < mx-delta:
            maxtab.append((mxpos, mx))
            mn = this
            mnpos = x[i]
            lookformax = False
      else:
         if this > mn+delta:
            mintab.append((mnpos, mn))
            mx = this
            mxpos = x[i]
            lookformax = True

return array(maxtab), array(mintab)

下图显示了光谱的一部分.实际上,我的峰值比这里显示的还要多.

Below is shown part of the spectrum. I actually have more peaks than those shown here.

推荐答案

我认为这可以作为起点.我不是信号处理专家,但是我对生成的信号Y进行了尝试,该信号看上去很像您的信号,并且噪声更大:

This, I think could work as a starting point. I'm not a signal-processing expert, but I tried this on a generated signal Y that looks quite like yours and one with much more noise:

from scipy.signal import convolve
import numpy as np
from matplotlib import pyplot as plt
#Obtaining derivative
kernel = [1, 0, -1]
dY = convolve(Y, kernel, 'valid') 

#Checking for sign-flipping
S = np.sign(dY)
ddS = convolve(S, kernel, 'valid')

#These candidates are basically all negative slope positions
#Add one since using 'valid' shrinks the arrays
candidates = np.where(dY < 0)[0] + (len(kernel) - 1)

#Here they are filtered on actually being the final such position in a run of
#negative slopes
peaks = sorted(set(candidates).intersection(np.where(ddS == 2)[0] + 1))

plt.plot(Y)

#If you need a simple filter on peak size you could use:
alpha = -0.0025
peaks = np.array(peaks)[Y[peaks] < alpha]

plt.scatter(peaks, Y[peaks], marker='x', color='g', s=40)

样本结果: 对于嘈杂的一个,我用alpha过滤了峰:

The sample outcomes: For the noisy one, I filtered peaks with alpha:

如果alpha需要更复杂的方法,则可以尝试使用例如关于它们是混合高斯的假设(我最喜欢的是Otsu阈值,存在于cvskimage中)或某种聚类(k均值可能起作用).

If the alpha needs more sophistication you could try dynamically setting alpha from the peaks discovered using e.g. assumptions about them being a mixed gaussian (my favourite being the Otsu threshold, exists in cv and skimage) or some sort of clustering (k-means could work).

作为参考,这是我用来生成信号的:

And for reference, this I used to generate the signal:

Y = np.zeros(1000)

def peaker(Y, alpha=0.01, df=2, loc=-0.005, size=-.0015, threshold=0.001, decay=0.5):  
    peaking = False
    for i, v in enumerate(Y):
        if not peaking:
            peaking = np.random.random() < alpha
            if peaking:
                Y[i] = loc + size * np.random.chisquare(df=2)
                continue
        elif Y[i - 1] < threshold:
            peaking = False

        if i > 0:
            Y[i] = Y[i - 1] * decay

peaker(Y)

支持降低基准线

我这样做是为了模拟倾斜的基线:

I simulated a slanting base-line by doing this:

Z = np.log2(np.arange(Y.size) + 100) * 0.001
Y = Y + Z[::-1] - Z[-1]

然后使用固定的alpha进行检测(请注意,我更改了alpha上的符号):

Then to detect with a fixed alpha (note that I changed sign on alpha):

from scipy.signal import medfilt

alpha = 0.0025
Ybase = medfilt(Y, 51) # 51 should be large in comparison to your peak X-axis lengths and an odd number.
peaks = np.array(peaks)[Ybase[peaks] - Y[peaks] > alpha] 

产生以下结果(基线绘制为黑色虚线):

Resulting in the following outcome (the base-line is plotted as dashed black line):

简化和评论

我简化了代码,就像@skymandr所评论的那样,两个convolve都使用一个内核.这也消除了调整收缩率时的魔术数,因此任何大小的内核都应该这样做.

I simplified the code to use one kernel for both convolves as @skymandr commented. This also removed the magic number in adjusting the shrinkage so that any size of the kernel should do.

用于选择"valid"作为convolve的选项.在"same"上它可能同样适用,但是我选择"valid",因此我不必考虑边缘条件以及该算法是否可以检测到那里的尖峰.

For the choice of "valid" as option to convolve. It would probably have worked just as well with "same", but I choose "valid" so I didn't have to think about the edge-conditions and if the algorithm could detect spurios peaks there.

这篇关于在质谱图中找到峰的位置的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆