How to get value of area under multiple peaks

Question

I have some data from a bioanalyzer which gives me time (x-axis) and absorbance values (y-axis). The time is sampled every 0.05 seconds, from 32 s to 138 s, so you can imagine how many data points I have. I've created a graph using both plotly and matplotlib, just so that I have more libraries to work with to find a solution, so a solution in either library is fine! What I'm trying to do is make my script find the area under each peak and return the values.

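import numpy as np
import peakutils

def create_plot(sheet_name):
    # `book` is the workbook object, opened elsewhere in the script
    sample = book.sheet_by_name(sheet_name)
    # read the sheet column by column
    data = [[sample.cell_value(r, c) for r in range(sample.nrows)] for c in range(sample.ncols)]
    # absorbance values: third column, without the header and footer rows
    y = data[2][18:len(data[2]) - 2]
    x = np.arange(32, 138.05, 0.05)
    indices = peakutils.indexes(y, thres=0.35, min_dist=0.1)
    peaks = [y[i] for i in indices]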

This snippet gets my Y values, X values and indices of the peaks. Now is there a way to get the area under each curve? Let's say that there are 15 indices.

The graph looks like this (image not reproduced here):

Recommended answer

An automated answer

Given a set of x and y values as well as a set of peaks (the x-coordinates of the peaks, in increasing order), here's how you can automatically find the area under each of the peaks. I'm assuming that x, y, and peaks are all NumPy arrays:

import numpy as np

# find the minima between each peak
ixpeak = x.searchsorted(peaks)
ixmin = np.array([np.argmin(i) for i in np.split(y, ixpeak)])
ixmin[1:] += ixpeak
mins = x[ixmin]

# split up the x and y values based on those minima
xsplit = np.split(x, ixmin[1:-1])
ysplit = np.split(y, ixmin[1:-1])

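# find the areas under each peak
# (np.trapz was renamed to np.trapezoid in NumPy 2.0; use whichever is available)
trapezoid = getattr(np, 'trapezoid', None) or np.trapz
areas = [trapezoid(ys, xs) for xs,ys in zip(xsplit, ysplit)]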

Output (figure not reproduced here):

The example data has been set up so that the area under each peak is (more-or-less) guaranteed to be 1.0, so the results in the bottom plot are correct. The green X marks are the locations of the minimum between each two peaks. The part of the curve "belonging" to each peak is determined as the part of the curve in-between the minima adjacent to each peak.
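The splitting rule described above can also be wrapped in a small function and checked against data whose answer is known. This is only a sketch under stated assumptions: `peak_areas` is a hypothetical helper name, the peaks are assumed to be distinct and sorted, and, unlike the snippet above, each boundary sample is included in both neighbouring segments so that no thin sliver between segments is dropped.

```python
import numpy as np

def peak_areas(x, y, peaks):
    """Area under each peak, splitting the trace at the minimum between adjacent peaks.

    Hypothetical helper: `peaks` holds the x-coordinates of the peaks,
    sorted in increasing order and mapping to distinct samples of `x`.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    ixpeak = x.searchsorted(np.asarray(peaks, dtype=float))
    # boundary between two adjacent peaks: the lowest point of the trace between them
    bounds = [lo + int(np.argmin(y[lo:hi])) for lo, hi in zip(ixpeak[:-1], ixpeak[1:])]
    edges = [0] + bounds + [len(x) - 1]
    # np.trapz was renamed np.trapezoid in NumPy 2.0
    trapezoid = getattr(np, "trapezoid", None) or np.trapz
    # the boundary sample is shared by both neighbouring segments
    return np.array([trapezoid(y[a:b + 1], x[a:b + 1])
                     for a, b in zip(edges[:-1], edges[1:])])

# synthetic check: three well-separated Gaussians, each with unit area
x = np.linspace(0.0, 30.0, 30001)
centers = np.array([5.0, 15.0, 25.0])
sigma = 0.4
y = sum(np.exp(-0.5 * ((x - c) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
        for c in centers)
areas = peak_areas(x, y, centers)
print(areas)
```

Because each Gaussian integrates to 1 and the peaks barely overlap, every entry of `areas` should come out very close to 1. Note that this integrates all the way down to y = 0; if the real trace sits on a non-zero baseline, that baseline would need to be subtracted first.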

Here's the complete code I used to generate the example data:

import scipy as sp
import scipy.stats

prec = 1e5
n = 10
N = 150
r = np.arange(0, N+1, N//n)

# generate some reasonable fake data
peaks = np.array([np.random.uniform(s, e) for s,e in zip(r[:-1], r[1:])])
x = np.linspace(0, N + n, num=int(prec))
y = np.max([sp.stats.norm.pdf(x, loc=p, scale=.4) for p in peaks], axis=0)

And the code I used to make the plots:

import matplotlib.pyplot as plt

# plotting stuff
plt.figure(figsize=(5,7))
plt.subplots_adjust(hspace=.33)
plt.subplot(211)
plt.plot(x, y, label='trace 0')
plt.plot(peaks, y[ixpeak], '+', c='red', ms=10, label='peaks')
plt.plot(mins, y[ixmin], 'x', c='green', ms=10, label='mins')
plt.xlabel('indep')
plt.ylabel('dep')
plt.title('Example data')
plt.ylim(-.1, 1.6)
plt.legend()

plt.subplot(212)
plt.bar(np.arange(len(areas)), areas)
plt.xlabel('Peak number')
plt.ylabel('Area under peak')
plt.title('Area under the peaks of trace 0')
plt.show()
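To apply this to the data in the question, two small conversions are probably needed. `peakutils.indexes` returns integer positions into `y`, whereas the answer expects `peaks` to hold x-coordinates. Also, `np.arange` with a non-integer step can produce one sample more or fewer than expected, so it may be safer to build the time axis from the length of `y`. The sketch below uses made-up stand-in values for `y` and `indices`; only the start time (32 s) and the step (0.05 s) come from the question.

```python
import numpy as np

# Stand-in for the absorbance column read from the spreadsheet (made-up values)
y = np.asarray([0.0, 0.1, 0.9, 0.2, 0.05, 0.3, 1.2, 0.4, 0.0])
# Stand-in for the output of peakutils.indexes(...): integer positions into y
indices = np.array([2, 6])

t0, dt = 32.0, 0.05                # start time and sampling interval from the question
x = t0 + dt * np.arange(len(y))    # time axis with exactly len(y) samples
peaks = x[indices]                 # x-coordinates of the peaks, as the answer expects

# searchsorted recovers the original positions, so the answer's first step is consistent
ixpeak = x.searchsorted(peaks)
print(peaks, ixpeak)
```

With `x`, `y`, and `peaks` built this way, the rest of the answer's code can be run unchanged.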
