如何找到列表元素超过特定条件的持续时间? [英] How can I find the duration a list element exceeds a certain criteria?

查看:95
本文介绍了如何找到列表元素超过特定条件的持续时间?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在为我的荣誉项目分析EEG数据.具体来说,我正在分析EEG爆发(即超过标准的计数数量).每秒记录一次EEG值.在我的情况下,EEG爆发的标准是每次EEG值超过5分钟平均值的15%.但是,我还想找到持续时间保持在均值15%以上的持续时间.我该怎么办?

I'm analysing EEG data for my honours project. Specifically, I'm analysing EEG bursts (i.e. the number of counts that it exceeds a criteria). The EEG values are recorded every second. In my case, the criteria of a EEG burst is every time the EEG value exceeds 15% of the 5-minute mean. However, I would also like to find the duration that it remains above 15% of the mean. How can I do this?

这是5分钟块的数据,区域2 [0] [0] [0]

edit: this is the data for a 5-min block, regions2[0][0][0]

array([2.6749268, 2.3965783, 2.2648365, 2.2901094, 2.5218956, 2.6369736,
   2.3637865, 2.1217346, 1.895203 , 2.055559 , 1.9365673, 2.4218671,
   2.530769 , 2.385391 , 2.293663 , 2.3126507, 2.323733 , 2.2733889,
   2.3903291, 2.6176193, 2.6430926, 2.58586  , 2.352392 , 2.4955454,
   2.6099124, 2.3200274, 2.1760976, 2.5159674, 2.76305  , 2.3733828,
   2.4342089, 2.4008656, 2.2075768, 2.232682 , 2.2406263, 2.4858663,
   2.4188566, 2.5680597, 2.5303915, 2.3958497, 2.2115357, 2.444274 ,
   2.5103524, 2.0567694, 2.441487 , 2.430129 , 2.4614134, 2.282298 ,
   2.4610975, 2.5782802, 2.3088896, 2.660237 , 2.8228939, 2.386515 ,
   1.9969627, 2.0703123, 2.891341 , 2.929259 , 2.3676789, 2.39686  ,
   2.559953 , 2.4817688, 2.4235504, 2.2657301, 2.6064477, 2.6751654,
   2.5263813, 2.1663566, 2.2710345, 2.6688013, 2.1095626, 2.560567 ,
   2.6420567, 2.3834925, 2.4658787, 2.6067703, 2.5786612, 2.6147954,
   2.5842502, 2.5785747, 2.427758 , 2.2909386, 2.2653525, 2.382083 ,
   2.5664327, 2.5153337, 1.820536 , 2.5582454, 2.3047743, 2.0991004,
   2.4578576, 2.7292717, 2.6083386, 2.4281838, 2.453028 , 2.4099083,
   2.3806388, 2.3578563, 2.590041 , 2.621177 , 2.3468106, 2.145658 ,
   2.077852 , 2.2439861, 2.6040363, 2.7262418, 2.5456822, 2.4032714,
   2.4305286, 2.3440735, 2.4467494, 2.8309298, 2.484087 , 2.4205194,
   2.9501045, 2.9746544, 2.4083674, 2.3036501, 2.6792996, 2.3589804,
   2.3387434, 3.1610718, 3.1351097, 2.5165584, 2.52014  , 2.4717925,
   2.3442857, 2.4484215, 2.6329467, 2.5656624, 2.1032746, 2.6107414,
   2.6543136, 2.3989596, 2.491326 , 2.448652 , 2.0739408, 2.1546881,
   2.2125206, 2.3453302, 2.206572 , 2.5481203, 2.4757648, 2.321009 ,
   2.151885 , 2.5293174, 2.4324925, 2.5090148, 2.511013 , 2.3891613,
   2.410615 , 2.4933898, 2.5637872, 2.5406888, 2.3262205, 2.3038528,
   2.3525562, 2.5020993, 2.418803 , 2.3449333, 2.3671083, 2.2996266,
   2.3794844, 2.567588 , 2.4661932, 2.2262957, 2.23916  , 2.4768639,
   2.5996208, 2.0967448, 2.6293585, 2.63859  , 2.1346848, 2.396465 ,
   2.5590088, 2.7100194, 2.5844605, 2.509511 , 2.3888776, 2.3450603,
   2.3059156, 2.3003197, 2.265796 , 2.409108 , 2.3471978, 2.3690042,
   2.3681839, 2.1804225, 2.3097868, 2.5214322, 2.5647783, 2.2263277,
   2.237076 , 2.1725235, 2.7022493, 2.6051672, 2.3920796, 2.5295384,
   2.3592722, 2.3246412, 2.6447232, 2.471837 , 2.7084064, 2.891163 ,
   2.853586 , 2.5431767, 2.626647 , 2.384906 , 2.4660795, 2.3987703,
   1.952864 , 2.200941 , 2.0904007, 2.3755703, 2.4843922, 2.2047417,
   2.356966 , 2.3752437, 2.3846717, 2.306039 , 2.1720207, 2.3954203,
   2.3085623, 2.3531506, 1.9799676, 2.211963 , 2.141896 , 2.23061  ,
   2.3369045, 2.0759099, 2.3769715, 2.2083194, 2.3833442, 2.519347 ,
   2.3158543, 2.349166 , 2.4172142, 2.4422796, 2.513461 , 2.3867416,
   2.2908096, 2.5697057, 2.5763984, 2.319024 , 2.2902663, 2.6102533,
   2.6331408, 2.0932658, 2.323995 , 2.5237315, 2.5517514, 2.2664344,
   2.6006377, 2.3638246, 2.1815908, 2.440507 , 2.319951 , 2.399297 ,
   2.4986413, 2.3624146, 2.2377589, 2.277511 , 2.2512653, 2.1235788,
   2.1558921, 2.0374463, 2.2526782, 2.3825428, 2.2467673, 2.2891142,
   2.2968547, 2.2971904, 2.3470707, 2.2792215, 2.4101918, 2.4403389,
   2.674443 , 2.7500827, 2.4176645, 2.4565153, 2.5339708, 2.545448 ,
   2.3907104, 2.4448986, 2.571172 , 2.3371966, 2.5683126, 2.6280603,
   2.4276655, 2.4700544, 2.421201 , 2.499506 , 2.555047 , 2.7138271,
   2.546179 , 2.484039 , 2.3956237, 2.591265 , 2.6552162, 2.6930716],
  dtype=float32)

1.15 * np.mean(regions2[0][0][0])
Out[17]: 2.783561062812805

np.where(regions2[0][0][0] > 1.15 * np.mean(regions2[0][0][0]))
Out[13]: (array([ 52,  56,  57, 111, 114, 115, 121, 122, 203, 204], dtype=int64),)

edit:查看大于均值1.15的值的索引号,我意识到这表明了我正在寻找的持续时间.也就是说,如果它是单个数字52,则是1秒钟的脉冲串;如果是56和57(2个连续的索引编号),则是2秒钟的脉冲串.如何制作一个脚本来计算每个5分钟块的平均突发持续时间?在此示例中,其平均值为(1s + 2s + 1s + 2s + 2s + 2s)

edit: Looking at the index numbers of values greater than 1.15 of the mean, I realised that shows the duration that I'm looking for. i.e. if it's a single number 52 then its a 1 second burst, if it's 56 and 57 (2 consecutive index numbers), then its a 2 second burst. How can I make a script to calculate the average burst durations for this each 5 min block? In this example, it would be average of (1s + 2s + 1s + 2s + 2s + 2s)

我附上了一个有效的代码段,该代码段可以通过遍历ndarray并比较两个连续的EEG值(如果前一个值较小,则增加1个突发计数(温度+ = 1))来计算突发数量大于平均值的1.15,而后者的值大于平均值的1.15).这是一个5分钟的区块,我将重复18个5分钟的区块.

I've attached a code snippet that works and is able to count the number of bursts by iterating through the ndarray and comparing two consecutive EEG values (add 1 count of burst (temp +=1) if the former value is less than 1.15 of the mean and the latter value is more than 1.15 of the mean). This is for one 5-minute block and I repeat this for 18 5-minute blocks.

burst = 0
for l in range(1,len(regions[i][j][k])): #regions[i][j][k] refers to a 5-min block of EEG data
    if regions[i][j][k][l-1] < (1.15 * np.nanmean(regions[i][j][k])) and regions2[i][j][k][l] > (1.15 * np.nanmean(regions2[i][j][k])):
        burst += 1 #counts a burst

我以为我可以找到爆发发生点的索引号和爆发结束的值的索引号(EEG值降至平均值的1.15以下).持续时间(以秒为单位)将是两个索引值之间的差.

I was thinking that I could find the index number of the point at which the burst occurs and index number of the value at which the burst ends (EEG value drops below 1.15 of the mean). The duration, in seconds, would be the difference between the two index values.

但是,我不知道如何执行此操作,或者这是否是最好的方法.这些爆发多次发生,因此我的最终目标是每5分钟查找一次这些爆发的平均值.几个月前,我才开始学习python,因此不胜感激!

However, I have no idea how to do this, or if this is the best way. These bursts occur multiple times so my ultimate objective is to find the average values of these burst every 5 minutes. I just started learning python a few months back so any help would be greatly appreciated!

推荐答案

您可以使用numpy获取这些区域.我从另一个

You can get these regions using numpy. I leveraged part of an answer from another question to do this.

具有索引位置和它们之间的点数以及控制台/绘图屏幕截图的结果:

Results with index positions and number of points between them as well as screenshot of console/plot:

执行此操作的代码(您未提供任何数据,我使用正弦波作为基础):

The code to do this (You had no data given, I used a sine-wave as base):

import numpy as np
import datetime
import matplotlib.pyplot as plot

base = datetime.datetime(2019, 1, 1,8,0,0)
tarr = np.array([base + datetime.timedelta(seconds=i) for i in range(60*60)]) #  1 h

time = np.arange(0, 6*6, 0.01) *0.2;
amp = np.sin(time)

# get all indexes that interest me - yours would have the mean here
# not used in rest of code: contains all indexes that are above 0.75
indx = np.nonzero(abs(amp) > 0.75)
# looks like: (array([ 425,  426,  427, ..., 3597, 3598, 3599], dtype=int64),)


def contiguous_regions(cond):
    """Credits: https://stackoverflow.com/a/4495197/7505395"""
    """Finds contiguous True regions of the boolean array "condition". Returns
    a 2D array where the first column is the start index of the region and the
    second column is the end index."""

    # Find the indicies of changes in "condition"
    d = np.diff(cond)
    idx, = d.nonzero() 

    # We need to start things after the change in "condition". Therefore, 
    # we'll shift the index by 1 to the right.
    idx += 1

    if condition[0]:
        # If the start of condition is True prepend a 0
        idx = np.r_[0, idx]

    if condition[-1]:
        # If the end of condition is True, append the length of the array
        idx = np.r_[idx, cond.size] 

    # Reshape the result into two columns
    idx.shape = (-1,2)
    return idx


condition = np.abs(amp) > 0.75

# create a plot visualization
fig, ax = plot.subplots()
ax.plot(tarr, amp)

# print values that fullfill our condition - also do some plotting
for start, stop in contiguous_regions(condition):
    segment = amp[start:stop]
    print (start, stop, f"Amount: {stop-start}")

    # plot some lines into the graph, off by 1 for stops in graph
    ax.vlines(x=tarr[start], ymin=amp[start]+(-0.1 if amp[start]>0 else 0.1),
                             ymax=1.0 if amp[start]>0 else -1.0, color='r')
    ax.vlines(x=tarr[stop-1], ymin=amp[start]+(-0.1 if amp[start]>0 else 0.1), 
                              ymax=1.0 if amp[start]>0 else -1.0, color='r')
    ax.hlines(y=amp[start]-0.08,xmin=tarr[start],xmax=tarr[stop-1])

# show plot    
plot.show()

诀窍是应用 ndp.diff 到其中具有[False,True,...]的数组,可以完全满足条件.然后从np.nonzero()结果元组中获取索引.

The trick is to apply ndp.diff to the array with [False,True,...]'s in them that fullfill the condition. Then get the indexes from np.nonzero() result-tuple.

这篇关于如何找到列表元素超过特定条件的持续时间?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆