Resample pandas dataframe only knowing result measurement count

Views: 81 | Posted: 2020/5/13 18:36:26 | Tags: python pandas time-series resampling multi-index


Problem description

I have a dataframe that looks like this:

Trial    Measurement    Data
    0              0      12 
                   1       4
                   2      12
    1              0      12
                   1      12
    2              0      12
                   1      12
                   2     NaN
                   3      12

I want to resample my data so that every trial has just two measurements, so I want to turn it into something like this:

Trial    Measurement    Data
    0              0       8 
                   1       8
    1              0      12
                   1      12
    2              0      12
                   1      12

This rather uncommon task stems from the fact that my data has an intentional jitter in the stimulus presentation.

I know pandas has a resample function, but I have no idea how to apply it to my second-level index while keeping the data in discrete categories based on the first-level index :(

Also, I wanted to iterate over my first-level indices, but apparently

for sub_df in np.arange(len(df['Trial'].max()))

won't work: since 'Trial' is an index rather than a column, pandas can't find it.
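
For reference, iterating over the first index level can be sketched as below. This is a minimal illustration, not part of the original post: it rebuilds the example frame under the assumption that `Trial` and `Measurement` form a two-level `MultiIndex`, and the `sizes` dictionary is a hypothetical placeholder for whatever per-trial work is needed.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame, assuming Trial/Measurement are index levels.
index = pd.MultiIndex.from_tuples(
    [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (2, 0), (2, 1), (2, 2), (2, 3)],
    names=["Trial", "Measurement"],
)
df = pd.DataFrame({"Data": [12, 4, 12, 12, 12, 12, 12, np.nan, 12]}, index=index)

# An index level is not a column, so df["Trial"] raises a KeyError.
# Grouping on the level yields one (trial label, sub-frame) pair per trial.
sizes = {}
for trial, sub_df in df.groupby(level="Trial"):
    sizes[int(trial)] = len(sub_df)  # number of measurements in this trial

print(sizes)
```

Grouping by level avoids assuming that the trial labels are consecutive integers starting at zero, which the `np.arange` loop would require.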

Recommended answer

Well, it's not the prettiest I've ever seen, but from a frame looking like

>>> df
   Trial  Measurement  Data
0      0            0    12
1      0            1     4
2      0            2    12
3      1            0    12
4      1            1    12
5      2            0    12
6      2            1    12
7      2            2   NaN
8      2            3    12

we can manually build the two "average-like" objects and then use pd.melt to reshape the output:

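import pandas as pd

# Dict-based renaming in agg() and DataFrame.sort() are no longer available in current pandas.
grouped = df.groupby("Trial")["Data"]
avg = pd.DataFrame({0: grouped.agg(lambda x: x.head((len(x) + 1) // 2).mean()),   # mean of first half
                    1: grouped.agg(lambda x: x.tail((len(x) + 1) // 2).mean())})  # mean of second half
result = pd.melt(avg.reset_index(), "Trial", var_name="Measurement", value_name="Data")
result = result.sort_values(["Trial", "Measurement"]).set_index(["Trial", "Measurement"])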

which produces

>>> result

                   Data
Trial Measurement      
0     0               8
      1               8
1     0              12
      1              12
2     0              12
      1              12
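
The answer starts from a flat frame, whereas the question's data is indexed by `Trial` and `Measurement`. A minimal sketch of the same first-half/second-half averaging applied directly to such a MultiIndexed frame might look like this; the helper name `halve_trials` is hypothetical, and the frame construction is an assumption about how the question's data is laid out.

```python
import numpy as np
import pandas as pd


def halve_trials(frame: pd.DataFrame) -> pd.DataFrame:
    """Reduce every trial to two measurements (hypothetical helper).

    Measurement 0 is the mean of the first half of the trial and
    measurement 1 the mean of the second half. For odd lengths the
    middle sample is counted in both halves, as in the answer above.
    """
    rows = []
    for trial, data in frame["Data"].groupby(level="Trial"):
        half = (len(data) + 1) // 2
        rows.append((trial, 0, data.head(half).mean()))  # NaN is skipped by mean()
        rows.append((trial, 1, data.tail(half).mean()))
    out = pd.DataFrame(rows, columns=["Trial", "Measurement", "Data"])
    return out.set_index(["Trial", "Measurement"])


# Example frame, assuming Trial/Measurement are index levels as in the question.
index = pd.MultiIndex.from_tuples(
    [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (2, 0), (2, 1), (2, 2), (2, 3)],
    names=["Trial", "Measurement"],
)
df = pd.DataFrame({"Data": [12, 4, 12, 12, 12, 12, 12, np.nan, 12]}, index=index)

result = halve_trials(df)
print(result)
```

The explicit loop trades some speed for readability; for a large number of trials the `groupby`/`melt` version above is likely the better fit.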
