Create common columns and transform time-series-like data

Views: 57 Published: 2020/5/2 6:38:23 Tags: python python-3.x pandas list dataframe


Problem description

I have an Excel workbook that contains more than 30 sheets, one for each parameter such as BP, heart rate, etc.

One of the dataframes (df1, created from one sheet of the workbook) looks like this:

import pandas as pd

df1 = pd.DataFrame({'person_id': [1,1,1,1,2,2,2,2,3,3,3,3,3,3],
                    'level_1': ['H1Date','H1','H2Date','H2','H1Date','H1','H2Date','H2','H1Date','H1','H2Date','H2','H3Date','H3'],
                    'values': ['2006-10-30 00:00:00','6.6','2006-08-30 00:00:00','4.6','2005-10-30 00:00:00','6.9','2016-11-30 00:00:00','6.6','2006-10-30 00:00:00','6.6','2006-11-30 00:00:00','8.6',
                               '2106-10-30 00:00:00','16.6']})

Another dataframe (df2), from another sheet of the same Excel file, can be generated with the code below:

df2 = pd.DataFrame({'person_id': [1,1,1,1,2,2,2,2,3,3,3,3,3,3],
                    'level_1': ['GluF1Date','GluF1','GluF2Date','GluF2','GluF1Date','GluF1','GluF2Date','GluF2','GluF1Date','GluF1','GluF2Date','GluF2','GluF3Date','GluF3'],
                    'values': ['2006-10-30 00:00:00','6.6','2006-08-30 00:00:00','4.6','2005-10-30 00:00:00','6.9','2016-11-30 00:00:00','6.6','2006-10-30 00:00:00','6.6','2006-11-30 00:00:00','8.6',
                               '2106-10-30 00:00:00','16.6']})

Similarly, there are more than 30 dataframes like this. Their values have the same format (a date and a measurement value), but the labels in level_1 differ (H1, GluF1, H1Date, H100, H100Date, GluF1Date, P1, PDate, UACRDate, UACR100, etc.).
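To make the naming scheme concrete, here is a minimal sketch that splits the labels listed above into a parameter prefix, an optional sequence number, and an optional Date suffix. The regular expression and the names LABEL_PATTERN, parts, and kind are assumptions made for illustration; they are not part of the original question.

```python
import pandas as pd

# Label names taken from the question.
labels = pd.Series(['H1', 'H1Date', 'H100', 'H100Date', 'GluF1', 'GluF1Date',
                    'P1', 'PDate', 'UACRDate', 'UACR100'])

# Assumed naming rule: letters (parameter prefix), optional digits
# (sequence number), optional literal 'Date' suffix.
LABEL_PATTERN = r'^(?P<prefix>[A-Za-z]+?)(?P<num>\d*)(?P<suffix>Date)?$'

parts = labels.str.extract(LABEL_PATTERN)

# Labels without digits (e.g. 'PDate') end up with a missing number.
parts['num'] = pd.to_numeric(parts['num'], errors='coerce')

# 'date' for labels ending in 'Date', 'value' for everything else.
parts['kind'] = parts['suffix'].fillna('').eq('Date').map(
    {True: 'date', False: 'value'})

print(parts[['prefix', 'num', 'kind']])
```

Note that labels such as PDate and UACRDate carry no digits at all, so any approach that converts the extracted digits with int() would need to handle that case.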

Based on a search of SO, this is what I have tried:

g = df1.level_1.str[-2:]  # Extracting column names (last two characters of each label)
df1['lvl'] = df1.level_1.apply(lambda x: int(''.join(filter(str.isdigit, x))))  # Extracting level's number
df1 = df1.pivot_table(index=['person_id', 'lvl'], columns=g, values='values', aggfunc='first')
final = df1.reset_index(level=1).drop(['lvl'], axis=1)

The code above produces output that is not what I expect.

This doesn't work because g does not produce the same string (column name) for all records. My code would work if the substring extraction gave the same output for every record, but because the labels are numbered like a sequence, I am not able to make them uniform.
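The mismatch can be reproduced on a handful of labels. The sketch below shows the keys that the two-character slice produces, followed by one possible key that does not depend on the sequence number. The rule used for that key (labels ending in Date are dates, everything else is a measurement) is an assumption, as are the variable names.

```python
import pandas as pd

level_1 = pd.Series(['H1Date', 'H1', 'H2Date', 'H2', 'H3Date', 'H3'])

# Keys used by the question's code: the last two characters of each label.
# Date labels all collapse to 'te', but each measurement keeps its own number.
keys = level_1.str[-2:]
print(list(keys.unique()))   # ['te', 'H1', 'H2', 'H3']

# Assumed alternative: a key that ignores the sequence number entirely.
kind = level_1.str.endswith('Date').map({True: 'date', False: 'value'})
print(list(kind.unique()))   # ['date', 'value']
```

With only two distinct keys, a pivot on that column yields the same two columns no matter how many numbered records a person has.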

I expect the output for each dataframe to look like the screenshot below. Note that a person can have 3 records (H1..H3), 10 records (H1..H10), or 100 records (e.g. H1...H100). Any of these is possible.

Updated screenshot
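The expected layout appears only in a screenshot in the original post, so the target shape used here is an assumption: one row per person and sequence number, with one date column and one value column. Under that assumption, a reshaping step might be sketched as follows. The function name to_long, the column names seq, date, and value, and the choice to give labels without digits the sequence number 0 are all hypothetical.

```python
import pandas as pd

# Sample data copied from the question (df1).
df1 = pd.DataFrame({
    'person_id': [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3],
    'level_1': ['H1Date', 'H1', 'H2Date', 'H2', 'H1Date', 'H1', 'H2Date', 'H2',
                'H1Date', 'H1', 'H2Date', 'H2', 'H3Date', 'H3'],
    'values': ['2006-10-30 00:00:00', '6.6', '2006-08-30 00:00:00', '4.6',
               '2005-10-30 00:00:00', '6.9', '2016-11-30 00:00:00', '6.6',
               '2006-10-30 00:00:00', '6.6', '2006-11-30 00:00:00', '8.6',
               '2106-10-30 00:00:00', '16.6'],
})


def to_long(df, id_col='person_id', label_col='level_1', value_col='values'):
    """Return one row per (person, sequence number) with 'date' and 'value'."""
    out = df.copy()

    # Uniform column key: 'date' for labels ending in 'Date', else 'value'.
    out['kind'] = out[label_col].str.endswith('Date').map(
        {True: 'date', False: 'value'})

    # Sequence number: the digits in the label. Labels without digits
    # get 0 here, which is an arbitrary choice for this sketch.
    digits = out[label_col].str.extract(r'(\d+)', expand=False)
    out['seq'] = pd.to_numeric(digits).fillna(0).astype(int)

    wide = out.pivot_table(index=[id_col, 'seq'], columns='kind',
                           values=value_col, aggfunc='first').reset_index()
    wide.columns.name = None

    # Convert the string cells to proper dtypes.
    wide['date'] = pd.to_datetime(wide['date'])
    wide['value'] = pd.to_numeric(wide['value'])
    return wide


result = to_long(df1)
print(result)
```

Because the function never refers to a specific prefix such as H or GluF, the same call should in principle apply to df2 and the other sheets, provided their labels follow the same prefix, number, Date pattern.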
