Python: combining 26 dataframes with different timestamps and the same columns


Question

A database in the US receives one day of information at a time, in .csv format, from places around the world. There are 15 days of data from 26 places, for a total of 15 x 26 = 390 dataframes. In addition, the 26 places share one common reference dataframe. I want to combine the 390 dataframes and the reference dataframe into a single dataframe. Here is a sample of my problem:

plA_d1df =        ### place A day1 dataframe
                       Adata
2019-01-10 07:35:22   10
2019-01-10 08:15:17   20

plB_d1df = 
                       Bdata
2019-01-10 07:38:45   30
2019-01-10 08:18:57   40

plA_d2df = 
                       Adata
2019-01-21 07:35:42   50
2019-01-21 08:15:17   60

plB_d2df = 
                       Bdata
2019-01-21 07:39:04   70
2019-01-21 08:19:22   80

reference = 
                          ref
2019-01-10 07:35:00     500
2019-01-10 07:38:00     530
2019-01-10 08:15:00     560
2019-01-10 08:18:00     590
2019-01-21 07:35:00     610
2019-01-21 07:39:00     640
2019-01-21 08:15:00     670
2019-01-21 08:19:00     700

The data from all places and the reference should be combined on the timestamps of place A, as shown below:

combdf = 
  datetime            ref0  Adata     ref1   Bdata  
2019-01-10 07:35:22    500   10      530    30
2019-01-10 08:15:17    560   20      590    40  
2019-01-21 07:35:42    610   50      640    70
2019-01-21 08:15:17    670   60      700    80 

I implemented the following code after referring to a previously solved answer:

import pandas as pd

biglist = [[plA_d1df, plB_d1df], [plA_d2df, plB_d2df]]  ## dataframes are in a nested list of lists
l = []
### refdf = reference dataframe
for i in range(len(biglist)):
    for j in range(len(biglist[i])):
        s1 = refdf.reindex(biglist[i][j].index, method='nearest')
        if j == 0:
            l.append(s1.join(biglist[i][j]))
        else:
            l.append(s1.join(biglist[i][j]).reindex(l[0].index, method='nearest'))
combdf = pd.concat(l, axis=1)

The above code ran successfully. The timestamps of the combined dataframe combdf match those of place A, which is what I wanted. But the columns for the same place did not merge. Instead, separate columns were created for each day, so I ended up with 8 columns instead of 4, mostly filled with nan. My present output is:

combdf = 
  datetime            ref0  Adata   ref1   Bdata   ref0  Adata   ref1   Bdata  
2019-01-10 07:35:22    500   10     530    30       nan    ..          nan
2019-01-10 08:15:17    560   20     590    40       nan    ..          nan
2019-01-21 07:35:42    nan    ..          nan       610   50     640    70
2019-01-21 08:15:17   nan    ..          nan        670   60     700    80 

What corrections do I have to make so that the columns for the same place are merged?

Recommended answer

Change the code to:

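import pandas as pd

# refdf is the reference dataframe; the dataframes are in a nested list, one inner list per day
biglist = [[df1, df2], [df3, df4]]
l = []
for i in range(len(biglist)):
    l1 = []  # frames for day i, one per place
    for j in range(len(biglist[i])):
        # reference rows nearest to this place's timestamps
        s1 = refdf.reindex(biglist[i][j].index, method='nearest')
        if j == 0:
            l1.append(s1.join(biglist[i][j]))
        else:
            # align the other places to place A's timestamps for this day
            l1.append(s1.join(biglist[i][j]).reindex(l1[0].index, method='nearest'))
    # places side by side for one day
    l.append(pd.concat(l1, axis=1))
# days stacked on top of each other
combdf = pd.concat(l, axis=0)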
combdf
Out[252]: 
                     ref  Adata  ref  Bdata
2019-01-10 07:35:22  500     10  530     30
2019-01-10 08:15:17  560     20  590     40
2019-01-21 07:35:42  610     50  640     70
2019-01-21 08:15:17  670     60  700     80
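
The difference from the code in the question seems to be the order of concatenation: each day's frames are first joined side by side (axis=1), and only then are the days stacked (axis=0), so a place keeps a single column across days. The sketch below rebuilds the sample data from the question so the approach can be run end to end. The `frame` helper and the `ref0`/`ref1` renaming are assumptions added for illustration (the renaming is only there to match the column names in the desired output); they are not part of the original answer.

```python
import pandas as pd


def frame(column, rows):
    """Hypothetical helper: build a one-column frame indexed by timestamp."""
    index = pd.to_datetime([ts for ts, _ in rows])
    return pd.DataFrame({column: [value for _, value in rows]}, index=index)


# Sample data copied from the question.
plA_d1df = frame("Adata", [("2019-01-10 07:35:22", 10), ("2019-01-10 08:15:17", 20)])
plB_d1df = frame("Bdata", [("2019-01-10 07:38:45", 30), ("2019-01-10 08:18:57", 40)])
plA_d2df = frame("Adata", [("2019-01-21 07:35:42", 50), ("2019-01-21 08:15:17", 60)])
plB_d2df = frame("Bdata", [("2019-01-21 07:39:04", 70), ("2019-01-21 08:19:22", 80)])
refdf = frame("ref", [
    ("2019-01-10 07:35:00", 500), ("2019-01-10 07:38:00", 530),
    ("2019-01-10 08:15:00", 560), ("2019-01-10 08:18:00", 590),
    ("2019-01-21 07:35:00", 610), ("2019-01-21 07:39:00", 640),
    ("2019-01-21 08:15:00", 670), ("2019-01-21 08:19:00", 700),
])

# One inner list per day; place A comes first in each inner list.
biglist = [[plA_d1df, plB_d1df], [plA_d2df, plB_d2df]]

days = []
for day_frames in biglist:
    anchor = day_frames[0].index  # place A's timestamps for this day
    parts = []
    for k, df in enumerate(day_frames):
        # Reference values nearest to this place's own timestamps,
        # renamed so each place gets its own reference column (assumption).
        ref = refdf.reindex(df.index, method="nearest").rename(columns={"ref": f"ref{k}"})
        # Snap the joined block onto place A's timestamps (a no-op for place A itself).
        parts.append(ref.join(df).reindex(anchor, method="nearest"))
    days.append(pd.concat(parts, axis=1))  # places side by side

combdf = pd.concat(days, axis=0)  # days stacked
print(combdf)
```

Note that `method="nearest"` always picks some row, however far away it is in time. If a place can be missing data for part of a day, passing the `tolerance` argument of `reindex` may be worth considering so that distant matches become NaN instead.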
