Merge multiple DataFrames Pandas

Question

This might be considered a duplicate of a thorough explanation of various approaches, but I can't seem to find a solution to my problem there because I have a larger number of DataFrames.

I have multiple Data Frames (more than 10), each differing in one column VARX. This is just a quick and oversimplified example:

import pandas as pd

df1 = pd.DataFrame({'depth': [0.500000, 0.600000, 1.300000],
       'VAR1': [38.196202, 38.198002, 38.200001],
       'profile': ['profile_1', 'profile_1','profile_1']})

df2 = pd.DataFrame({'depth': [0.600000, 1.100000, 1.200000],
       'VAR2': [0.20440, 0.20442, 0.20446],
       'profile': ['profile_1', 'profile_1','profile_1']})

df3 = pd.DataFrame({'depth': [1.200000, 1.300000, 1.400000],
       'VAR3': [15.1880, 15.1820, 15.1820],
       'profile': ['profile_1', 'profile_1','profile_1']})

Each df can have the same or different depths for the same profile, so I need to create a new DataFrame that merges all the separate ones, using depth and profile as the key columns and keeping every depth value that appears for each profile.

The VARX value should therefore be NaN wherever that variable has no measurement at that depth for that profile.

The result should thus be a new, compressed DataFrame with all VARX as additional columns alongside the depth and profile ones, something like this:

name_profile    depth   VAR1        VAR2        VAR3
profile_1   0.500000    38.196202   NaN         NaN
profile_1   0.600000    38.198002   0.20440     NaN
profile_1   1.100000    NaN         0.20442     NaN
profile_1   1.200000    NaN         0.20446     15.1880
profile_1   1.300000    38.200001   NaN         15.1820
profile_1   1.400000    NaN         NaN         15.1820

Note that the actual number of profiles is much, much bigger.

Any ideas?

Recommended answer

Consider setting the index on each data frame to the key columns and then running a horizontal merge with pd.concat:

dfs = [df.set_index(['profile', 'depth']) for df in [df1, df2, df3]]

print(pd.concat(dfs, axis=1).reset_index())
#      profile  depth       VAR1     VAR2    VAR3
# 0  profile_1    0.5  38.196202      NaN     NaN
# 1  profile_1    0.6  38.198002  0.20440     NaN
# 2  profile_1    1.1        NaN  0.20442     NaN
# 3  profile_1    1.2        NaN  0.20446  15.188
# 4  profile_1    1.3  38.200001      NaN  15.182
# 5  profile_1    1.4        NaN      NaN  15.182
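
For comparison, the same result can probably also be reached by chaining outer merges on the two key columns. This is only a sketch of an alternative, not part of the answer above: the names frames, keys and merged are made up for illustration, and it assumes the depth values match exactly as floats, which holds for the sample data.

```python
from functools import reduce

import pandas as pd

df1 = pd.DataFrame({'depth': [0.500000, 0.600000, 1.300000],
                    'VAR1': [38.196202, 38.198002, 38.200001],
                    'profile': ['profile_1', 'profile_1', 'profile_1']})

df2 = pd.DataFrame({'depth': [0.600000, 1.100000, 1.200000],
                    'VAR2': [0.20440, 0.20442, 0.20446],
                    'profile': ['profile_1', 'profile_1', 'profile_1']})

df3 = pd.DataFrame({'depth': [1.200000, 1.300000, 1.400000],
                    'VAR3': [15.1880, 15.1820, 15.1820],
                    'profile': ['profile_1', 'profile_1', 'profile_1']})

# Hypothetical names: any list of frames sharing the key columns would do.
frames = [df1, df2, df3]
keys = ['profile', 'depth']

# Fold the list into one frame; how='outer' keeps every (profile, depth)
# pair that appears in any input and fills the missing values with NaN.
merged = reduce(
    lambda left, right: pd.merge(left, right, on=keys, how='outer'),
    frames,
)

# Sort explicitly rather than relying on the row order the merge returns.
merged = merged.sort_values(keys).reset_index(drop=True)

print(merged)
```

With more than ten frames, only the frames list needs to grow. One difference worth keeping in mind: merge accepts repeated (profile, depth) pairs within a frame and multiplies the matching rows, so it may be worth checking the keys for duplicates first if each pair is expected to appear only once per frame.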
