How to merge two rows in a pandas DataFrame
I have a DataFrame with two rows that I'd like to merge into a single row. The df looks as follows:
      PC Rating CY Rating PY  HT
0  DE101       NaN        AA  GV
0  DE101       AA+       NaN  GV
I have tried to create two separate DataFrames and combine them with df.merge(df2), without success. The result should be the following:
      PC Rating CY Rating PY  HT
0  DE101       AA+        AA  GV
Any ideas? Thanks in advance. Could df.update be a possible solution?
EDIT:
df.head(1).combine_first(df.tail(1))
This works for the example above. However, for columns containing numerical values, this approach doesn't yield the desired output, e.g. for
      PC Rating CY Rating PY  HT  MV1  MV2
0  DE101       NaN        AA  GV    0   20
0  DE101       AA+       NaN  GV   10    0
The output should be:
      PC Rating CY Rating PY  HT  MV1  MV2
0  DE101       AA+        AA  GV   10   20
The expression above doesn't sum the values in the last two columns; instead it takes the values from the first row of the DataFrame:
      PC Rating CY Rating PY  HT  MV1  MV2
0  DE101       AA+        AA  GV    0   20
How could this problem be fixed?
You can use the DataFrame.combine_first() method after splitting the DataFrame into two parts: null values in the first part are replaced with the non-null values from the second part, while the first part's existing non-null values are left untouched:
df.head(1).combine_first(df.tail(1))
# In practice this is the same as df.head(1).fillna(df.tail(1))
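As a minimal, self-contained sketch of the call above, the two-row frame from the question can be rebuilt by hand. The construction below is an assumption about how the data was loaded; only the column names and values come from the question.

```python
import numpy as np
import pandas as pd

# Rebuild the two-row frame from the question (assumed construction);
# both rows deliberately share the index label 0, as in the question.
df = pd.DataFrame(
    {
        "PC": ["DE101", "DE101"],
        "Rating CY": [np.nan, "AA+"],
        "Rating PY": ["AA", np.nan],
        "HT": ["GV", "GV"],
    },
    index=[0, 0],
)

# Fill the nulls of the first row with the values of the second row.
merged = df.head(1).combine_first(df.tail(1))
print(merged)
```

Because both single-row slices carry the same index label, the result has one row in which each null has been filled from the other row.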
If the DataFrame has columns of different dtypes, partition it by dtype, apply the appropriate operation to each part (combine_first for the text columns, addition for the numeric ones), and join the results:
# np.object was removed in NumPy 1.24, so split on the "number" dtype instead
num_df = df.select_dtypes(include="number")
obj_df = df.select_dtypes(exclude="number")
obj_df.head(1).combine_first(obj_df.tail(1)).join(num_df.head(1).add(num_df.tail(1)))
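For the frame with numeric columns, the following sketch rebuilds the second example from the question and applies the split-and-join approach end to end. The helper name merge_two_rows is hypothetical, and using the "number" dtype selector to separate text from numeric columns is an assumption, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Second example from the question (assumed construction); both rows
# share the index label 0.
df = pd.DataFrame(
    {
        "PC": ["DE101", "DE101"],
        "Rating CY": [np.nan, "AA+"],
        "Rating PY": ["AA", np.nan],
        "HT": ["GV", "GV"],
        "MV1": [0, 10],
        "MV2": [20, 0],
    },
    index=[0, 0],
)


def merge_two_rows(frame):
    """Hypothetical helper: collapse a two-row frame into one row."""
    num = frame.select_dtypes(include="number")  # columns to be summed
    txt = frame.select_dtypes(exclude="number")  # columns to be null-filled
    filled = txt.head(1).combine_first(txt.tail(1))
    summed = num.head(1).add(num.tail(1))
    # Both pieces have the single index label 0, so the join yields one row.
    return filled.join(summed)


result = merge_two_rows(df)
print(result)
```

This relies on both rows sharing the same index label (0 in the question). With distinct labels, combine_first and add would align on the index and keep two rows, so the index would need to be reset to a common label first.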