Pandas concat increases the number of rows


Question

I'm concatenating two dataframes column-wise, so that the columns of one are placed next to the columns of the other. But first I applied a transformation to the initial dataframe:

import pandas as pd
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
real_data = pd.DataFrame(scaler.fit_transform(df[real_columns]), columns=real_columns)

and then concatenated:

categorial_data = pd.get_dummies(df[categor_columns], prefix_sep='__')
train = pd.concat([real_data, categorial_data], axis=1, ignore_index=True)

I don't know why, but the number of rows increased:

print(df.shape, real_data.shape, categorial_data.shape, train.shape)
(1700645, 23) (1700645, 16) (1700645, 130) (1703915, 146)

As you can see, the number of columns in train equals the sum of the columns in real_data and categorial_data, but train has 3,270 more rows than either input.

Recommended answer

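The problem is that the original index persists after you perform several operations on a dataframe. pd.concat with axis=1 aligns rows by index label, not by position. real_data is built from the NumPy array returned by fit_transform, so it gets a fresh 0..n-1 index, while categorial_data keeps the index of df. If df's index is not a plain 0..n-1 range (most likely because rows were filtered out earlier), some labels exist in only one of the two frames, and concat returns the union of both indexes with NaN in the unmatched cells. Note that ignore_index=True does not help here: with axis=1 it only renumbers the column labels. Resetting the index before the transformations, for example df = df.reset_index(drop=True), will solve your problem.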
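The index mismatch can be reproduced on a tiny frame. The following is a minimal sketch, not the asker's actual data: the column names "x" and "cat", the index values, and the division standing in for MinMaxScaler are all assumptions made for illustration. It shows the row count growing when the indexes differ, and two ways to keep the rows aligned.

```python
import pandas as pd

# Hypothetical toy frame with a non-contiguous index, as row filtering leaves behind.
df = pd.DataFrame(
    {"x": [1.0, 2.0, 3.0], "cat": ["a", "b", "a"]},
    index=[0, 2, 5],
)

# Stand-in for scaler.fit_transform: any step that returns a NumPy array
# drops the index, so the new frame is labelled 0, 1, 2.
real_data = pd.DataFrame(df[["x"]].to_numpy() / 3.0, columns=["x"])

# get_dummies keeps the original labels 0, 2, 5.
categorial_data = pd.get_dummies(df[["cat"]], prefix_sep="__")

# axis=1 aligns on index labels, so the result covers the union {0, 1, 2, 5}.
misaligned = pd.concat([real_data, categorial_data], axis=1)
print(misaligned.shape)  # (4, 3): one more row than either input

# Fix A: drop the old labels so both frames use 0..n-1.
fixed_a = pd.concat(
    [real_data, categorial_data.reset_index(drop=True)], axis=1
)

# Fix B: keep the original labels by passing them to the new frame.
real_b = pd.DataFrame(real_data.to_numpy(), columns=["x"], index=df.index)
fixed_b = pd.concat([real_b, categorial_data], axis=1)

print(fixed_a.shape, fixed_b.shape)  # (3, 3) (3, 3)
```

Fix A matches the reset_index suggestion in the answer. Fix B is an alternative for cases where the original row labels need to survive the concatenation.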
