pandas dataframe concat is giving unwanted NA/NaN columns

Question

Unlike the question "After Pandas Dataframe pd.concat I get NaNs", where the concatenation is horizontal, I'm trying to concatenate vertically:

import pandas
a=[['Date', 'letters', 'numbers', 'mixed'], ['1/2/2014', 'a', '6', 'z1'], ['1/2/2014', 'a', '3', 'z1'], ['1/3/2014', 'c', '1', 'x3']]
df = pandas.DataFrame.from_records(a[1:],columns=a[0])

f=[]
for i in range(0,len(df)):
    f.append(df['Date'][i] + ' ' + df['letters'][i])

df['new']=f

c=[x for x in range(0,5)]
b=[]
b += [['NA'] * (5 - len(b))]
df_a = pandas.DataFrame.from_records(b,columns=c)

df_b=pandas.concat([df,df_a], ignore_index=True)

df_b gives the same output as df_b = pandas.concat([df,df_a], axis=0).

Result:

     0    1    2    3    4      Date letters mixed         new numbers
0  NaN  NaN  NaN  NaN  NaN  1/2/2014       a    z1  1/2/2014 a       6
1  NaN  NaN  NaN  NaN  NaN  1/2/2014       a    z1  1/2/2014 a       3
2  NaN  NaN  NaN  NaN  NaN  1/3/2014       c    x3  1/3/2014 c       1
0   NA   NA   NA   NA   NA       NaN     NaN   NaN         NaN     NaN

Desired:

       Date letters numbers mixed         new
0  1/2/2014       a       6    z1  1/2/2014 a
1  1/2/2014       a       3    z1  1/2/2014 a
2  1/3/2014       c       1    x3  1/3/2014 c
0        NA      NA      NA    NA          NA

Recommended answer

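I would create the dataframe df_a with the correct columns directly. pandas.concat aligns frames on their column labels, so a df_a whose columns are labelled 0 to 4 matches none of df's columns and ends up as five extra columns.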
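To make the alignment behaviour concrete, here is a minimal sketch. The frame names `left`, `mismatched`, and `matched` are made up for illustration and are not part of the original code; it assumes a reasonably recent pandas.

```python
import pandas as pd

left = pd.DataFrame({"Date": ["1/2/2014"], "letters": ["a"]})

# Column labels 0 and 1 match neither "Date" nor "letters" ...
mismatched = pd.DataFrame([["NA", "NA"]], columns=[0, 1])
# ... whereas these labels are copied from `left`, so they match exactly.
matched = pd.DataFrame([["NA", "NA"]], columns=left.columns)

# concat takes the union of column labels and fills the gaps with NaN.
wide = pd.concat([left, mismatched], ignore_index=True)
tall = pd.concat([left, matched], ignore_index=True)

print(wide.shape)  # (2, 4)
print(tall.shape)  # (2, 2)
```

With mismatched labels the result grows sideways and every row has NaN in the columns it did not supply; with matching labels the new row is simply stacked underneath.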

With a little refactoring of your code, this becomes:

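import pandas

# The first row holds the column names, the remaining rows hold the data.
a = [['Date', 'letters', 'numbers', 'mixed'],
     ['1/2/2014', 'a', '6', 'z1'],
     ['1/2/2014', 'a', '3', 'z1'],
     ['1/3/2014', 'c', '1', 'x3']]
df = pandas.DataFrame.from_records(a[1:], columns=a[0])
# Vectorized string concatenation replaces the explicit loop.
df['new'] = df['Date'] + ' ' + df['letters']

# Build the 'NA' row with the same column labels as df so concat aligns them.
n = len(df.columns)
b = [['NA'] * n]
df_a = pandas.DataFrame.from_records(b, columns=df.columns)
df_b = pandas.concat([df, df_a])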

This gives:

       Date letters numbers mixed         new
0  1/2/2014       a       6    z1  1/2/2014 a
1  1/2/2014       a       3    z1  1/2/2014 a
2  1/3/2014       c       1    x3  1/3/2014 c
0        NA      NA      NA    NA          NA

Finally:

df_b = pandas.concat([df,df_a]).reset_index(drop=True)

which gives:

       Date letters numbers mixed         new
0  1/2/2014       a       6    z1  1/2/2014 a
1  1/2/2014       a       3    z1  1/2/2014 a
2  1/3/2014       c       1    x3  1/3/2014 c
3        NA      NA      NA    NA          NA
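As a side note, the `ignore_index=True` argument that the question's code already passes to `concat` should renumber the rows in the same way as `reset_index(drop=True)`. The sketch below uses a shortened two-column frame for brevity, and the names `via_reset` and `via_flag` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"Date": ["1/2/2014", "1/3/2014"], "letters": ["a", "c"]})
# One 'NA' row carrying the same column labels as df.
df_a = pd.DataFrame([["NA"] * len(df.columns)], columns=df.columns)

# Option 1: concatenate, then discard the old index.
via_reset = pd.concat([df, df_a]).reset_index(drop=True)
# Option 2: let concat build a fresh 0..n-1 index itself.
via_flag = pd.concat([df, df_a], ignore_index=True)

print(via_flag.index.tolist())  # [0, 1, 2]
```

Either form avoids the duplicate index label 0 seen in the intermediate output.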
