Pandas DataFrame double transpose changes numeric types to object

Question

I'm reading the header and the data for a DataFrame from two separate locations in Excel (the two are aligned properly but not adjacent). The header potentially contains many blanks, so I need to discard those headers and the corresponding columns in the data. My final frame therefore has only non-null headers and the data corresponding to them. The logic below, which uses transposition, works, but I lose the data types after the double transposition (see the specific example below). Two questions:

1) Any suggestion on how I can achieve this without transposition?
2) Is this how transposition is supposed to work? Should it not infer the dtypes again on the second transposition?

In [25]:

hd = pd.DataFrame({0: ['num'],
                   1: np.nan,
                   2: ['ltr']})
hd
Out[25]:
     0   1    2
0  num NaN  ltr

In [26]:

data = pd.DataFrame({0: np.arange(3),
                     1: ['a', 'b', 'c'],
                     2: ['d', 'e', 'f']})
data
Out[26]:
   0  1  2
0  0  a  d
1  1  b  e
2  2  c  f

In [27]:

df = data.T[hd.iloc[0].notnull()].T
df.columns = hd.iloc[0].dropna()
df
Out[27]:
  num ltr
0   0   d
1   1   e
2   2   f

In [28]:

df.dtypes
Out[28]:
0
num    object
ltr    object
dtype: object

Recommended answer

Transposition converts the dtypes to object when the frame has mixed dtypes to begin with. This is expected, because dtypes are column-based. You can use df.convert_objects() if you want to re-infer them (in recent pandas versions convert_objects() has been removed, and df.infer_objects() does this job).
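To make the dtype behavior concrete, here is a minimal sketch using the `data` frame from the question. It assumes a recent pandas in which `infer_objects()` is the method available for re-inference; the variable names `transposed`, `round_trip`, and `restored` are only illustrative.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({0: np.arange(3),
                     1: ['a', 'b', 'c'],
                     2: ['d', 'e', 'f']})

# After transposing, each new column holds one int and two strings,
# so every column falls back to the object dtype.
transposed = data.T

# Transposing back does not re-infer dtypes; the columns stay object.
round_trip = transposed.T

# infer_objects() attempts a soft conversion of object columns
# to a more specific dtype where the values allow it.
restored = round_trip.infer_objects()

print(transposed.dtypes)
print(round_trip.dtypes)
print(restored.dtypes)
```

The integer column is only recovered by the explicit re-inference step, which is why avoiding the transpose altogether is usually the simpler route.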

However, you can avoid the transposition entirely by selecting the columns with a boolean mask:

In [10]: data.loc[:,hd.iloc[0].notnull()]
Out[10]: 
   0  2
0  0  d
1  1  e
2  2  f

In [11]: data.loc[:,hd.iloc[0].notnull()].dtypes
Out[11]: 
0     int64
2    object
dtype: object
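The selection above keeps the original integer column labels. A possible way to finish the task from the question, sketched here under the assumption that the header row and the data share the same column labels, is to reuse the same mask to pick the non-null header values as the new column names. The names `header` and `mask` are illustrative, not part of the original answer.

```python
import numpy as np
import pandas as pd

hd = pd.DataFrame({0: ['num'], 1: np.nan, 2: ['ltr']})
data = pd.DataFrame({0: np.arange(3),
                     1: ['a', 'b', 'c'],
                     2: ['d', 'e', 'f']})

header = hd.iloc[0]       # the single header row as a Series
mask = header.notnull()   # True where a header value is present

# Column-wise boolean selection keeps each column's dtype intact.
df = data.loc[:, mask].copy()

# Rename the surviving columns with the matching non-null headers.
df.columns = header[mask].tolist()

print(df)
print(df.dtypes)
```

Because no transpose is involved, the `num` column keeps its integer dtype and no re-inference step is needed.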
