Unwanted type conversion in pandas.DataFrame.update

Question

Is there a reason why pandas changes a column's dtype from int to float during `update`, and can I prevent it? Here is some example code that reproduces the problem:

import pandas as pd
import numpy as np

df = pd.DataFrame({'int': [1, 2], 'float': [np.nan, np.nan]})

print('Integer column:')
print(df['int'])

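for _, df_sub in df.groupby('int'):
    # Work on a copy so the assignment below does not hit a view of df
    df_sub = df_sub.copy()
    # Fill the 'float' column from the group's 'int' values
    df_sub['float'] = df_sub['int'].astype(float)
    df.update(df_sub)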

print('NO integer column:')
print(df['int']) 

Recommended answer

Here is the reason: since you are effectively masking certain values in a column and replacing them (with your updates), some values could become `NaN`.

In an integer array this is impossible, so numeric dtypes are converted to float a priori (for efficiency), as checking first is more expensive than simply converting.
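To illustrate the point about integer arrays, here is a minimal sketch, independent of `update` itself, showing that a NumPy-backed integer Series is upcast to float as soon as a missing value has to be represented. The variable names are made up for the illustration.

```python
import numpy as np
import pandas as pd

ints = pd.Series([1, 2], dtype='int64')

# Reindexing onto a label that does not exist introduces a missing value,
# which an int64 array cannot hold, so the result is upcast to float64.
reindexed = ints.reindex([0, 1, 2])
print(reindexed.dtype)

# Masking has the same effect: entries where the condition is False become NaN.
masked = ints.where(ints > 1)
print(masked.dtype)
```

Both results come back as float64 even though every surviving value is a whole number.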

Changing the dtype back is possible, it is just not in the code right now, so this is a bug (though a bit non-trivial to fix): github.com/pydata/pandas/issues/4094
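Until that is handled inside pandas, one possible workaround is to record the dtypes before calling `update` and cast back afterwards. This is a sketch rather than anything the answer prescribes: `original_dtypes` and `update_rows` are hypothetical names, and the cast back only succeeds if the integer column contains no `NaN` after the update.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'int': [1, 2], 'float': [np.nan, np.nan]})

# Remember the dtypes before updating (column name -> dtype).
original_dtypes = df.dtypes.to_dict()

# Hypothetical partial update that only covers the first row.
update_rows = pd.DataFrame({'int': [1], 'float': [1.0]}, index=[0])
df.update(update_rows)

# Cast every column back to its original dtype.
# This raises if an integer column now contains NaN.
df = df.astype(original_dtypes)

print(df.dtypes)
```

Whether the cast is needed at all may depend on the pandas version in use, so checking `df.dtypes` after the update is a reasonable first step.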
