Data type of pandas column changes to object when it's passed to a function via apply?
Question
I need to use the dtype of a pandas column inside a function, but for some reason the dtype changes to object when I call the function via apply. Does anyone know what is happening here?
import pandas as pd
df = pd.DataFrame({'stringcol':['a'], 'floatcol': [1.5]})
df.dtypes
Out[1]:
floatcol float64
stringcol object
dtype: object
df.apply(lambda col: col.dtype)
Out[2]:
floatcol object
stringcol object
dtype: object
Note that this problem doesn't happen if the column is passed to the function directly:
f = lambda col: col.dtype
f(df.floatcol)
Out[3]: dtype('float64')
Recommended answer
It appears to be due to an optimization in DataFrame._apply_standard. The "fast path" in that method's code creates an output Series whose dtype is the dtype of df.values, which in your case is object because the DataFrame is of mixed type. If you pass reduce=False to your apply call, the result is correct:
>>> df.apply(lambda col: col.dtype, reduce=False)
floatcol float64
stringcol object
dtype: object
(I must say that it is not clear to me how this behavior of reduce jibes with the documentation.)
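To make the explanation above concrete, here is a minimal sketch that inspects the dtype of df.values for the same mixed-type frame. It also reads the per-column dtypes without going through apply at all. The variable names (values_dtype, per_column, dtypes_by_loop) are illustrative and not from the original post. As far as I know, the reduce argument was deprecated in later pandas releases, so the sketch avoids it. Treat it as one possible workaround, not the canonical fix.

```python
import pandas as pd

df = pd.DataFrame({'stringcol': ['a'], 'floatcol': [1.5]})

# A frame mixing strings and floats collapses to a single object array;
# this is the dtype the answer says the fast path reuses for its output.
values_dtype = df.values.dtype

# The per-column dtypes are available directly, with no apply involved.
per_column = df.dtypes

# If a function needs each column's dtype, iterating over the columns
# hands it the original Series, so the dtype is preserved.
dtypes_by_loop = {name: col.dtype for name, col in df.items()}

print(values_dtype)
print(per_column)
print(dtypes_by_loop)
```

Iterating with df.items() sidesteps apply's result-construction logic entirely, which is why the float column keeps its float64 dtype here.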