Categorical dtype changes after using melt
Question
In answering this question, I found that after using melt on a pandas dataframe, a column that was previously an ordered Categorical dtype becomes object. Is this intended behaviour?
Note: I am not looking for a solution, just wondering whether there is a reason for this behaviour or whether it is unintended.
Example:
Using the following dataframe df:
  Cat  L_1  L_2  L_3
0   A    1    2    3
1   B    4    5    6
2   C    7    8    9
df['Cat'] = pd.Categorical(df['Cat'], categories = ['C','A','B'], ordered=True)
# As you can see `Cat` is a category
>>> df.dtypes
Cat category
L_1 int64
L_2 int64
L_3 int64
dtype: object
melted = df.melt('Cat')
>>> melted
  Cat variable  value
0   A      L_1      1
1   B      L_1      4
2   C      L_1      7
3   A      L_2      2
4   B      L_2      5
5   C      L_2      8
6   A      L_3      3
7   B      L_3      6
8   C      L_3      9
Now, if I look at Cat, it has become object:
>>> melted.dtypes
Cat object
variable object
value int64
dtype: object
Is this intended?
Recommended answer
In the pandas 0.22.0 source code (my old version), melt builds the id columns like this:
for col in id_vars:
    mdata[col] = np.tile(frame.pop(col).values, K)
mcolumns = id_vars + var_name + [value_name]
np.tile converts the categorical values into a plain NumPy array, so the column comes back with object dtype.
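The effect of np.tile can be reproduced outside of melt. The snippet below is a minimal sketch, not the pandas implementation: the variable names cat and tiled are made up for illustration, and the data mirrors the Cat column from the question.

```python
import numpy as np
import pandas as pd

# Ordered categorical mirroring the `Cat` column from the question.
cat = pd.Series(
    pd.Categorical(["A", "B", "C"], categories=["C", "A", "B"], ordered=True)
)

# np.tile only knows about NumPy arrays, so the Categorical is first
# converted to a plain array and the category information is dropped.
tiled = np.tile(cat.values, 3)

print(type(tiled).__name__)  # container type returned by np.tile
print(tiled.dtype)           # dtype of the tiled result
```

Running it should show that the result is an ordinary ndarray rather than a Categorical, which is why the melted column no longer reports the category dtype.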
It has been fixed in 0.23.4 (the version I have after updating pandas):
df.melt('Cat')
Out[6]:
  Cat variable  value
0   A      L_1      1
1   B      L_1      4
2   C      L_1      7
3   A      L_2      2
4   B      L_2      5
5   C      L_2      8
6   A      L_3      3
7   B      L_3      6
8   C      L_3      9
df.melt('Cat').dtypes
Out[7]:
Cat category
variable object
value int64
dtype: object
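To check which behaviour an installed pandas version has, a self-contained script along these lines may help. It rebuilds the dataframe from the question; the assumption is that the installed version includes the fix, in which case Cat should still be reported as category.

```python
import pandas as pd

# Rebuild the dataframe from the question.
df = pd.DataFrame(
    {"Cat": ["A", "B", "C"], "L_1": [1, 4, 7], "L_2": [2, 5, 8], "L_3": [3, 6, 9]}
)
df["Cat"] = pd.Categorical(df["Cat"], categories=["C", "A", "B"], ordered=True)

melted = df.melt("Cat")

print(pd.__version__)
print(melted.dtypes)
# On a version with the fix, the ordering of the categories survives too.
print(melted["Cat"].cat.ordered)
```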
More detail on how it was fixed:
for col in id_vars:
    id_data = frame.pop(col)
    if is_extension_type(id_data):  # True for a categorical column, so concat is used instead of np.tile
        id_data = concat([id_data] * K, ignore_index=True)
    else:
        id_data = np.tile(id_data.values, K)
    mdata[col] = id_data
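The concat branch can be tried in isolation. This is a rough sketch using the public pd.concat rather than the internal helpers from the excerpt above; K and the sample data are chosen here to match the three value columns in the question.

```python
import pandas as pd

# Ordered categorical mirroring the `Cat` column from the question.
cat = pd.Series(
    pd.Categorical(["A", "B", "C"], categories=["C", "A", "B"], ordered=True)
)

K = 3  # number of value columns being melted (L_1, L_2, L_3)

# Concatenating K copies of the same categorical Series keeps the
# category dtype, because every piece shares identical categories.
repeated = pd.concat([cat] * K, ignore_index=True)

print(repeated.dtype)
print(repeated.cat.ordered)
```

Because every piece has the same categories and ordering, the concatenated result keeps them, which matches the dtypes output shown for 0.23.4.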