Why does transposing a DataFrame with strings and timedeltas convert the dtype?


Question

This behavior seems odd to me: the id column (strings) gets converted to timedeltas upon transposing the df if the other column is a timedelta.

import pandas as pd
df = pd.DataFrame({'id': ['00115', '01222', '32333'],
                   'val': [12, 14, 170]})
# Note: unit='M' (months) was accepted by older pandas; recent releases reject it as ambiguous.
df['val'] = pd.to_timedelta(df.val, unit='M')

print(df.T)
#                         0                      1                      2
#id  0 days 00:00:00.000000 0 days 00:00:00.000001 0 days 00:00:00.000032
#val      365 days 05:49:12      426 days 02:47:24     5174 days 06:27:00

type(df.T[0][0])
#pandas._libs.tslib.Timedelta

Without the timedelta it works as I'd expect: the id column remains a string, even though the other column is an integer and all of the strings could be safely cast to integers.

df2 = pd.DataFrame({'id': ['00115', '01222', '32333'],
                    'val': [1, 1231, 1413]})

type(df2.T[0][0])
#str

Why does the type of id get changed in the first instance, but not the second?

Recommended answer

A DataFrame is best thought of column by column: each column must have a single data type. When you transpose, you change which cells share a column. Prior to the transpose, you had a string column and a timedelta column. After the transpose, each column holds one string and one timedelta, so pandas has to decide how to cast the new columns. It decided to go with timedelta. In my opinion, this is a goofy choice.
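To make the "one dtype per column" point concrete, here is a minimal sketch using the df2 frame from the question (strings and integers only, so it does not depend on how a given pandas version treats timedeltas). The names transposed and first_id are introduced here for illustration.

```python
import pandas as pd

# Same frame as df2 in the question: a string column and an integer column.
df2 = pd.DataFrame({'id': ['00115', '01222', '32333'],
                    'val': [1, 1231, 1413]})

# Before the transpose, each column has its own dtype.
print(df2.dtypes)

# After the transpose, every new column holds one id and one val,
# so pandas must pick a single dtype that can hold both.
transposed = df2.T
print(transposed.dtypes)

# The id values are still plain Python strings.
first_id = transposed[0]['id']
print(type(first_id))
```

With strings and integers the only dtype that fits both is object, so nothing is coerced. The surprise in the question comes from the timedelta case being resolved differently.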

You can change this behavior by constructing a new DataFrame from the transposed values and setting its dtype to object.

pd.DataFrame(df.values.T, index=df.columns, columns=df.index, dtype=object)

                     0                  1                   2
id               00115              01222               32333
val  365 days 05:49:12  426 days 02:47:24  5174 days 06:27:00
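For reference, a self-contained sketch of the same workaround. One assumption to flag: it uses unit='D' instead of the question's unit='M', since recent pandas releases no longer accept month units in to_timedelta. The name result is introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'id': ['00115', '01222', '32333'],
                   'val': [12, 14, 170]})
# Assumption: unit='D' stands in for the question's unit='M'.
df['val'] = pd.to_timedelta(df['val'], unit='D')

# df.values is an object array for mixed columns. Transpose it and rebuild
# the frame with the old columns as the index and the old index as columns.
# dtype=object tells pandas to keep each cell as-is instead of casting.
result = pd.DataFrame(df.values.T,
                      index=df.columns,
                      columns=df.index,
                      dtype=object)

print(result)
print(type(result.loc['id', 0]))
print(type(result.loc['val', 0]))
```

Here the id row should keep its strings and the val row its Timedelta objects, at the cost of every column being object dtype.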
