Caveats while checking dtype in pandas DataFrame

Published: 2020/5/24 0:45:03. Tags: python, pandas, dataframe


Problem description

Guided by this answer, I started building a pipeline that processes DataFrame columns according to their dtype. After some unexpected output and debugging, I ended up with the following test DataFrame and dtype checks:

import numpy as np
import pandas as pd

# Creating test dataframe
test = pd.DataFrame({'bool' :[False, True], 'int':[-1,2],'float': [-2.5, 3.4],
                     'compl':np.array([1-1j, 5]),
                     'dt'   :[pd.Timestamp('2013-01-02'), pd.Timestamp('2016-10-20')],
                     'td'   :[pd.Timestamp('2012-03-02')- pd.Timestamp('2016-10-20'),
                              pd.Timestamp('2010-07-12')- pd.Timestamp('2000-11-10')],
                     'prd'  :[pd.Period('2002-03','D'), pd.Period('2012-02-01', 'D')],
                     'intrv':pd.arrays.IntervalArray([pd.Interval(0, 0.1), pd.Interval(1, 5)]),
                     'str'  :['s1', 's2'],
                     'cat'  :[1, -1],
                     'obj'  :[[1,2,3], [5435,35,-52,14]]
                    })
test['cat'] = test['cat'].astype('category')
test
test.dtypes

# Testing types
types = list(test.columns)
df_types = pd.DataFrame(np.zeros((len(types),len(types)), dtype=bool),
                        index = ['is_'+el for el in types],
                        columns = types)
for col in test.columns:
    df_types.at['is_bool', col] = pd.api.types.is_bool_dtype(test[col])
    df_types.at['is_int' , col] = pd.api.types.is_integer_dtype(test[col])
    df_types.at['is_float',col] = pd.api.types.is_float_dtype(test[col])
    df_types.at['is_compl',col] = pd.api.types.is_complex_dtype(test[col])
    df_types.at['is_dt'  , col] = pd.api.types.is_datetime64_dtype(test[col])
    df_types.at['is_td'  , col] = pd.api.types.is_timedelta64_dtype(test[col])
    df_types.at['is_prd' , col] = pd.api.types.is_period_dtype(test[col])
    df_types.at['is_intrv',col] = pd.api.types.is_interval_dtype(test[col])
    df_types.at['is_str' , col] = pd.api.types.is_string_dtype(test[col])
    df_types.at['is_cat' , col] = pd.api.types.is_categorical_dtype(test[col])
    df_types.at['is_obj' , col] = pd.api.types.is_object_dtype(test[col])

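# Styling function: green where a cell matches the expected pattern, red otherwise
def coloring(df):
    clr_g = 'color : green'
    clr_r = 'color : red'
    # True where the result agrees with the identity matrix,
    # i.e. each check fires on its own column and nowhere else
    mask = ~np.logical_xor(df.values, np.eye(df.shape[0], dtype=bool))
    # Return a DataFrame of CSS strings with the same shape as df
    return pd.DataFrame(np.where(mask, clr_g, clr_r),
                        index = df.index,
                        columns = df.columns)

# Display the colored table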
df_types.style.apply(coloring, axis=None)

Output of test.dtypes:

bool                  bool
int                  int64
float              float64
compl           complex128
dt          datetime64[ns]
td         timedelta64[ns]
prd              period[D]
intrv    interval[float64]
str                 object
cat               category
obj                 object

Almost everything works as expected, but this test code raises two questions:

The strangest result is that pd.api.types.is_string_dtype fires on the category dtype. Why is that? Should it be treated as expected behavior?
Why do is_string_dtype and is_object_dtype fire on each other's columns? This is somewhat expected, because .dtypes reports both columns as object, but a step-by-step explanation would help.
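
One way to probe these questions without the helper functions is to look at what the dtype itself reports. The following is a minimal sketch, not an account of the pandas internals: it assumes only that `dtype.kind` and `pd.api.types.infer_dtype` are available, and the series names and the `holds_only_strings` helper are made up for illustration. It shows that strings, lists, and categoricals all report the kind code 'O', so a check that looks only at the dtype cannot tell them apart, whereas `infer_dtype` inspects the stored values.

```python
import pandas as pd

# Hypothetical example series; dtype=object is set explicitly so the result
# does not depend on the default string dtype of the installed pandas version.
strings = pd.Series(['s1', 's2'], dtype=object)
lists = pd.Series([[1, 2, 3], [5435, 35, -52, 14]], dtype=object)
cats = pd.Series([1, -1]).astype('category')

named = [('str', strings), ('obj', lists), ('cat', cats)]

# All three dtypes report the same kind code, 'O' (object).
kinds = {name: s.dtype.kind for name, s in named}
print(kinds)

# infer_dtype looks at the values rather than at the dtype.
inferred = {name: pd.api.types.infer_dtype(s, skipna=True) for name, s in named}
print(inferred)


def holds_only_strings(series):
    """Value-level check (a helper assumed for this sketch, not a pandas API)."""
    return pd.api.types.infer_dtype(series, skipna=True) == 'string'


print(holds_only_strings(strings), holds_only_strings(lists))
```

A value-level check like this scans the column, so it is slower than a dtype-level check, which may matter in a pipeline that runs over many wide frames.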

P.S. Bonus question: am I right in thinking that pandas has internal tests that must pass before a new release is built (like df_types in the test code, but recording information about errors rather than coloring cells red)?
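
The idea of recording errors instead of coloring cells could look like the sketch below. It does not describe how pandas tests itself; it only reuses the convention of the test code above, where each `is_<name>` row should be True in the `<name>` column and False elsewhere. The function name `find_mismatches` and the small example table are hypothetical.

```python
import numpy as np
import pandas as pd


def find_mismatches(df_types):
    """Return (check, column, value) for every cell that breaks the diagonal pattern.

    Assumes a square boolean table whose rows and columns are in the same order.
    """
    expected = np.eye(len(df_types), dtype=bool)
    actual = df_types.to_numpy(dtype=bool)
    rows, cols = np.nonzero(actual != expected)
    return [(df_types.index[r], df_types.columns[c], bool(actual[r, c]))
            for r, c in zip(rows, cols)]


# Hypothetical 3x3 result table in which 'is_str' also fires on 'cat'.
example = pd.DataFrame(
    [[True, False, False],
     [False, True, True],
     [False, False, True]],
    index=['is_int', 'is_str', 'is_cat'],
    columns=['int', 'str', 'cat'],
)

for check, column, value in find_mismatches(example):
    print(f"{check} returned {value} for column {column!r}")
```

For the example table this reports a single mismatch: `is_str` returned True for the `cat` column. The same list could be fed to an assertion or a logger instead of `print`.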

pandas version 0.24.2.
