How to determine whether a column/variable is numeric or not in Pandas/NumPy?

Question

Is there a better way to determine whether a variable in Pandas and/or NumPy is numeric?

I currently have a self-defined dictionary with dtypes as keys and numeric / not numeric as values.

Recommended answer

You can use np.issubdtype to check whether the dtype is a sub-dtype of np.number. Examples:

np.issubdtype(arr.dtype, np.number)  # where arr is a numpy array
np.issubdtype(df['X'].dtype, np.number)  # where df['X'] is a pandas Series

This works for NumPy's dtypes but fails for pandas-specific types such as pd.Categorical, as Thomas noted. If you are using categoricals, the is_numeric_dtype function from pandas is a better alternative than np.issubdtype.
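A minimal sketch of is_numeric_dtype on pandas-specific dtypes, assuming it is imported from pandas.api.types. The sample Series here are hypothetical and not part of the original answer. One caveat worth knowing: pandas treats bool as numeric, while np.issubdtype with np.number does not.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical sample data, not taken from the original answer.
cat = pd.Series(["a", "b", "a"], dtype="category")      # pandas categorical
nullable_int = pd.Series([1, 2, None], dtype="Int64")   # pandas nullable integer
flags = pd.Series([True, False, True])                  # plain bool column

cat_is_numeric = is_numeric_dtype(cat)
int_is_numeric = is_numeric_dtype(nullable_int)
bool_is_numeric = is_numeric_dtype(flags)                    # pandas counts bool as numeric
bool_is_np_number = np.issubdtype(flags.dtype, np.number)    # NumPy does not

print(cat_is_numeric, int_is_numeric, bool_is_numeric, bool_is_np_number)
# False True True False
```

If bool columns should not count as numeric in your use case, you may need to exclude them explicitly.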

df = pd.DataFrame({'A': [1, 2, 3], 'B': [1.0, 2.0, 3.0], 
                   'C': [1j, 2j, 3j], 'D': ['a', 'b', 'c']})
df
Out: 
   A    B   C  D
0  1  1.0  1j  a
1  2  2.0  2j  b
2  3  3.0  3j  c

df.dtypes
Out: 
A         int64
B       float64
C    complex128
D        object
dtype: object


np.issubdtype(df['A'].dtype, np.number)
Out: True

np.issubdtype(df['B'].dtype, np.number)
Out: True

np.issubdtype(df['C'].dtype, np.number)
Out: True

np.issubdtype(df['D'].dtype, np.number)
Out: False

For multiple columns you can use np.vectorize:

is_number = np.vectorize(lambda x: np.issubdtype(x, np.number))
is_number(df.dtypes)
Out: array([ True,  True,  True, False], dtype=bool)
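As an alternative sketch for the multi-column case, the check can also be written as a plain loop over the columns with pandas' is_numeric_dtype (assumed here to be imported from pandas.api.types), which also copes with pandas-specific dtypes. The DataFrame is rebuilt so the snippet runs on its own; the name numeric_flags is only illustrative.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Same frame as in the answer, rebuilt so this snippet is self-contained.
df = pd.DataFrame({'A': [1, 2, 3], 'B': [1.0, 2.0, 3.0],
                   'C': [1j, 2j, 3j], 'D': ['a', 'b', 'c']})

# Map each column name to whether its dtype is numeric.
numeric_flags = {col: is_numeric_dtype(df[col]) for col in df.columns}

print(numeric_flags)
# {'A': True, 'B': True, 'C': True, 'D': False}
```

This effectively replaces a hand-maintained dictionary of dtypes with one computed from the data.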

And for selection, pandas now has select_dtypes:

df.select_dtypes(include=[np.number])
Out: 
   A    B   C
0  1  1.0  1j
1  2  2.0  2j
2  3  3.0  3j
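If only the column names are needed rather than the data, a short sketch along the same lines is shown below. It assumes the string alias 'number' is accepted by include, as an equivalent to np.number, and uses the exclude parameter for the complementary set. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Same frame as in the answer, rebuilt so this snippet is self-contained.
df = pd.DataFrame({'A': [1, 2, 3], 'B': [1.0, 2.0, 3.0],
                   'C': [1j, 2j, 3j], 'D': ['a', 'b', 'c']})

# Names of the numeric columns ('number' is an alias for np.number).
numeric_cols = df.select_dtypes(include='number').columns.tolist()

# Names of everything that is not numeric.
other_cols = df.select_dtypes(exclude=[np.number]).columns.tolist()

print(numeric_cols)  # ['A', 'B', 'C']
print(other_cols)    # ['D']
```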
