How to determine whether a column/variable is numeric or not in Pandas/NumPy?
Question
Is there a better way to determine whether a variable in Pandas and/or NumPy is numeric or not?
I have a self-defined dictionary with dtypes as keys and numeric / not numeric as values.
Recommended answer
You can use np.issubdtype to check whether the dtype is a sub-dtype of np.number. Examples:
np.issubdtype(arr.dtype, np.number) # where arr is a numpy array
np.issubdtype(df['X'].dtype, np.number) # where df['X'] is a pandas Series
This works for NumPy's dtypes but fails for pandas-specific types like pd.Categorical, as Thomas noted. If you are using categoricals, the is_numeric_dtype function from pandas (pandas.api.types.is_numeric_dtype) is a better alternative than np.issubdtype.
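As a rough illustration of the difference, the sketch below runs both checks on a categorical Series. The Series names `ints` and `cats` are made up for this example, and the behaviour of np.issubdtype on a pandas extension dtype is an assumption that may vary between NumPy/pandas versions, so it is wrapped in a try/except rather than relied on.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical example data, not taken from the original answer.
ints = pd.Series([1, 2, 3])
cats = pd.Series(["a", "b", "a"], dtype="category")

# is_numeric_dtype accepts a Series, an array, or a dtype.
print(is_numeric_dtype(ints))  # True
print(is_numeric_dtype(cats))  # False

# np.issubdtype is expected to reject the pandas-specific dtype
# (assumption: it raises TypeError on the installed versions).
try:
    result = np.issubdtype(cats.dtype, np.number)
except TypeError as exc:
    result = f"TypeError: {exc}"
print(result)
```

Note that is_numeric_dtype also treats boolean columns as numeric, whereas np.issubdtype(np.bool_, np.number) is False, so the two checks are not interchangeable for bool data.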
import numpy as np
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3], 'B': [1.0, 2.0, 3.0],
                   'C': [1j, 2j, 3j], 'D': ['a', 'b', 'c']})
df
Out:
   A    B   C  D
0  1  1.0  1j  a
1  2  2.0  2j  b
2  3  3.0  3j  c
df.dtypes
Out:
A         int64
B       float64
C    complex128
D        object
dtype: object
np.issubdtype(df['A'].dtype, np.number)
Out: True
np.issubdtype(df['B'].dtype, np.number)
Out: True
np.issubdtype(df['C'].dtype, np.number)
Out: True
np.issubdtype(df['D'].dtype, np.number)
Out: False
For multiple columns you can use np.vectorize:
is_number = np.vectorize(lambda x: np.issubdtype(x, np.number))
is_number(df.dtypes)
Out: array([ True, True, True, False], dtype=bool)
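Since the question mentions a dictionary with numeric / not numeric as values, a similar per-column lookup could be built directly from df.dtypes. This is only a sketch: it swaps in pandas' is_numeric_dtype instead of np.issubdtype, and the name `numeric_by_column` is made up for illustration.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

df = pd.DataFrame({'A': [1, 2, 3], 'B': [1.0, 2.0, 3.0],
                   'C': [1j, 2j, 3j], 'D': ['a', 'b', 'c']})

# Map each column name to whether its dtype is numeric.
numeric_by_column = {col: is_numeric_dtype(dtype)
                     for col, dtype in df.dtypes.items()}
print(numeric_by_column)
# {'A': True, 'B': True, 'C': True, 'D': False}
```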
And for selection, pandas now has select_dtypes:
df.select_dtypes(include=[np.number])
Out:
   A    B   C
0  1  1.0  1j
1  2  2.0  2j
2  3  3.0  3j
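If only the column names are needed, select_dtypes can be combined with .columns, and its exclude parameter gives the complementary set. A minimal sketch using the same example frame (the variable names `numeric_cols` and `non_numeric_cols` are illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [1.0, 2.0, 3.0],
                   'C': [1j, 2j, 3j], 'D': ['a', 'b', 'c']})

# Names of the numeric columns.
numeric_cols = df.select_dtypes(include=[np.number]).columns.tolist()
# Names of everything else.
non_numeric_cols = df.select_dtypes(exclude=[np.number]).columns.tolist()

print(numeric_cols)      # ['A', 'B', 'C']
print(non_numeric_cols)  # ['D']
```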