Drop non-numeric columns from a pandas DataFrame
Question
In my application I load text files that are structured as follows:
- A first non-numeric column (ID)
- A number of non-numeric columns (strings)
- A number of numeric columns (floats)
The number of non-numeric columns is variable. Currently I load the data into a DataFrame like this:
source = pandas.read_table(inputfile, index_col=0)
I would like to drop all non-numeric columns in one fell swoop, without knowing their names or indices, since this should be possible by reading their dtypes. Is this possible with pandas, or do I have to cook up something on my own?
Recommended answer
To avoid using a private method you can also use select_dtypes, where you can either include or exclude the dtypes you want.
I came across it in another post about exactly the same problem.
Or in your case, specifically:
source.select_dtypes(['number']) or source.select_dtypes([np.number])
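For illustration, here is a minimal sketch of the select_dtypes approach on a small made-up table. The sample data and the column names (id, name, group, x, y) are assumptions standing in for the real input file; only read_table and select_dtypes come from the question and answer.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical tab-separated data standing in for the real input file:
# an ID column, two string columns, and two float columns.
raw = (
    "id\tname\tgroup\tx\ty\n"
    "A1\tfoo\tg1\t1.5\t2.0\n"
    "A2\tbar\tg2\t3.0\t4.5\n"
)

# Same call as in the question; the first column becomes the index.
source = pd.read_table(io.StringIO(raw), index_col=0)

# Keep only the numeric columns, without naming them.
numeric = source.select_dtypes(include=["number"])

# Equivalent selection using the NumPy abstract type.
numeric_np = source.select_dtypes(include=[np.number])

# The complementary selection: everything that is not numeric.
non_numeric = source.select_dtypes(exclude=["number"])

print(list(numeric.columns))  # ['x', 'y']
```

Note that select_dtypes returns a new DataFrame and leaves source unchanged, so assign the result back (source = source.select_dtypes(include=["number"])) if the non-numeric columns should be gone for good.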