Applying select_dtypes to selected columns of a dataframe
Question
I have a few columns that contain both floats and strings. I want to be able to select these columns and apply different masks according to the data type of their values.
I found the select_dtypes() method, but it operates on the entire dataframe, and what I need is to apply it to a selected column. For example:
df['A'].select_dtypes(exclude=[np.number])
Right now, when I try this, I get
AttributeError: 'Series' object has no attribute 'select_dtypes'
To give more detail, let's say I have a dataframe like this:
import numpy as np
import pandas as pd

df = pd.DataFrame([
    [-1, 3, 0],
    [5, 2, 1],
    [-6, 3, 2],
    [7, '<blank>', 3],
    ['<blank>', 2, 4],
    ['<blank>', '<blank>', '<blank>']], columns='A B C'.split())
When I run
df.select_dtypes(exclude=[np.number])
it doesn't raise an error, but nothing happens either, since it did not find any column that contains only a single dtype other than np.number.
In the end, I want to create a mask based on dtype selection, such as
mask= df['A'].select_dtypes(exclude=[np.number])
Note: I need these strings to stay unchanged, because in a later step I will render this dataframe to an HTML table, where the <blank> strings will give me white space.
Recommended answer
You can define a function that attempts a conversion to numeric and then filters according to whether the conversion succeeded:
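def filter_type(s, num=True):
    # Values that cannot be parsed as numbers become NaN.
    s_new = pd.to_numeric(s, errors='coerce')
    if num:
        # Keep only the entries that converted successfully.
        return s[s_new.notnull()]
    else:
        # Keep only the entries that failed to convert (the strings).
        return s[s_new.isnull()]

res = filter_type(df['A'], num=False)
print(res)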
4 <blank>
5 <blank>
Name: A, dtype: object
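Since the question asks for a mask rather than a filtered Series, the same conversion trick can return the boolean Series directly. The following is only a sketch under that assumption: the helper name numeric_mask and the variables mask_a, negative and mask_ab are made up for illustration and do not appear in the original answer.

```python
import pandas as pd

# Same example frame as in the question.
df = pd.DataFrame([
    [-1, 3, 0],
    [5, 2, 1],
    [-6, 3, 2],
    [7, '<blank>', 3],
    ['<blank>', 2, 4],
    ['<blank>', '<blank>', '<blank>']], columns='A B C'.split())


def numeric_mask(s):
    """Hypothetical helper: True where the value can be parsed as a number."""
    return pd.to_numeric(s, errors='coerce').notna()


# Boolean mask for a single column; df itself is left untouched.
mask_a = numeric_mask(df['A'])

# Combine with a numeric condition; NaN comparisons evaluate to False,
# so the '<blank>' rows drop out automatically.
negative = mask_a & (pd.to_numeric(df['A'], errors='coerce') < 0)

# Same idea applied to several selected columns at once.
mask_ab = df[['A', 'B']].apply(numeric_mask)

print(df.loc[~mask_a, 'A'])  # the non-numeric entries of column A
```

Note that pd.to_numeric also parses numeric-looking strings such as '3', and that real NaN values count as non-numeric here. If that distinction matters, an element-wise isinstance check may be a better fit.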