Pandas .loc or .iloc to select columns from a dataset
Question
I have been trying to select a particular set of columns from a dataset, for all rows. I tried something like the following:
train_features = train_df.loc[,[0,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]]
All rows should be included; I only need the columns at the listed positions. Is there a better way to approach this?
Sample data:
age job marital education default housing loan equities contact duration campaign pdays previous poutcome emp.var.rate cons.price.idx cons.conf.idx euribor3m nr.employed y
56 housemaid married basic.4y 1 1 1 1 0 261 1 999 0 2 1.1 93.994 -36.4 3.299552287 5191 1
37 services married high.school 1 0 1 1 0 226 1 999 0 2 1.1 93.994 -36.4 0.743751247 5191 1
56 services married high.school 1 1 0 1 0 307 1 999 0 2 1.1 93.994 -36.4 1.28265179 5191 1
I'm trying to exclude the job, marital, education and y columns from my dataset. The y column is the target variable.
Recommended answer
If you need to select by position, use iloc, with an explicit : as the row selector to keep all rows:
train_features = train_df.iloc[:, [0,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]]
print (train_features)
age default housing loan equities contact duration campaign pdays \
0 56 1 1 1 1 0 261 1 999
1 37 1 0 1 1 0 226 1 999
2 56 1 1 0 1 0 307 1 999
previous poutcome emp.var.rate cons.price.idx cons.conf.idx euribor3m \
0 0 2 1.1 93.994 -36.4 3.299552
1 0 2 1.1 93.994 -36.4 0.743751
2 0 2 1.1 93.994 -36.4 1.282652
nr.employed
0 5191
1 5191
2 5191
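As a minimal sketch of the same idea, the snippet below compares position-based selection with iloc against label-based selection with loc. The small frame is an illustrative subset of the sample data (only seven of the columns), and np.r_ is used here as a convenience for writing a run of consecutive positions; with the full dataset, np.r_[0, 4:19] would expand to the positions 0, 4, 5, ..., 18 from the question.

```python
import numpy as np
import pandas as pd

# Illustrative subset of the sample data (not the full dataset).
train_df = pd.DataFrame({
    "age": [56, 37, 56],
    "job": ["housemaid", "services", "services"],
    "marital": ["married", "married", "married"],
    "education": ["basic.4y", "high.school", "high.school"],
    "default": [1, 1, 1],
    "housing": [1, 0, 1],
    "y": [1, 1, 1],
})

# Position-based: column 0, then columns 4 and 5 (np.r_ expands the slice 4:6).
positions = np.r_[0, 4:6]
by_position = train_df.iloc[:, positions]

# Label-based: the same columns, named explicitly.
by_label = train_df.loc[:, ["age", "default", "housing"]]

# Both selections return the same columns and values.
print(by_position.equals(by_label))
print(by_position.columns.tolist())
```

Selecting by label tends to be more robust than selecting by position if the column order of the input file can change.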
Another solution is to drop the unnecessary columns:
cols= ['job','marital','education','y']
train_features = train_df.drop(cols, axis=1)
print (train_features)
age default housing loan equities contact duration campaign pdays \
0 56 1 1 1 1 0 261 1 999
1 37 1 0 1 1 0 226 1 999
2 56 1 1 0 1 0 307 1 999
previous poutcome emp.var.rate cons.price.idx cons.conf.idx euribor3m \
0 0 2 1.1 93.994 -36.4 3.299552
1 0 2 1.1 93.994 -36.4 0.743751
2 0 2 1.1 93.994 -36.4 1.282652
nr.employed
0 5191
1 5191
2 5191
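Since y is the target variable, one possible follow-up is to keep it in a separate Series while dropping it from the features. The sketch below assumes the same illustrative subset of the sample data as a stand-in for the real dataset; the name train_target is hypothetical. It uses the columns= keyword of drop, which is equivalent to passing axis=1.

```python
import pandas as pd

# Illustrative subset of the sample data (not the full dataset).
train_df = pd.DataFrame({
    "age": [56, 37, 56],
    "job": ["housemaid", "services", "services"],
    "marital": ["married", "married", "married"],
    "education": ["basic.4y", "high.school", "high.school"],
    "default": [1, 1, 1],
    "housing": [1, 0, 1],
    "y": [1, 1, 1],
})

# Keep the target in its own Series (hypothetical name).
train_target = train_df["y"]

# drop returns a new DataFrame; train_df itself is left unchanged.
cols = ["job", "marital", "education", "y"]
train_features = train_df.drop(columns=cols)

print(train_features.columns.tolist())
print(train_target.tolist())
```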