Selecting last n columns and excluding last n columns in dataframe


Question

How do I:

  1. Select the last 3 columns in a dataframe and create a new dataframe?

I tried:

y = dataframe.iloc[:,-3:]

  2. Exclude the last 3 columns and create a new dataframe?

I tried:

X = dataframe.iloc[:,:-3]

Is this correct?

I am getting array dimension errors further in my code and want to make sure this step is correct.

Thanks

Recommended answer

Just do:

y = dataframe[dataframe.columns[-3:]]

This slices the column index and uses the resulting labels to sub-select from the df.

Example:

In [221]:
import numpy as np
import pandas as pd
df = pd.DataFrame(columns=np.arange(10))
df[df.columns[-3:]]

Out[221]:
Empty DataFrame
Columns: [7, 8, 9]
Index: []
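
The example above covers only the selection half of the question. As a minimal sketch, assuming a small made-up frame (the names `n`, `X_iloc`, `y_iloc`, `X_cols` and `y_cols` are illustrative and not from the asker's code), the positional `iloc` slices from the question and the label-based `columns` slices give the same result for both selecting and excluding the last n columns, provided the column labels are unique:

```python
import numpy as np
import pandas as pd

# Illustrative frame: 4 rows, 10 columns labelled 0..9 (made-up data).
df = pd.DataFrame(np.arange(40).reshape(4, 10), columns=np.arange(10))
n = 3

# Last n columns: positional slice vs. label-based slice.
y_iloc = df.iloc[:, -n:]
y_cols = df[df.columns[-n:]]

# Everything except the last n columns.
X_iloc = df.iloc[:, :-n]
X_cols = df[df.columns[:-n]]

print(y_iloc.columns.tolist())  # [7, 8, 9]
print(X_iloc.columns.tolist())  # [0, 1, 2, 3, 4, 5, 6]
print(y_iloc.equals(y_cols), X_iloc.equals(X_cols))  # True True
```

One edge case to keep in mind: with `n = 0`, `-0` is just `0`, so `iloc[:, -0:]` selects every column and `iloc[:, :-0]` selects none.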

I think the issue here is that, because you have taken a slice of the df, pandas may have returned a view rather than a copy, and depending on what the rest of your code does with it, a warning is raised (most likely SettingWithCopyWarning, though the question does not show it). You can make an explicit copy by calling .copy() to remove the warning.

So if we take a copy, assignment only affects the copy and not the original df:

In [15]:
df = pd.DataFrame(np.random.randn(5, 10), columns=np.arange(10))
df

Out[15]:
          0         1         2         3         4         5         6  \
0  0.568284 -1.488447  0.970365 -1.406463 -0.413750 -0.934892 -1.421308   
1  1.186414 -0.417366 -1.007509 -1.620530 -1.322004  0.294540  1.205115   
2 -1.073894 -0.214972  1.516563 -0.705571  0.068666  1.690654 -0.252485   
3  0.923524 -0.856752  0.226294 -0.660085  1.259145  0.400596  0.559028   
4  0.259807  0.135300  1.130347 -0.317305 -1.031875  0.232262  0.709244   

          7         8         9  
0  1.741925 -0.475619 -0.525770  
1  2.137546  0.215665  1.908362  
2  1.180281 -0.144652  0.870887  
3 -0.609804 -0.833186 -1.033656  
4  0.480943  1.971933  1.928037  

In [16]:    
y = df[df.columns[-3:]].copy()
y

Out[16]:
          7         8         9
0  1.741925 -0.475619 -0.525770
1  2.137546  0.215665  1.908362
2  1.180281 -0.144652  0.870887
3 -0.609804 -0.833186 -1.033656
4  0.480943  1.971933  1.928037

In [17]:    
y[y>0] = 0
print(y)
df

          7         8         9
0  0.000000 -0.475619 -0.525770
1  0.000000  0.000000  0.000000
2  0.000000 -0.144652  0.000000
3 -0.609804 -0.833186 -1.033656
4  0.000000  0.000000  0.000000
Out[17]:
          0         1         2         3         4         5         6  \
0  0.568284 -1.488447  0.970365 -1.406463 -0.413750 -0.934892 -1.421308   
1  1.186414 -0.417366 -1.007509 -1.620530 -1.322004  0.294540  1.205115   
2 -1.073894 -0.214972  1.516563 -0.705571  0.068666  1.690654 -0.252485   
3  0.923524 -0.856752  0.226294 -0.660085  1.259145  0.400596  0.559028   
4  0.259807  0.135300  1.130347 -0.317305 -1.031875  0.232262  0.709244   

          7         8         9  
0  1.741925 -0.475619 -0.525770  
1  2.137546  0.215665  1.908362  
2  1.180281 -0.144652  0.870887  
3 -0.609804 -0.833186 -1.033656  
4  0.480943  1.971933  1.928037  

No warning is raised here, and the original df is untouched.
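
Since the question mentions array dimension errors further along in the code, it may help to check shapes at this step. The sketch below is only a guess at one common cause, not something the question confirms: a slice such as `-n:` always returns a two-dimensional DataFrame, while a single position such as `-1` returns a one-dimensional Series. The helper name `split_last_n` is hypothetical.

```python
import numpy as np
import pandas as pd

def split_last_n(frame, n):
    """Return (X, y): all but the last n columns, and the last n columns, as independent copies."""
    if not 0 < n < frame.shape[1]:
        raise ValueError("n must be between 1 and the number of columns - 1")
    X = frame.iloc[:, :-n].copy()
    y = frame.iloc[:, -n:].copy()
    return X, y

df = pd.DataFrame(np.random.randn(5, 10), columns=np.arange(10))
X, y = split_last_n(df, 3)
print(X.shape, y.shape)  # (5, 7) (5, 3)

# A slice keeps two dimensions even for one column; a scalar position does not.
print(df.iloc[:, -1:].shape)  # (5, 1)
print(df.iloc[:, -1].shape)   # (5,)
```

If downstream code expects a one-dimensional target, the mismatch between `(5, 1)` and `(5,)` is worth ruling out before looking elsewhere.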
