What is the correct syntax to swap column values for selected rows in a pandas data frame using just one line?
Question
I am using pandas version 0.14.1 with Python 2.7.5, and I have a data frame with three columns, e.g.:
import pandas as pd
d = {'L': ['left', 'right', 'left', 'right', 'left', 'right'],
'R': ['right', 'left', 'right', 'left', 'right', 'left'],
'VALUE': [-1, 1, -1, 1, -1, 1]}
df = pd.DataFrame(d)
idx = (df['VALUE'] == 1)
The resulting data frame looks like this:
       L      R  VALUE
0   left  right     -1
1  right   left      1
2   left  right     -1
3  right   left      1
4   left  right     -1
5  right   left      1
For rows where VALUE == 1, I would like to swap the contents of the left and right columns, so that all of the "left" values end up under the "L" column and all of the "right" values end up under the "R" column.
Having already defined the idx variable above, I can easily do this in just three more lines by using a temporary variable, as follows:
tmp = df.loc[idx,'L']
df.loc[idx,'L'] = df.loc[idx,'R']
df.loc[idx,'R'] = tmp
However, this seems like really clunky and inelegant syntax to me; surely pandas supports something more succinct? I've noticed that if I swap the column order in the input to the data frame's .loc attribute, I get the following swapped output:
In [2]: print(df.loc[idx,['R','L']])
      R      L
1  left  right
3  left  right
5  left  right
This suggests to me that I should be able to implement the same swap as above by using just the following single line:
df.loc[idx,['L','R']] = df.loc[idx,['R','L']]
However, when I actually try this, nothing happens: the columns remain unswapped. It's as if pandas automatically recognizes that I've put the columns in the wrong order on the right-hand side of the assignment statement and corrects for it. Is there a way to disable this "column order autocorrection" in pandas assignment statements, so that I can implement the swap without creating unnecessary temporary variables?
Recommended answer
One way you could avoid alignment on column names would be to drop down to the underlying array via .values:
In [33]: df
Out[33]:
       L      R  VALUE
0   left  right     -1
1  right   left      1
2   left  right     -1
3  right   left      1
4   left  right     -1
5  right   left      1
In [34]: df.loc[idx,['L','R']] = df.loc[idx,['R','L']].values
In [35]: df
Out[35]:
      L      R  VALUE
0  left  right     -1
1  left  right      1
2  left  right     -1
3  left  right      1
4  left  right     -1
5  left  right      1
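For anyone who wants to try this outside an interactive session, here is a minimal self-contained sketch of the same idea. It assumes a newer pandas than the 0.14.1 used in the question: .to_numpy() was added in pandas 0.24 and plays the same role as .values here. The names aligned and swapped are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'L': ['left', 'right', 'left', 'right', 'left', 'right'],
    'R': ['right', 'left', 'right', 'left', 'right', 'left'],
    'VALUE': [-1, 1, -1, 1, -1, 1],
})
idx = df['VALUE'] == 1

# A DataFrame on the right-hand side is aligned on its column labels
# before assignment, so each value is written back into its own column.
aligned = df.copy()
aligned.loc[idx, ['L', 'R']] = aligned.loc[idx, ['R', 'L']]

# A plain NumPy array carries no labels, so it is assigned by position
# and the two columns really are swapped for the selected rows.
swapped = df.copy()
swapped.loc[idx, ['L', 'R']] = swapped.loc[idx, ['R', 'L']].to_numpy()

print(swapped)
```

The key point is that the "autocorrection" is label alignment rather than something that can be switched off; stripping the labels from the right-hand side is what makes the assignment positional.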