Pandas - conditionally select source column of data for a new column based on row value
Problem description
Is there a pandas function that allows selection from different columns based on a condition? This is analogous to a CASE statement in a SQL Select clause. For example, say I have the following DataFrame:
from pandas import DataFrame

foo = DataFrame(
    [['USA', 1, 2],
     ['Canada', 3, 4],
     ['Canada', 5, 6]],
    columns=('Country', 'x', 'y')
)
I want to select from column 'x' when Country=='USA', and from column 'y' when Country=='Canada', resulting in something like the following:
  Country  x  y  z
0     USA  1  2  1
1  Canada  3  4  4
2  Canada  5  6  6

[3 rows x 4 columns]
Using DataFrame.where's other argument and pandas.concat:
>>> import pandas as pd
>>>
>>> foo = pd.DataFrame([
...     ['USA', 1, 2],
...     ['Canada', 3, 4],
...     ['Canada', 5, 6]
... ], columns=('Country', 'x', 'y'))
>>>
>>> z = foo['x'].where(foo['Country'] == 'USA', foo['y'])
>>> pd.concat([foo['Country'], z], axis=1)
  Country  x
0     USA  1
1  Canada  4
2  Canada  6
If you want z as the column name, specify keys:
>>> pd.concat([foo['Country'], z], keys=['Country', 'z'], axis=1)
  Country  z
0     USA  1
1  Canada  4
2  Canada  6
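The question asks for z as a fourth column next to x and y, which the concat calls above do not produce. A minimal sketch of one way to get there is to assign the result of where straight back onto the frame. The second half shows numpy.select as a possible closer match to a SQL CASE with several branches. The column name z_select and the default value of -1 are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd

foo = pd.DataFrame(
    [['USA', 1, 2], ['Canada', 3, 4], ['Canada', 5, 6]],
    columns=('Country', 'x', 'y'),
)

# Keep 'x' where the condition is True; take the value from 'y' elsewhere.
foo['z'] = foo['x'].where(foo['Country'] == 'USA', foo['y'])

# CASE-like form: conditions are checked in order and the first match wins.
# 'z_select' and default=-1 are hypothetical choices made for this sketch.
conditions = [foo['Country'] == 'USA', foo['Country'] == 'Canada']
choices = [foo['x'], foo['y']]
foo['z_select'] = np.select(conditions, choices, default=-1)

print(foo)
```

With only two branches, where is probably enough. numpy.select tends to read better once there are three or more countries, since each condition pairs with its own source column and unmatched rows fall through to the default.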