pandas: map more than 2 columns to one column


Question

This is an updated version of an earlier question, which dealt with mapping only two columns to a new column.

Now I have three columns that I want to map to a single new column using the same dictionary (returning 0 if there is no matching key in the dictionary).

>> codes = {'2':1,
            '31':1,
            '88':9,
            '99':9}

>> df[['driver_action1','driver_action2','driver_action3']].to_dict()    
{'driver_action1': {0: '1',
  1: '1',
  2: '77',
  3: '77',
  4: '1',
  5: '4',
  6: '2',
  7: '1',
  8: '77',
  9: '99'},
 'driver_action2': {0: '4',
  1: '99',
  2: '99',
  3: '99',
  4: '1',
  5: '2',
  6: '2',
  7: '99',
  8: '99',
  9: '99'},
 'driver_action3': {0: '4',
  1: '99',
  2: '99',
  3: '99',
  4: '1',
  5: '99',
  6: '99',
  7: '99',
  8: '31',
  9: '31'}}

Expected output:

  driver_action1 driver_action2 driver_action3  newcolumn
0              1              4              4          0
1              1             99             99          9
2             77             99             99          9
3             77             99             99          9
4              1              1              1          9
5              4              2             99          1
6              2              2             99          1
7              1             99             99          9
8             77             99             31          1
9             99             99             31          1

I am not sure how to do this with .applymap() or combine_first().

Recommended answer

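Try this (for each row it takes the code of the first column, scanning left to right, whose value is a key in codes, and 0 if none match):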

In [174]: df['new'] = df.stack(dropna=False).map(codes).unstack() \
     ...:               .iloc[:, ::-1].ffill(axis=1) \
     ...:               .iloc[:, -1].fillna(0)
     ...:

In [175]: df
Out[175]:
  driver_action1 driver_action2 driver_action3  new
0              1              4              4  0.0
1              1             99             99  9.0
2             77             99             99  9.0
3             77             99             99  9.0
4              1              1              1  0.0
5              4              2             99  1.0
6              2              2             99  1.0
7              1             99             99  9.0
8             77             99             31  9.0
9             99             99             31  9.0
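For anyone who wants to reproduce the result, here is a minimal self-contained sketch. The DataFrame is rebuilt from the dictionary shown in the question. As an assumption of this sketch, the stack/map/unstack step is swapped for a plain per-column `Series.map`, which should give the same intermediate table on this data; the remaining steps follow the answer as written.

```python
import pandas as pd

codes = {'2': 1, '31': 1, '88': 9, '99': 9}

# Sample data copied from the question.
df = pd.DataFrame({
    'driver_action1': ['1', '1', '77', '77', '1', '4', '2', '1', '77', '99'],
    'driver_action2': ['4', '99', '99', '99', '1', '2', '2', '99', '99', '99'],
    'driver_action3': ['4', '99', '99', '99', '1', '99', '99', '99', '31', '31'],
})

cols = ['driver_action1', 'driver_action2', 'driver_action3']

# Assumption of this sketch: map each column on its own instead of
# stack/map/unstack. Values that are not keys of `codes` become NaN.
mapped = df[cols].apply(lambda s: s.map(codes))

# Remaining steps as in the answer: reverse the column order, forward-fill
# along each row, take the last column, and replace leftover NaN with 0.
df['new'] = mapped.iloc[:, ::-1].ffill(axis=1).iloc[:, -1].fillna(0)

print(df)
```

Selecting the columns explicitly through `cols` keeps the mapping from touching any other columns the real DataFrame might contain.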

Alternative solution:

df['new'] = df.stack(dropna=False).map(codes).unstack().T \
              .apply(lambda x: x[x.first_valid_index()]
                               if x.first_valid_index() is not None else 0)
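The same "first match per row" idea can also be sketched without any reshaping, using a small row-wise helper. The helper name `first_code` and the three-row sample below are hypothetical and only for illustration. Row-wise `apply` is usually slower than the vectorised versions on large frames, but it returns integers directly.

```python
import pandas as pd

codes = {'2': 1, '31': 1, '88': 9, '99': 9}


def first_code(row, mapping, default=0):
    """Return the code of the first value in `row` that is a key of `mapping`.

    Hypothetical helper for this sketch; falls back to `default` when no
    value in the row matches.
    """
    return next((mapping[v] for v in row if v in mapping), default)


# Three rows taken from the question's data (rows 0, 5 and 8).
df = pd.DataFrame({
    'driver_action1': ['1', '4', '77'],
    'driver_action2': ['4', '2', '99'],
    'driver_action3': ['4', '99', '31'],
})

# axis=1 passes each row to the helper as a Series.
df['new'] = df.apply(first_code, axis=1, args=(codes,))
print(df)
```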

Explanation:

Stack, map, and unstack the mapped values:

In [188]: df.stack(dropna=False).map(codes).unstack()
Out[188]:
   driver_action1  driver_action2  driver_action3
0             NaN             NaN             NaN
1             NaN             9.0             9.0
2             NaN             9.0             9.0
3             NaN             9.0             9.0
4             NaN             NaN             NaN
5             NaN             1.0             9.0
6             1.0             1.0             9.0
7             NaN             9.0             9.0
8             NaN             9.0             1.0
9             9.0             9.0             1.0
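To see what each of the three calls contributes, here is a small sketch on a hypothetical two-row frame (the names `toy`, `a` and `b` are made up for illustration). It calls `stack()` without `dropna`, on the assumption that the data has no missing values, as is the case for the sample in the question.

```python
import pandas as pd

codes = {'2': 1, '31': 1, '88': 9, '99': 9}

# Hypothetical toy frame, not from the original post.
toy = pd.DataFrame({'a': ['1', '2'], 'b': ['99', '4']})

# stack: one long Series indexed by (row label, column name).
stacked = toy.stack()

# map: look every value up in the dict; keys that are missing become NaN.
mapped = stacked.map(codes)

# unstack: pivot the column names back out, restoring the original shape.
wide = mapped.unstack()

print(stacked)
print(wide)
```

Because the unmatched cells are NaN, the mapped columns come back as floats, which is why the later steps show 9.0 and 1.0 rather than 9 and 1.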

Reverse the column order and apply a forward fill along the columns axis:

In [190]: df.stack(dropna=False).map(codes).unstack().iloc[:, ::-1].ffill(axis=1)
Out[190]:
   driver_action3  driver_action2  driver_action1
0             NaN             NaN             NaN
1             9.0             9.0             9.0
2             9.0             9.0             9.0
3             9.0             9.0             9.0
4             NaN             NaN             NaN
5             9.0             1.0             1.0
6             9.0             1.0             1.0
7             9.0             9.0             9.0
8             1.0             9.0             9.0
9             1.0             9.0             9.0
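Reversing the columns, forward-filling, and then taking the last column amounts to taking the first non-NaN value in each row. As a possible shortcut (not part of the original answer), a backward fill followed by selecting the first column should give the same Series. The small `mapped` frame below is a hand-written stand-in for the intermediate table.

```python
import numpy as np
import pandas as pd

# Hand-written stand-in for the mapped intermediate table.
mapped = pd.DataFrame({
    'driver_action1': [np.nan, np.nan, 1.0],
    'driver_action2': [np.nan, 1.0, 1.0],
    'driver_action3': [np.nan, 9.0, 9.0],
})

# The answer's route: reverse, forward-fill along rows, take the last column.
via_ffill = mapped.iloc[:, ::-1].ffill(axis=1).iloc[:, -1]

# Shortcut: backward-fill along rows, take the first column.
via_bfill = mapped.bfill(axis=1).iloc[:, 0]

print(via_ffill.equals(via_bfill))
```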

Select the last column and fill the NaNs with 0:

In [191]: df.stack(dropna=False).map(codes).unstack().iloc[:, ::-1].ffill(axis=1).iloc[:, -1].fillna(0)
Out[191]:
0    0.0
1    9.0
2    9.0
3    9.0
4    0.0
5    1.0
6    1.0
7    9.0
8    9.0
9    9.0
Name: driver_action1, dtype: float64
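The result is float64 because NaN forces a float column. If integer codes are wanted, as in the question's expected output, one option is to cast after filling. The sketch below uses a hand-written three-value Series as a stand-in for the selected column.

```python
import numpy as np
import pandas as pd

# Stand-in for the selected column before NaN is filled.
first_match = pd.Series([np.nan, 9.0, 1.0], name='driver_action1')

# Fill NaN first (NaN cannot be cast to int), then cast and rename.
new = first_match.fillna(0).astype(int).rename('new')

print(new)
```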
