Dynamic way to create new columns as a function of existing columns in pandas
Problem description
I'm looking for a more programmatic way of creating multiple new columns as a function of existing columns in a Pandas DataFrame.
I have 14 columns, Level_2 through Level_15. I want to iteratively create 14 new columns that sum columns 2-15, then 3-15, then 4-15, and so on.
Right now my code looks something like this:
cols['2_sum'] = cols.Level_2 + cols.Level_3 + cols.Level_4 + cols.Level_5 + cols.Level_6 + cols.Level_7 + cols.Level_8 + cols.Level_9 + cols.Level_10 + cols.Level_11 + cols.Level_12 + cols.Level_13 + cols.Level_14 + cols.Level_15
cols['3_sum'] = cols.Level_3 + cols.Level_4 + cols.Level_5 + cols.Level_6 + cols.Level_7 + cols.Level_8 + cols.Level_9 + cols.Level_10 + cols.Level_11 + cols.Level_12 + cols.Level_13 + cols.Level_14 + cols.Level_15
cols['4_sum'] = cols.Level_4 + cols.Level_5 + cols.Level_6 + cols.Level_7 + cols.Level_8 + cols.Level_9 + cols.Level_10 + cols.Level_11 + cols.Level_12 + cols.Level_13 + cols.Level_14 + cols.Level_15
Is there a more pandas-idiomatic or Pythonic way to do this?
Thanks!
Recommended answer
Here is an example:
Sample data (assuming numpy is imported as np and pandas as pd):
In [147]: df = pd.DataFrame(np.random.rand(3, 15),
     ...:                   columns=['ID'] + ['Level_{}'.format(x) for x in range(2, 16)])
     ...:

In [148]: df
Out[148]:
         ID   Level_2   Level_3   Level_4   Level_5   Level_6   Level_7   Level_8   Level_9  Level_10  Level_11  \
0  0.851407  0.957810  0.204217  0.848265  0.168324  0.010265  0.191499  0.787552  0.648678  0.424462  0.038888
1  0.354270  0.442843  0.631624  0.081120  0.357300  0.211621  0.177321  0.316312  0.836935  0.445603  0.267165
2  0.998240  0.341875  0.590768  0.475935  0.071915  0.720590  0.041327  0.926167  0.671880  0.516845  0.450720

   Level_12  Level_13  Level_14  Level_15
0  0.465109  0.508491  0.282262  0.848373
1  0.205415  0.399493  0.537186  0.774417
2  0.131734  0.554596  0.253658  0.104193
Solution:
In [149]: for n in range(15, 1, -1):
     ...:     df['{}_sum'.format(15-n+2)] = df.filter(regex=r'Level_\d+').iloc[:, :n].sum(1)
     ...:
Result:
In [150]: df
Out[150]:
         ID   Level_2   Level_3   Level_4   Level_5   Level_6   Level_7   Level_8   Level_9  Level_10  ...  \
0  0.851407  0.957810  0.204217  0.848265  0.168324  0.010265  0.191499  0.787552  0.648678  0.424462  ...
1  0.354270  0.442843  0.631624  0.081120  0.357300  0.211621  0.177321  0.316312  0.836935  0.445603  ...
2  0.998240  0.341875  0.590768  0.475935  0.071915  0.720590  0.041327  0.926167  0.671880  0.516845  ...

      6_sum     7_sum     8_sum     9_sum    10_sum    11_sum    12_sum    13_sum    14_sum    15_sum
0  4.745067  4.279958  4.241070  3.816608  3.167931  2.380379  2.188880  2.178615  2.010292  1.162027
1  3.973259  3.767844  3.500679  3.055076  2.218140  1.901828  1.724508  1.512887  1.155587  1.074468
2  4.939755  4.808021  4.357301  3.840456  3.168576  2.242409  2.201082  1.480492  1.408577  0.932643

[3 rows x 29 columns]
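Note that the loop in the answer slices with .iloc[:, :n], which keeps the first n Level columns. In the output above, 15_sum for row 0 is therefore Level_2 + Level_3 (0.957810 + 0.204217 = 1.162027), not Level_15 alone. If the goal is the one stated in the question, where k_sum is Level_k + ... + Level_15, a minimal sketch might look like the following. The names level_cols and suffix_sums and the seeded random generator are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Illustrative sample frame with the same layout as the answer's:
# an ID column followed by Level_2 .. Level_15. The seed is arbitrary.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.random((3, 15)),
    columns=['ID'] + ['Level_{}'.format(x) for x in range(2, 16)],
)

# Hypothetical helper list: the Level columns in ascending order.
level_cols = ['Level_{}'.format(k) for k in range(2, 16)]

# k_sum = Level_k + Level_(k+1) + ... + Level_15
for k in range(2, 16):
    df['{}_sum'.format(k)] = df[level_cols[k - 2:]].sum(axis=1)

# Loop-free variant: reverse the column order, take a running sum along
# each row, then restore the order. Column k then holds Level_k..Level_15.
suffix_sums = df[level_cols].iloc[:, ::-1].cumsum(axis=1).iloc[:, ::-1]
suffix_sums.columns = ['{}_sum'.format(k) for k in range(2, 16)]
```

On data without missing values, both variants should give the same numbers. The cumulative-sum form avoids the Python-level loop, while the explicit loop stays closer to the code in the question.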