How to iterate over rows efficiently in a pandas DataFrame using Python

Views: 99. Published: 2020/10/17 2:48:27. Tags: python-3.x, pandas, dataframe, iteration


Problem description

I have a DataFrame that looks like this:

A         B       C
13.06   12.95   -0.11
92.56   104.63  12.07
116.49  219.27  102.78
272.11  487.26  215.15
300.11  780.75  480.64

There are about 1 million records.

I want to create a column D, which is calculated as follows:

The first value of column D is 0, and then (using spreadsheet-style row numbers, where row 2 is the first data row):

D3 = (D2 + 1) * C3 / B3

D4 = (D3 + 1) * C4 / B4

Each value in column D depends on the previous value in column D.

Here is the result:

D
0
0.115358884
0.52281017
0.672397915
1.02955022
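
To make the recurrence concrete, here is a minimal plain-Python sketch that applies the formula to the five sample rows. It is not part of the original question, and the function name `recurrence` is made up for illustration. Column A is left out because the formula never uses it.

```python
# Sample rows from the question (columns B and C only).
B = [12.95, 104.63, 219.27, 487.26, 780.75]
C = [-0.11, 12.07, 102.78, 215.15, 480.64]


def recurrence(b, c):
    """Return D where D[0] = 0 and D[i] = (D[i-1] + 1) * c[i] / b[i]."""
    d = [0.0]
    for i in range(1, len(b)):
        # Each new value needs the value computed just before it.
        d.append((d[-1] + 1) * c[i] / b[i])
    return d


D = recurrence(B, C)
print(D)
```

The printed values should agree with the expected column D above to within rounding.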

I can solve it with a for loop and loc, but it takes a lot of time. Is there a more efficient, Pythonic way to solve it?

Solution

Recursive calculations like this are not vectorizable, so numba is used to improve performance:

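import numpy as np
from numba import jit

# nopython=True compiles the loop to machine code
@jit(nopython=True)
def f(a, b, c):
    # a is only used for its shape; the formula needs b and c
    d = np.empty(a.shape)
    d[0] = 0
    for i in range(1, a.shape[0]):
        d[i] = (d[i-1] + 1) * c[i] / b[i]
    return d

df['D'] = f(df['A'].to_numpy(), df['B'].to_numpy(), df['C'].to_numpy())
print(df)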
        A       B       C         D
0   13.06   12.95   -0.11  0.000000
1   92.56  104.63   12.07  0.115359
2  116.49  219.27  102.78  0.522810
3  272.11  487.26  215.15  0.672398
4  300.11  780.75  480.64  1.029550
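
As a self-contained variant, the sketch below rebuilds the sample DataFrame and still runs when numba is not installed. This is an illustration under assumptions, not the answer's exact code: the function name `running_d` and the fallback decorator are made up here, and the function takes only `b` and `c` because column A does not appear in the formula. Without numba the loop runs as ordinary Python, which is slower but gives the same values.

```python
import numpy as np
import pandas as pd

try:
    from numba import njit  # compiles the loop to machine code when available
except ImportError:
    def njit(func):
        # Fallback for this sketch: leave the function as plain Python.
        return func


@njit
def running_d(b, c):
    # d[0] = 0, then d[i] = (d[i-1] + 1) * c[i] / b[i]
    d = np.empty(b.shape[0])
    d[0] = 0.0
    for i in range(1, b.shape[0]):
        d[i] = (d[i - 1] + 1) * c[i] / b[i]
    return d


# Sample data from the question.
df = pd.DataFrame({
    "A": [13.06, 92.56, 116.49, 272.11, 300.11],
    "B": [12.95, 104.63, 219.27, 487.26, 780.75],
    "C": [-0.11, 12.07, 102.78, 215.15, 480.64],
})

# Pass plain NumPy arrays, not Series, into the compiled function.
df["D"] = running_d(df["B"].to_numpy(), df["C"].to_numpy())
print(df)
```

Passing `to_numpy()` arrays keeps the per-row work outside pandas indexing, which is the part that makes a `loc`-based loop slow.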
