How to create a new column in a dataframe, which will be a function of other columns and conditionals, without iterating over the rows with a for loop?

Views: 38  Posted: 2020/5/24 3:54:53  Tags: python, pandas

Problem description

I have a relatively large data frame (8737 rows and 16 columns of mixed types: strings, integers, booleans, etc.) and I want to create a new column based on an equation and some conditionals. Basically, I want to iterate over one particular column, take its values and, after some multiplications, sums, etc., compute a new value, which I then check against some conditions (>= or < a set value). If it satisfies the conditions, I need to keep the output of the calculation; otherwise I assign a fixed value.

I am doing that by looping over the entire dataset with a for loop, which takes a huge amount of time. I am quite new to Python and couldn't find a solution to a similar problem online, other than ways of altering existing columns without a for loop.

Let's say, for the sake of simplicity, that I have this data frame called df_test:

          A         B         C          D    S
0  0.001568  0.321316 -0.269841   3.232037  5.0
1  1.926186 -1.111863 -0.387165   5.541699  NaN
2  2.110923 -0.403940 -0.029895  -9.688968  NaN
3  0.609391  1.697205 -1.827488  -1.273713  NaN
4 -0.577739  0.394475 -1.524400  16.505185  NaN
5  0.456884 -1.238733  0.453586  -4.868735  NaN

where S is the column I need to calculate, starting from a set value. Each next value of S should be the previous value of S plus some calculation, like so:

df_test.S[1] = df_test.S[0] + df_test.D[1] * abs(df_test.C[1]) * 0.5

Then this value should be checked against a conditional: if it is greater than or equal to, for example, 10, then assign 10 to it (instead of the calculated value), and if it is less than or equal to 5, then assign 5 to it.
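
To make the rule concrete, here is a minimal sketch of a single update step in plain Python. The function name next_s and the lower/upper parameters are illustrative names that do not appear in my real code; the bounds 5 and 10 are the example values mentioned above.

```python
def next_s(prev_s, c, d, lower=5.0, upper=10.0):
    """Return the next S from the previous S and the current row's C and D."""
    candidate = prev_s + d * abs(c) * 0.5
    # Keep the calculated value only while it lies between the two bounds;
    # otherwise fall back to the nearer bound.
    return min(max(candidate, lower), upper)


# Row 1 of df_test: previous S = 5.0, C = -0.387165, D = 5.541699
s1 = next_s(5.0, -0.387165, 5.541699)
```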

I use a for loop over the data set and run the equation I need for every element. Basically it works like this:

for i in range(1, df_test.shape[0]):
    df_test.S[i] = df_test.S[i-1] + df_test.D[i] * abs(df_test.C[i]) * 0.5
    if df_test.S[i] < 5:
        df_test.S[i] = 5
    elif df_test.S[i] > 10:
        df_test.S[i] = 10

For 8737 rows, this code takes around 20 minutes to complete.
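
For comparison, one possible direction is sketched below. It is not a fully loop-free solution: because each S depends on the previous, already-clamped S, the recurrence is still evaluated in order. The sketch assumes that most of the time goes into element-wise DataFrame indexing, so it vectorizes the per-row increment with NumPy and runs the remaining loop over a plain array. The helper name clamped_running_sum and its parameters are hypothetical, and the sample frame only reproduces the C and D columns shown above.

```python
import numpy as np
import pandas as pd


def clamped_running_sum(c, d, start, lower=5.0, upper=10.0):
    """Compute S[i] = clamp(S[i-1] + 0.5 * D[i] * |C[i]|, lower, upper)."""
    # The increment does not depend on S, so it can be computed for all rows at once.
    step = 0.5 * np.asarray(d, dtype=float) * np.abs(np.asarray(c, dtype=float))
    out = np.empty(len(step), dtype=float)
    out[0] = start
    # Each value depends on the previous clamped value, so this part stays
    # sequential, but it touches only a NumPy array, not the DataFrame.
    for i in range(1, len(step)):
        out[i] = min(max(out[i - 1] + step[i], lower), upper)
    return out


# Sample data copied from the df_test table in the question (C and D only).
df_test = pd.DataFrame({
    "C": [-0.269841, -0.387165, -0.029895, -1.827488, -1.524400, 0.453586],
    "D": [3.232037, 5.541699, -9.688968, -1.273713, 16.505185, -4.868735],
})
df_test["S"] = clamped_running_sum(df_test["C"], df_test["D"], start=5.0)
```

The whole column is assigned in one step at the end, so the DataFrame is written to once rather than once per row.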

If you need any clarification, please ask. Thank you in advance.
