Applying my custom function to a data frame python

Views: 115 Published: 2020/5/24 2:43:15 Tags: python python-3.x pandas dataframe binning

Question

I have a dataframe with a column called Signal. I want to add a new column to that dataframe by applying a custom function I've built. I'm very new at this, and I seem to be having trouble passing the values I get out of a dataframe column into a function, so any help with my syntax errors or reasoning would be greatly appreciated!

Signal
3.98
3.78
-6.67
-17.6
-18.05
-14.48
-12.25
-13.9
-16.89
-13.3
-13.19
-18.63
-26.36
-26.23
-22.94
-23.23
-15.7

Here is my simple function:

def slope_test(x):
    if x >0 and x<20:
        return 'Long'
    elif x<0 and x>-20:
        return 'Short'
    else:
        return 'Flat'

I keep getting this error: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Here is the code I've tried:

data['Position'] = data.apply(slope_test(data['Signal']))

And also:

data['Position'] = data['Signal'].apply(slope_test(data['Signal']))

Recommended answer

You can use numpy.select for a vectorised solution:

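import numpy as np

# Strict bounds, matching 0 < x < 20 and -20 < x < 0 in slope_test.
# pandas 1.3+ takes inclusive='neither'; older versions used inclusive=False.
conditions = [df['Signal'].between(0, 20, inclusive='neither'),
              df['Signal'].between(-20, 0, inclusive='neither')]

values = ['Long', 'Short']

# Rows matching neither condition fall back to the default, 'Flat'.
df['Cat'] = np.select(conditions, values, 'Flat')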

Explanation

You are attempting to perform operations on a series as if it were a scalar. This won't work, for the reason explained in your error: a comparison such as x > 0 on a whole series yields a series of booleans, which has no single truth value for if to test. In addition, your call to pd.Series.apply is incorrect. This method takes a function as its input, not the result of calling that function. Therefore, you can simply use df['Signal'].apply(slope_test).
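As a minimal sketch of the point above, the snippet below reproduces the error and then applies the corrected call. It reuses slope_test and the column names from the question; the five-row sample is just an excerpt of the posted Signal values, and the raised flag is an illustrative variable, not part of the original code.

```python
import pandas as pd


def slope_test(x):
    # Same logic as the function in the question; x must be a scalar.
    if x > 0 and x < 20:
        return 'Long'
    elif x < 0 and x > -20:
        return 'Short'
    else:
        return 'Flat'


# A few of the Signal values from the question.
data = pd.DataFrame({'Signal': [3.98, 3.78, -6.67, -17.6, -26.36]})

# Calling slope_test on the whole column makes `and` ask for the truth
# value of a boolean Series, which raises the ValueError from the question.
try:
    slope_test(data['Signal'])
    raised = False
except ValueError:
    raised = True

# Passing the function itself lets pandas call it once per scalar value.
data['Position'] = data['Signal'].apply(slope_test)
print(data)
```

Note that slope_test is passed without parentheses, so pandas receives the function object rather than the result of calling it on the column.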

But pd.Series.apply is a glorified, inefficient loop. You should utilise the vectorised functionality available with the NumPy arrays underlying your Pandas dataframe. In fact, this is a good reason for using Pandas in the first place.
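To illustrate that the two approaches agree, here is a small sketch that computes the labels both ways on a made-up sample and compares them. It assumes plain boolean masks in place of between, which express the same strict bounds and do not depend on the pandas version; the sample values, including the boundary cases 0.0 and 20.0, are chosen only for illustration.

```python
import numpy as np
import pandas as pd


def slope_test(x):
    # Scalar version from the question, used here only for comparison.
    if x > 0 and x < 20:
        return 'Long'
    elif x < 0 and x > -20:
        return 'Short'
    else:
        return 'Flat'


# Illustrative sample, including the boundary values 0.0 and 20.0.
df = pd.DataFrame({'Signal': [3.98, -6.67, -18.05, -26.36, 0.0, 20.0]})
signal = df['Signal']

# Each mask is computed for the whole column at once.
conditions = [(signal > 0) & (signal < 20),
              (signal < 0) & (signal > -20)]
values = ['Long', 'Short']

# Vectorised labelling; rows matching no condition get 'Flat'.
df['Cat'] = np.select(conditions, values, default='Flat')

# Element-by-element labelling with apply, for comparison.
df['Position'] = signal.apply(slope_test)

print(df)
print((df['Cat'] == df['Position']).all())
```

On a column this small the speed difference is negligible; the vectorised form matters as the number of rows grows.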
