How to run a function on each row in a DataFrame and append the result to a new DataFrame


Question

NB: My code runs if copied as-is.

I wrote a simple script to backtest cryptocurrencies using the Poloniex API.

First I request the data from the API and turn it into a dataframe, data.

Then I take the columns I want and make a new dataframe called df.

A function trade must then be run on each row in df. Simply put, it buys if the price is above the rolling mean and sells if it is below; the results are then saved in log.

I am having trouble applying this function to each row in df.

I had great success using the line log = df.apply(lambda x: trade(x['date'], x['close'], x['MA']), axis=1), but surprisingly it works when BTC_ETH is used in the API call and not for others, e.g. BTC_FCT or BTC_DOGE, despite the data being identical in form. Using ETH results in a DataFrame (which is what I want); DOGE and FCT produce a Series.

First question: how can I run my trade function on each row and create a new dataframe, log, with the results?

Bonus question: even though the data types are the same, why does it work for ETH but not for DOGE/FCT?

import numpy as np
from pandas import Series, DataFrame
import pandas as pd

API = 'https://poloniex.com/public?command=returnChartData&currencyPair=BTC_FCT&start=1435699200&end=9999999999&period=86400'
data = pd.read_json(API)

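# Build df directly from the columns we need. pd.rolling_mean() has been
# removed from pandas; Series.rolling(30).mean() is the current equivalent.
df = pd.DataFrame({
    'date': data.date,
    'close': data.close,
    'MA': data.close.rolling(30).mean(),
})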

df = df.truncate(before=29)

def print_full(x):
    pd.set_option('display.max_rows', len(x))
    print(x)
    pd.reset_option('display.max_rows')

log = pd.DataFrame(columns = ['Date', 'type', 'profit', 'port_value'])
port = {'coin': 0, 'BTC':1}

def trade(date, close, MA):

    if MA < close and port['coin'] == 0 :

        coins_bought = port['BTC']/MA

        port['BTC'] = 0
        port['coin'] = coins_bought

        d = {'Date':date, 'type':'buy', 'coin_value': port['coin'], 'btc_value':port['BTC']}
        return pd.Series(d) 

    elif MA > close and port['BTC'] == 0 :

        coins_sold = port['coin']*MA

        port['coin'] = 0
        port['BTC'] = coins_sold

        d = {'Date':date, 'type':'sell', 'coin_value': port['coin'], 'btc_value':port['BTC']}
        print()
        return pd.Series(d) 

log = df.apply(lambda x: trade(x['date'], x['close'], x['MA']), axis=1)

log = log.dropna()

print_full(log)

Edit:

I solved the problem by appending the dicts to a list and then using the DataFrame.from_dict() method to create the log dataframe. My code, just to clarify:

data_list = []  # one dict per executed trade

def trade(date, close, MA):

    if MA < close and port['coin'] == 0 :
        # Price is above the rolling mean and we hold no coins: buy.
        coins_bought = port['BTC']/MA

        port['BTC'] = 0
        port['coin'] = coins_bought

        d = {'Date':date, 'type':'buy', 'coin_value': port['coin'], 'btc_value':port['BTC']}
        data_list.append(d)

    elif MA > close and port['BTC'] == 0 :
        # Price is below the rolling mean and we hold no BTC: sell.
        coins_sold = port['coin']*MA

        port['coin'] = 0
        port['BTC'] = coins_sold

        d = {'Date':date, 'type':'sell', 'coin_value': port['coin'], 'btc_value':port['BTC']}
        data_list.append(d)


# trade() returns nothing here; it is called only to fill data_list.
df.apply(lambda x: trade(x['date'], x['close'], x['MA']), axis=1)

for key,value in port.items():
    print(key, value )

# from_dict() returns a new DataFrame, so keep the result.
log = pd.DataFrame.from_dict(data_list)


Recommended answer

The problem is that trade does not always return a value, which confuses pandas: when neither branch runs, the function returns None instead of a Series, so apply receives a mix of Series and None. Try this:

import numpy as np
from pandas import Series, DataFrame
import pandas as pd

API = 'https://poloniex.com/public?command=returnChartData&currencyPair=BTC_FCT&start=1435699200&end=9999999999&period=86400'
data = pd.read_json(API)

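# Build df directly from the columns we need. pd.rolling_mean() has been
# removed from pandas; Series.rolling(30).mean() is the current equivalent.
df = pd.DataFrame({
    'date': data.date,
    'close': data.close,
    'MA': data.close.rolling(30).mean(),
})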

df = df.truncate(before=29)

def print_full(x):
    pd.set_option('display.max_rows', len(x))
    print(x)
    pd.reset_option('display.max_rows')

log = pd.DataFrame(columns = ['Date', 'type', 'profit', 'port_value'])
port = {'coin': 0, 'BTC':1}

def trade(date, close, MA):
    d = {'Date': date, 'type':'', 'coin_value': np.nan, 'btc_value': np.nan}

    if MA < close and port['coin'] == 0 :
        coins_bought = port['BTC']/MA
        port['BTC'] = 0
        port['coin'] = coins_bought
        d['type'] = 'buy'
        d['coin_value'] = port['coin']
        d['btc_value'] = port['BTC']

    elif MA > close and port['BTC'] == 0 :
        coins_sold = port['coin']*MA
        port['coin'] = 0
        port['BTC'] = coins_sold
        d['type'] = 'sell'
        d['coin_value'] = port['coin']
        d['btc_value'] = port['BTC']

    return pd.Series(d)

log = df.apply(lambda x: trade(x['date'], x['close'], x['MA']), axis=1)

log = log.dropna()

print_full(log)
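
For readers who cannot reach the API, the idea of always returning the same keys can be tried on a few made-up rows. This is a minimal sketch, not the asker's script: the prices, the MA values, the classify function and its labels are all invented for illustration, and it deliberately leaves out the port bookkeeping so that the function has no side effects.

```python
import numpy as np
import pandas as pd

# Made-up rows standing in for the API data (assumption, not real prices).
toy = pd.DataFrame({
    "date": pd.date_range("2016-01-01", periods=4, freq="D"),
    "close": [3.0, 3.0, 1.0, 2.0],
    "MA": [2.0, 3.0, 2.0, 1.5],
})

def classify(row):
    # Start from a complete record so every row yields the same keys.
    d = {"Date": row["date"], "type": np.nan}
    if row["MA"] < row["close"]:
        d["type"] = "above_MA"
    elif row["MA"] > row["close"]:
        d["type"] = "below_MA"
    # A Series is returned on every path, including when close == MA.
    return pd.Series(d)

labels = toy.apply(classify, axis=1)

# Rows where nothing happened carry NaN and can be dropped afterwards.
labels = labels.dropna()
print(labels)
```

Because every row returns a Series with the same index, apply has a consistent shape to assemble into a DataFrame, and dropna() then removes the rows that were only placeholders.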

However, as I mentioned in the comments, passing a function with side effects to apply is not a good idea according to the documentation, and in fact I think it may not produce the correct result in your case.
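
One way to avoid side effects inside apply is to walk the rows with an ordinary loop and keep the portfolio in local variables. The sketch below assumes the same buy/sell rules as trade in the question (including trading at the MA price); the backtest function, its start_btc parameter and the sample rows are hypothetical and only serve the illustration.

```python
import pandas as pd

def backtest(prices, start_btc=1.0):
    """Walk the rows in order; return the trade log and the final portfolio."""
    port = {"coin": 0.0, "BTC": start_btc}
    records = []
    for row in prices.itertuples(index=False):
        if row.MA < row.close and port["coin"] == 0:
            # Price above the rolling mean and no coins held: buy.
            port["coin"], port["BTC"] = port["BTC"] / row.MA, 0.0
            kind = "buy"
        elif row.MA > row.close and port["BTC"] == 0:
            # Price below the rolling mean and no BTC held: sell.
            port["BTC"], port["coin"] = port["coin"] * row.MA, 0.0
            kind = "sell"
        else:
            continue  # no trade on this row, so nothing is logged
        records.append({"Date": row.date, "type": kind,
                        "coin_value": port["coin"], "btc_value": port["BTC"]})
    log = pd.DataFrame(records,
                       columns=["Date", "type", "coin_value", "btc_value"])
    return log, port

# Made-up rows (assumption); in the question, df would be passed instead.
sample = pd.DataFrame({
    "date": pd.date_range("2016-01-01", periods=5, freq="D"),
    "close": [3.0, 4.0, 1.0, 0.5, 5.0],
    "MA": [2.0, 3.5, 2.5, 0.75, 2.75],
})

log, final_port = backtest(sample)
print(log)
print(final_port)
```

Since only executed trades are appended, the resulting log needs no dropna(), and each call to backtest starts from a fresh portfolio rather than a module-level dict.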
