How to run a function on each row in DataFrame and append the result to a new DataFrame
Problem description
NB: My code runs if copied.
I wrote a simple script to backtest cryptocurrencies using the Poloniex API.
First I request the data from the API and turn it into a DataFrame called data.
Then I take the columns I want and make a new DataFrame called df.
A function, trade, must then be run on each row of df. Simply put, it buys if the price is above the rolling mean and sells if it is below; this data is then saved in log.
I am having trouble applying this function to each row of df.
I had great success using the line log = df.apply(lambda x: trade(x['date'], x['close'], x['MA']), axis=1), but surprisingly it works when BTC_ETH is used in the API call and not for others, i.e. BTC_FCT or BTC_DOGE, despite the data being identical in form. Using ETH results in the creation of a DataFrame (which is what I want); DOGE and FCT create a Series.
First question: how can I run my trade function on each row and create a new DataFrame, log, with the results?
Bonus question: even though the data types are the same, why does it work for ETH but not for DOGE/FCT?
import numpy as np
from pandas import Series, DataFrame
import pandas as pd

API = 'https://poloniex.com/public?command=returnChartData&currencyPair=BTC_FCT&start=1435699200&end=9999999999&period=86400'
data = pd.read_json(API)

df = pd.DataFrame(columns = {'date','close','MA'})
df.MA = pd.rolling_mean(data.close, 30)
df.close = data.close
df.date = data.date
df = df.truncate(before=29)

def print_full(x):
    pd.set_option('display.max_rows', len(x))
    print(x)
    pd.reset_option('display.max_rows')

log = pd.DataFrame(columns = ['Date', 'type', 'profit', 'port_value'])
port = {'coin': 0, 'BTC':1}

def trade(date, close, MA):
    if MA < close and port['coin'] == 0 :
        coins_bought = port['BTC']/MA
        port['BTC'] = 0
        port['coin'] = coins_bought
        d = {'Date':date, 'type':'buy', 'coin_value': port['coin'], 'btc_value':port['BTC']}
        return pd.Series(d)
    elif MA > close and port['BTC'] == 0 :
        coins_sold = port['coin']*MA
        port['coin'] = 0
        port['BTC'] = coins_sold
        d = {'Date':date, 'type':'sell', 'coin_value': port['coin'], 'btc_value':port['BTC']}
        print()
        return pd.Series(d)

log = df.apply(lambda x: trade(x['date'], x['close'], x['MA']), axis=1)
log = log.dropna()
print_full(log)
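A note for readers on a current pandas release: the top-level pd.rolling_mean function used above was removed in later versions, where the rolling mean is computed with Series.rolling(...).mean() instead. The sketch below shows the same three-column frame built that way. The prices and the window of 3 are made up for illustration and stand in for the API data and the 30-day window.

```python
import pandas as pd

# Made-up daily closes standing in for the API response (assumption).
data = pd.DataFrame({
    "date": pd.date_range("2015-07-01", periods=6, freq="D"),
    "close": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

window = 3  # the question uses 30; shortened here to fit the toy data

# Build the frame in one step instead of assigning columns to an empty frame.
df = pd.DataFrame({
    "date": data["date"],
    "close": data["close"],
    "MA": data["close"].rolling(window).mean(),
})

# The first window - 1 rows have no complete window, so their MA is NaN.
df = df.truncate(before=window - 1)
print(df)
```

Dropping the first window - 1 rows mirrors the truncate(before=29) call in the question.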
EDIT:
I solved the problem by appending the dicts to a list and then using the from_dict() method to create the log DataFrame. My code, just to clarify:
data_list = []  # one dict per executed trade

def trade(date, close, MA):#, port):
    #d = {'Data': close}
    #test_log = test_log.append(d, ignore_index=True)
    if MA < close and port['coin'] == 0 :
        coins_bought = port['BTC']/MA
        port['BTC'] = 0
        port['coin'] = coins_bought
        d = {'Date':date, 'type':'buy', 'coin_value': port['coin'], 'btc_value':port['BTC']}
        data_list.append(d)
        #return pd.Series(d)
    elif MA > close and port['BTC'] == 0 :
        coins_sold = port['coin']*MA
        port['coin'] = 0
        port['BTC'] = coins_sold
        d = {'Date':date, 'type':'sell', 'coin_value': port['coin'], 'btc_value':port['BTC']}
        data_list.append(d)
        #return pd.Series(d)

df.apply(lambda x: trade(x['date'], x['close'], x['MA']), axis=1)

for key,value in port.items():
    print(key, value )

# from_dict returns a new DataFrame, so the result has to be assigned
log = pd.DataFrame.from_dict(data_list)
Recommended answer
The problem is that you are not always returning a value in trade, which is confusing pandas. Try this:
import numpy as np
from pandas import Series, DataFrame
import pandas as pd

API = 'https://poloniex.com/public?command=returnChartData&currencyPair=BTC_FCT&start=1435699200&end=9999999999&period=86400'
data = pd.read_json(API)

df = pd.DataFrame(columns = {'date','close','MA'})
df.MA = pd.rolling_mean(data.close, 30)
df.close = data.close
df.date = data.date
df = df.truncate(before=29)

def print_full(x):
    pd.set_option('display.max_rows', len(x))
    print(x)
    pd.reset_option('display.max_rows')

log = pd.DataFrame(columns = ['Date', 'type', 'profit', 'port_value'])
port = {'coin': 0, 'BTC':1}

def trade(date, close, MA):
    # Start from a placeholder record so every call returns a Series with the same keys
    d = {'Date': date, 'type':'', 'coin_value': np.nan, 'btc_value': np.nan}
    if MA < close and port['coin'] == 0 :
        coins_bought = port['BTC']/MA
        port['BTC'] = 0
        port['coin'] = coins_bought
        d['type'] = 'buy'
        d['coin_value'] = port['coin']
        d['btc_value'] = port['BTC']
    elif MA > close and port['BTC'] == 0 :
        coins_sold = port['coin']*MA
        port['coin'] = 0
        port['BTC'] = coins_sold
        d['type'] = 'sell'
        d['coin_value'] = port['coin']
        d['btc_value'] = port['BTC']
    return pd.Series(d)

log = df.apply(lambda x: trade(x['date'], x['close'], x['MA']), axis=1)
# Rows where no trade happened still hold NaN values and are dropped here
log = log.dropna()
print_full(log)
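To see why the return value matters, here is a small sketch on a toy frame (the data and the helper names always_series and always_none are invented for illustration). When the function returns a Series for every row, apply expands the results into a DataFrame; when it returns None, apply produces a Series. One plausible reading of the ETH versus FCT/DOGE difference is that the outcome depends on what the early rows return, though that is an assumption and may vary with the data and the pandas version.

```python
import pandas as pd

# Toy prices; not taken from the API (assumption for illustration).
df = pd.DataFrame({"close": [1.0, 2.0, 3.0], "MA": [1.5, 1.5, 1.5]})

def always_series(row):
    # Returns a Series with the same keys for every row.
    return pd.Series({"above": row["close"] > row["MA"],
                      "gap": row["close"] - row["MA"]})

def always_none(row):
    # Mimics a code path that falls through without a return statement.
    return None

expanded = df.apply(always_series, axis=1)
collapsed = df.apply(always_none, axis=1)

print(type(expanded).__name__, list(expanded.columns))
print(type(collapsed).__name__)
```

Returning a record with the same keys on every call, as the trade function above now does, keeps the result in the first category.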
However, as I mentioned in the comments, passing a function with side effects to apply is not a good idea according to the documentation, and in fact I think it may not produce the correct result in your case.
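One way to avoid side effects inside apply is to walk the rows in an explicit loop and keep the portfolio state in local variables. The following is a minimal sketch under that assumption, not the original author's method: the backtest function, its start_btc parameter, and the sample prices are all hypothetical. It keeps the trading rule from the question, including pricing trades at the moving average.

```python
import pandas as pd

def backtest(df, start_btc=1.0):
    """Walk rows in order and return one log row per executed trade."""
    coin, btc = 0.0, start_btc
    records = []
    for row in df.itertuples(index=False):
        if row.MA < row.close and coin == 0:
            # Price above the rolling mean: convert all BTC into coins.
            coin, btc = btc / row.MA, 0.0
            kind = "buy"
        elif row.MA > row.close and btc == 0:
            # Price below the rolling mean: convert all coins back to BTC.
            coin, btc = 0.0, coin * row.MA
            kind = "sell"
        else:
            continue  # no trade on this row, so nothing is logged
        records.append({"Date": row.date, "type": kind,
                        "coin_value": coin, "btc_value": btc})
    return pd.DataFrame(records,
                        columns=["Date", "type", "coin_value", "btc_value"])

# Made-up prices for illustration only.
prices = pd.DataFrame({
    "date": pd.date_range("2020-01-01", periods=4, freq="D"),
    "close": [2.0, 1.0, 1.0, 3.0],
    "MA": [1.0, 2.0, 2.0, 2.0],
})
log = backtest(prices)
print(log)
```

Because the state lives inside backtest, each call starts from a fresh portfolio, and the log contains only the rows where a trade happened, so no dropna step is needed.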