Looping through a pandas DataFrame


Problem description

I am trying to back-test an investment strategy. I am having trouble looping through the DataFrame to "re-create" how the strategy would have performed starting 15 years ago. When I try to loop through df['Average_Diff'], I keep getting the error "list indices must be integers or slices, not numpy.float64". I had struggled with the NaN values that appear at the beginning of the column because of how the values for ['Average_Diff'] are calculated, but once I fixed that I ran into this other problem. So how can I loop through df['Average_Diff'] to create the "Buy" or "Sell" signal, and also loop through to indicate whether I'm in or out of the market based on those signals?

import pandas as pd 
import pandas.io.data 
from pandas import Series, DataFrame
import datetime
from pandas import ExcelWriter 
import os 
import matplotlib.pyplot as plt 
import math
import numpy as np 
from numpy import *
now = datetime.datetime.now()


start_of_interval = datetime.datetime(now.year - 15, now.month, now.day)
end_of_interval = datetime.datetime(now.year, now.month, now.day)       
df = pd.io.data.get_data_yahoo("Spy", start = start_of_interval, end = end_of_interval, interval = "d")['Adj Close']

df = DataFrame(df) 
df['Returns'] = df.pct_change()
df['Average_200'] = pd.rolling_mean(df['Adj Close'],200)
df['Average_50'] = pd.rolling_mean(df['Adj Close'],50) 
df['Date'] = df.index 

df['Average_Diff'] = df['Average_50'] - df['Average_200']  
df['Average_Diff'] = df['Average_Diff'].fillna(int(2)) 
print(df) 
for i in df['Average_Diff']:
    if df['Average_Diff'][i] == int(2):
        df["Signal"] = "Hold"
        df["Market"] = 1
    if df['Average_Diff'][i-1] > 0 and ['Average_Diff'][i] < 0:
        df["Signal"] = "Buy"
        df['Market'] = 1
    elif df['Average_Diff'][i-1] < 0 and ['Average_Diff'][i] > 0:
        df["Signal"] = "Sell"
        df["Signal"] = 0
    else:
        df["Signal"] = "Hold"

for i in df["Market"]:
    if df["Signal"][i] == "Sell":
        df["Market2"] = 0
    elif df['Signal'][i] == "Hold" and df['Market'][i-1] == 0:
        df['Market2'] = 0
    elif df['Signal'][i] == "Hold" and df['Market'][i-1] == 1:  
        df['Market2'] = 1 
    elif df['Signal'][i] == "Buy":
        df['Market2'] = 1 
    else:
        df["Market2"] = 1

Solution

Here are a couple of alternatives you can try:

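l = len(df)
# range() needs the row count l; range(len) would pass the built-in function itself
for i in range(l):
    # .loc is label-based, so an integer i only matches a default integer index
    if df.loc[i, 'Average_Diff'] == int(2):
        df.loc[i, 'Signal'] = 'Hold'
        df.loc[i, 'Market'] = 1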

Or (prefer this over the one above):

for i in df.index.values:
    if df.loc[i, 'Average_Diff'] == int(2):
        df.loc[i, 'Signal'] = 'Hold'
        df.loc[i, 'Market'] = 1
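
To see why the loop in the question fails while a label-based loop works, here is a minimal sketch on a made-up five-row frame. The dates and values are assumptions for illustration, not data from the question, and the exact wording of the error can differ between Python and pandas versions.

```python
import pandas as pd

# Hypothetical toy data standing in for the question's Average_Diff column.
idx = pd.date_range("2020-01-01", periods=5, freq="D")
df = pd.DataFrame({"Average_Diff": [2.0, 2.0, -0.5, 0.3, -0.1]}, index=idx)
df["Signal"] = ""
df["Market"] = 0

# Iterating over a Series yields its values (floats here), not row positions.
values_seen = [v for v in df["Average_Diff"]]

# ['Average_Diff'][i] in the question is a plain list indexed by such a float,
# which is what raises the "list indices must be integers or slices" error.
try:
    _ = ["Average_Diff"][values_seen[0]]
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Iterating over the index yields row labels, which .loc accepts.
for label in df.index:
    if df.loc[label, "Average_Diff"] == 2:
        df.loc[label, "Signal"] = "Hold"
        df.loc[label, "Market"] = 1

print(error_message)
print(df)
```

The loop variable here is always a row label, so every read and write goes through `df.loc[label, column]` and touches exactly one cell.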

EDIT

l = df.index.values
for i in range(1, len(l)):
    if df.loc[l[i], 'Average_Diff'] == int(2):
        df.loc[l[i], 'Signal'] = 'Hold'
        df.loc[l[i], 'Market'] = 1
    # The previous row can be reached the same way: l[i-1]
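
As a rough sketch of how the `l[i-1]` pattern could be applied to the crossover rules in the question, the following uses a made-up six-row frame (the values are assumptions). The Buy and Sell directions are copied from the question as written, and carrying the previous `Market` value forward on "Hold" is one reading of the question's second loop rather than a confirmed rule.

```python
import pandas as pd

# Hypothetical toy data; 2.0 plays the role of the question's NaN placeholder.
idx = pd.date_range("2020-01-01", periods=6, freq="D")
df = pd.DataFrame({"Average_Diff": [2.0, 0.4, -0.2, -0.3, 0.1, 0.5]}, index=idx)
df["Signal"] = "Hold"
df["Market"] = 1  # assumed starting state: in the market

labels = df.index
for i in range(1, len(labels)):  # start at 1 so labels[i-1] always exists
    prev = df.loc[labels[i - 1], "Average_Diff"]
    curr = df.loc[labels[i], "Average_Diff"]
    if prev > 0 and curr < 0:
        df.loc[labels[i], "Signal"] = "Buy"
        df.loc[labels[i], "Market"] = 1
    elif prev < 0 and curr > 0:
        df.loc[labels[i], "Signal"] = "Sell"
        df.loc[labels[i], "Market"] = 0
    else:
        # "Hold": keep whatever market state the previous row had.
        df.loc[labels[i], "Market"] = df.loc[labels[i - 1], "Market"]

print(df)
```

Note that a positive placeholder such as 2 takes part in the comparisons, so a placeholder row followed directly by a negative value would be reported as a crossover.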


Contrary to the comments:

You should never modify something you are iterating over. This is not guaranteed to work in all cases. Depending on the data types, the iterator returns a copy and not a view, and writing to it will have no effect.
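
The warning can be illustrated with a small sketch. The first part shows that writing to the row object yielded by `iterrows` leaves the frame untouched. The second part is one possible loop-free alternative, assuming the same crossover rules as the question and made-up values. Neither part is taken from the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data for illustration only.
df = pd.DataFrame({"Average_Diff": [0.4, -0.2, -0.3, 0.1]})
df["Signal"] = "Hold"

# With mixed column types, iterrows yields a copy of each row, so this
# assignment changes only the temporary row object, not df itself.
for _, row in df.iterrows():
    row["Signal"] = "Changed"
unchanged = df["Signal"].tolist()

# Loop-free variant: compare each value with the previous row via shift(1).
prev = df["Average_Diff"].shift(1)
curr = df["Average_Diff"]
df["Signal"] = np.select(
    [(prev > 0) & (curr < 0), (prev < 0) & (curr > 0)],
    ["Buy", "Sell"],
    default="Hold",
)

print(unchanged)
print(df)
```

The first row has no predecessor, so `shift(1)` gives NaN there. Both comparisons evaluate to False and the row falls through to "Hold".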
