Looping through a pandas DataFrame
Problem description
I am attempting to back-test an investment strategy, and I am having trouble looping through the DataFrame to "re-create" how the strategy would have performed starting 15 years ago. When I try to loop through df['Average_Diff'], I keep getting the error "list indices must be integers or slices, not numpy.float64". I struggled with the NaN values at the beginning of the column, which come from how ['Average_Diff'] is calculated, but once I fixed that I ran into this other problem. How can I loop through df['Average_Diff'] to create the "Buy" or "Sell" signal, and then loop through again to indicate whether I'm in or out of the market based on those signals?
import pandas as pd
import pandas.io.data
from pandas import Series, DataFrame
import datetime
from pandas import ExcelWriter
import os
import matplotlib.pyplot as plt
import math
import numpy as np
from numpy import *
now = datetime.datetime.now()
start_of_interval = datetime.datetime(now.year - 15, now.month, now.day)
end_of_interval = datetime.datetime(now.year, now.month, now.day)
df = pd.io.data.get_data_yahoo("Spy", start = start_of_interval, end = end_of_interval, interval = "d")['Adj Close']
df = DataFrame(df)
df['Returns'] = df.pct_change()
df['Average_200'] = pd.rolling_mean(df['Adj Close'],200)
df['Average_50'] = pd.rolling_mean(df['Adj Close'],50)
df['Date'] = df.index
df['Average_Diff'] = df['Average_50'] - df['Average_200']
df['Average_Diff'] = df['Average_Diff'].fillna(int(2))
print(df)
for i in df['Average_Diff']:
    if df['Average_Diff'][i] == int(2):
        df["Signal"] = "Hold"
        df["Market"] = 1
    if df['Average_Diff'][i-1] > 0 and ['Average_Diff'][i] < 0:
        df["Signal"] = "Buy"
        df['Market'] = 1
    elif df['Average_Diff'][i-1] < 0 and ['Average_Diff'][i] > 0:
        df["Signal"] = "Sell"
        df["Signal"] = 0
    else:
        df["Signal"] = "Hold"

for i in df["Market"]:
    if df["Signal"][i] == "Sell":
        df["Market2"] = 0
    elif df['Signal'][i] == "Hold" and df['Market'][i-1] == 0:
        df['Market2'] = 0
    elif df['Signal'][i] == "Hold" and df['Market'][i-1] == 1:
        df['Market2'] = 1
    elif df['Signal'][i] == "Buy":
        df['Market2'] = 1
    else:
        df["Market2"] = 1
Here are a couple of alternatives you can try:
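l = len(df)
for i in range(l):  # range(l), not range(len): len is the built-in function
    # Integer labels only match a default integer index; with the
    # question's date index, use the label-based loop below instead.
    if df.loc[i, 'Average_Diff'] == int(2):
        df.loc[i, 'Signal'] = 'Hold'
        df.loc[i, 'Market'] = 1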
Or (prefer this, over the one above)
for i in df.index.values:
    if df.loc[i, 'Average_Diff'] == int(2):
        df.loc[i, 'Signal'] = 'Hold'
        df.loc[i, 'Market'] = 1
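To see why the question's loop fails where a label-based loop does not, here is a minimal sketch on made-up data (the dates and Average_Diff values are assumptions for illustration, not the asker's data). It shows that iterating over a Series yields its values, that ['Average_Diff'][i] without the leading df indexes a plain Python list, and that looping over index labels with df.loc avoids both issues.

```python
import numpy as np
import pandas as pd

# Toy data standing in for the question's Average_Diff column (values are made up).
dates = pd.date_range("2020-01-01", periods=4, freq="D")
df = pd.DataFrame({"Average_Diff": [2.0, -0.5, 0.3, 2.0]}, index=dates)

# Iterating over a Series yields its values, not positions or labels.
values_seen = [v for v in df["Average_Diff"]]

# ['Average_Diff'] without the leading df is a one-element Python list, so
# indexing it with a float raises the TypeError quoted in the question.
try:
    ["Average_Diff"][np.float64(2.0)]
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Iterating over index labels and writing through df.loc avoids both problems.
for label in df.index:
    if df.loc[label, "Average_Diff"] == 2:
        df.loc[label, "Signal"] = "Hold"
        df.loc[label, "Market"] = 1
```

Rows that do not match the condition are left as NaN in the new columns, so a real back-test would still need to fill them in.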
EDIT
l = df.index.values
for i in range(1, len(l)):
    if df.loc[l[i], 'Average_Diff'] == int(2):
        df.loc[l[i], 'Signal'] = 'Hold'
        df.loc[l[i], 'Market'] = 1
    # Even i-1 will work in the same way: l[i-1]
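As a possible alternative to looking up l[i-1] inside a loop, the previous-row comparison can be sketched without any loop by using Series.shift. This is not part of the original answer. The sample values are made up, the Buy/Sell directions simply copy the conditions in the question's code as written, and starting "in the market" (Market = 1) before the first signal is an assumption.

```python
import numpy as np
import pandas as pd

# Made-up Average_Diff values; the leading NaN mimics the rolling-mean warm-up.
df = pd.DataFrame({"Average_Diff": [np.nan, 1.0, -1.0, -2.0, 3.0, 4.0]})

cur = df["Average_Diff"]
prev = cur.shift(1)  # value from the previous row, i.e. the i-1 lookup

# Same sign-change rules as the question's loop; comparisons with NaN are False.
buy = (prev > 0) & (cur < 0)
sell = (prev < 0) & (cur > 0)
df["Signal"] = np.select([buy, sell], ["Buy", "Sell"], default="Hold")

# Buy -> in the market (1), Sell -> out (0), Hold -> carry the last state forward.
state = pd.Series(np.select([buy, sell], [1.0, 0.0], default=np.nan), index=df.index)
# Assumption: the strategy starts in the market until the first signal appears.
df["Market"] = state.ffill().fillna(1.0).astype(int)
```

Because NaN comparisons evaluate to False, the warm-up rows fall through to "Hold" without needing the fillna(int(2)) sentinel from the question.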
Contrary to the comments:
You should never modify something you are iterating over. This is not guaranteed to work in all cases. Depending on the data types, the iterator returns a copy and not a view, and writing to it will have no effect.
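A small sketch of what that warning means in practice, using a made-up two-row frame with mixed column types. Writing to the row object handed out by iterrows() leaves the DataFrame untouched, while writing through df.loc changes it.

```python
import pandas as pd

# Made-up frame with mixed dtypes (float and string columns).
df = pd.DataFrame({"Average_Diff": [2.0, -0.5], "Signal": ["", ""]})

# Each row from iterrows() is a newly built Series here, so this write is lost.
for _, row in df.iterrows():
    row["Signal"] = "Hold"
signal_after_iterrows = df["Signal"].tolist()

# Writing through the DataFrame itself with .loc does change it.
for label in df.index:
    df.loc[label, "Signal"] = "Hold"
signal_after_loc = df["Signal"].tolist()
```

After the first loop the Signal column still holds empty strings; only the second loop fills in "Hold".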