AttributeError: 'DataFrame' object has no attribute 'colmap' in Python
Question
I am a Python beginner and am trying to use the following code from this source: Portfolio rebalancing with bandwidth method in python
The code works well so far.
The problem is that if I call the function not in the usual way, rebalance(df, tol), but starting from a certain location in the DataFrame, like rebalance(df[500:], tol), I get the following error:
AttributeError: 'DataFrame' object has no attribute 'colmap'
So my question is: how do I have to adjust the code to make this possible?
Here is the code:
import datetime as DT
import numpy as np
import pandas as pd
import pandas.io.data as PID

def setup_df():
    df1 = PID.get_data_yahoo("IBM",
                             start=DT.datetime(1970, 1, 1),
                             end=DT.datetime.today())
    df1.rename(columns={'Adj Close': 'ibm'}, inplace=True)
    df2 = PID.get_data_yahoo("F",
                             start=DT.datetime(1970, 1, 1),
                             end=DT.datetime.today())
    df2.rename(columns={'Adj Close': 'ford'}, inplace=True)
    df = df1.join(df2.ford, how='inner')
    df = df[['ibm', 'ford']]
    df['sh ibm'] = 0
    df['sh ford'] = 0
    df['ibm value'] = 0
    df['ford value'] = 0
    df['ratio'] = 0
    # This is useful in conjunction with iloc for referencing column names by
    # index number
    df.colmap = dict([(col, i) for i,col in enumerate(df.columns)])
    return df

def invest(df, i, amount):
    """
    Invest amount dollars evenly between ibm and ford
    starting at ordinal index i.
    This modifies df.
    """
    c = df.colmap
    halfvalue = amount/2
    df.iloc[i:, c['sh ibm']] = halfvalue / df.iloc[i, c['ibm']]
    df.iloc[i:, c['sh ford']] = halfvalue / df.iloc[i, c['ford']]
    df.iloc[i:, c['ibm value']] = (
        df.iloc[i:, c['ibm']] * df.iloc[i:, c['sh ibm']])
    df.iloc[i:, c['ford value']] = (
        df.iloc[i:, c['ford']] * df.iloc[i:, c['sh ford']])
    df.iloc[i:, c['ratio']] = (
        df.iloc[i:, c['ibm value']] / df.iloc[i:, c['ford value']])

def rebalance(df, tol):
    """
    Rebalance df whenever the ratio falls outside the tolerance range.
    This modifies df.
    """
    i = 0
    amount = 100
    c = df.colmap
    while True:
        invest(df, i, amount)
        mask = (df['ratio'] >= 1+tol) | (df['ratio'] <= 1-tol)
        # ignore prior locations where the ratio falls outside tol range
        mask[:i] = False
        try:
            # Move i one index past the first index where mask is True
            # Note that this means the ratio at i will remain outside tol range
            i = np.where(mask)[0][0] + 1
        except IndexError:
            break
        amount = (df.iloc[i, c['ibm value']] + df.iloc[i, c['ford value']])
    return df

df = setup_df()
tol = 0.05  # setting the bandwidth tolerance
rebalance(df, tol)

df['portfolio value'] = df['ibm value'] + df['ford value']
df["ibm_weight"] = df['ibm value']/df['portfolio value']
df["ford_weight"] = df['ford value']/df['portfolio value']

print(df['ibm_weight'].min())
print(df['ibm_weight'].max())
print(df['ford_weight'].min())
print(df['ford_weight'].max())

# This shows the rows which trigger rebalancing
mask = (df['ratio'] >= 1+tol) | (df['ratio'] <= 1-tol)
print(df.loc[mask])
Recommended answer
The problem you encountered is due to a poor design decision on my part. colmap is an attribute defined on df in setup_df:
df.colmap = dict([(col, i) for i,col in enumerate(df.columns)])
It is not a standard attribute of a DataFrame.
df[500:] returns a new DataFrame which is generated by copying data from df into the new DataFrame. Since colmap is not a standard attribute, it is not copied into the new DataFrame.
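A minimal sketch of this behaviour, assuming a small toy DataFrame whose values are invented for illustration (only the colmap attribute name comes from the question). Some pandas versions warn when a list-like value is attached as a new attribute, so the sketch silences that warning.

```python
import warnings

import pandas as pd

# Toy data, invented for this illustration.
df = pd.DataFrame({"ibm": [1.0, 2.0, 3.0], "ford": [4.0, 5.0, 6.0]})

# Attach an ad-hoc attribute, as setup_df does.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", UserWarning)
    df.colmap = {col: i for i, col in enumerate(df.columns)}

sliced = df[1:]  # a new DataFrame object

print(hasattr(df, "colmap"))      # the original still carries the attribute
print(hasattr(sliced, "colmap"))  # the slice does not
```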
To call rebalance on a DataFrame other than the one returned by setup_df, replace c = df.colmap (in both rebalance and invest, since invest reads df.colmap as well) with
c = dict([(col, j) for j,col in enumerate(df.columns)])
I have also made this change in the original post.
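One way to avoid repeating that expression is to wrap it in a small function. The sketch below assumes a hypothetical helper named column_positions (not part of the original code) and toy data. For a single column with a unique name, pandas also offers df.columns.get_loc.

```python
import pandas as pd


def column_positions(df):
    """Map each column name to its integer position (hypothetical helper)."""
    return {col: j for j, col in enumerate(df.columns)}


# Toy data, invented for this illustration.
df = pd.DataFrame({"ibm": [1.0, 2.0, 3.0], "ford": [4.0, 5.0, 6.0]})

part = df[1:]
c = column_positions(part)           # works on any DataFrame, slices included
first_ford = part.iloc[0, c["ford"]]

# Built-in alternative for one uniquely named column.
ford_pos = df.columns.get_loc("ford")
```

Because the mapping is derived from df.columns on each call, it no longer depends on anything done in setup_df.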
PS. In the other question, I had chosen to define colmap on df itself so that this dict would not have to be recomputed with every call to rebalance and invest.
Your question shows me that this minor optimization is not worth making these functions so dependent on the specialness of the DataFrame returned by setup_df.
There is a second problem you will encounter using rebalance(df[500:], tol):
Since df[500:] returns a copy of a portion of df, rebalance(df[500:], tol) will modify this copy and not the original df. If the object df[500:] has no reference outside of rebalance(df[500:], tol), it will be garbage collected after the call to rebalance completes, so the entire computation would be lost. Therefore rebalance(df[500:], tol) is not useful.
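A short sketch of the copy problem, assuming toy data. It calls .copy() explicitly so that the independence of the two objects does not depend on the pandas version in use.

```python
import pandas as pd

# Toy data, invented for this illustration.
df = pd.DataFrame({"ibm": [1.0, 2.0, 3.0], "ford": [4.0, 5.0, 6.0]})

# An explicit, independent copy of the rows from position 1 on.
part = df[1:].copy()
part.iloc[0, 0] = 99.0  # modifies the copy only

original_value = df.iloc[1, 0]  # still 2.0 in the original
copied_value = part.iloc[0, 0]  # 99.0 in the copy
```

Any work done on part is visible only through part, so it is lost if nothing keeps a reference to it.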
Instead, you could modify rebalance to accept i as a parameter:
def rebalance(df, tol, i=0):
    """
    Rebalance df whenever the ratio falls outside the tolerance range.
    This modifies df.
    """
    # Note: the loop reads df['ratio'] before the first call to invest,
    # so df must already hold an initial allocation.
    c = dict([(col, j) for j, col in enumerate(df.columns)])
    while True:
        mask = (df['ratio'] >= 1+tol) | (df['ratio'] <= 1-tol)
        # ignore prior locations where the ratio falls outside tol range
        mask[:i] = False
        try:
            # Move i one index past the first index where mask is True
            # Note that this means the ratio at i will remain outside tol range
            i = np.where(mask)[0][0] + 1
        except IndexError:
            break
        amount = (df.iloc[i, c['ibm value']] + df.iloc[i, c['ford value']])
        invest(df, i, amount)
    return df
Then you can rebalance df starting at row 500 (zero-based) using
rebalance(df, tol, i=500)
Note that this finds the first row on or after i=500 that needs rebalancing. It does not necessarily rebalance at i=500 itself. This allows you to call rebalance(df, tol, i) for arbitrary i without having to determine in advance if rebalancing is required on row i.
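To see the start-row behaviour end to end without downloading data, here is a self-contained sketch under several assumptions: the prices are invented, column_positions is a hypothetical helper, the initial invest(df, 0, 100) call is one assumed way of populating ratio before rebalancing, and the try/except from the code above is replaced by np.flatnonzero plus a bounds guard that the original does not have.

```python
import numpy as np
import pandas as pd


def column_positions(df):
    # Hypothetical helper: column name -> integer position.
    return {col: j for j, col in enumerate(df.columns)}


def invest(df, i, amount):
    """Split amount evenly between ibm and ford from row i on (modifies df)."""
    c = column_positions(df)
    half = amount / 2
    df.iloc[i:, c["sh ibm"]] = half / df.iloc[i, c["ibm"]]
    df.iloc[i:, c["sh ford"]] = half / df.iloc[i, c["ford"]]
    ibm_value = df.iloc[i:, c["ibm"]] * df.iloc[i:, c["sh ibm"]]
    ford_value = df.iloc[i:, c["ford"]] * df.iloc[i:, c["sh ford"]]
    df.iloc[i:, c["ibm value"]] = ibm_value.to_numpy()
    df.iloc[i:, c["ford value"]] = ford_value.to_numpy()
    df.iloc[i:, c["ratio"]] = (ibm_value / ford_value).to_numpy()


def rebalance(df, tol, i=0):
    """Rebalance from row i on whenever ratio leaves [1-tol, 1+tol]."""
    c = column_positions(df)
    while True:
        mask = (df["ratio"] >= 1 + tol) | (df["ratio"] <= 1 - tol)
        mask.iloc[:i] = False              # ignore rows before position i
        hits = np.flatnonzero(mask.to_numpy())
        if hits.size == 0:
            break
        i = int(hits[0]) + 1               # rebalance one row after the trigger
        if i >= len(df):                   # guard added in this sketch
            break
        amount = df.iloc[i, c["ibm value"]] + df.iloc[i, c["ford value"]]
        invest(df, i, amount)
    return df


# Invented prices: ibm drifts upward while ford stays flat.
df = pd.DataFrame({
    "ibm": [10.0, 12.0, 12.0, 12.0, 15.0, 15.0],
    "ford": [10.0, 10.0, 10.0, 10.0, 10.0, 10.0],
})
for col in ["sh ibm", "sh ford", "ibm value", "ford value", "ratio"]:
    df[col] = 0.0                          # float columns, filled in by invest

invest(df, 0, 100)                         # initial allocation populates ratio
rebalance(df, tol=0.05, i=3)
print(df[["ibm value", "ford value", "ratio"]])
```

With these numbers, rows 1 and 2 are out of tolerance but are skipped because they lie before i=3. Row 3 is the first trigger on or after the start row, so the portfolio is re-split at row 4.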