AttributeError: 'DataFrame' object has no attribute 'colmap' in Python


Problem description

I am a Python beginner and I am trying to use the following code from this source: Portfolio rebalancing with bandwidth method in python

The code works well so far.

The problem is that if I call the function not on the whole DataFrame, as in rebalance(df, tol), but from a certain position onward, as in rebalance(df[500:], tol), I get the following error:

AttributeError: 'DataFrame' object has no attribute 'colmap'. So my question is: how do I have to adjust the code in order to make this possible?

Here is the code:

import datetime as DT
import numpy as np
import pandas as pd
# pandas.io.data was removed from pandas; the pandas-datareader package replaces it
import pandas_datareader.data as PID

def setup_df():
    df1 = PID.get_data_yahoo("IBM", 
                             start=DT.datetime(1970, 1, 1), 
                             end=DT.datetime.today())
    df1.rename(columns={'Adj Close': 'ibm'}, inplace=True)

    df2 = PID.get_data_yahoo("F", 
                             start=DT.datetime(1970, 1, 1), 
                             end=DT.datetime.today())
    df2.rename(columns={'Adj Close': 'ford'}, inplace=True)

    df = df1.join(df2.ford, how='inner')
    df = df[['ibm', 'ford']]
    df['sh ibm'] = 0
    df['sh ford'] = 0
    df['ibm value'] = 0
    df['ford value'] = 0
    df['ratio'] = 0
    # This is useful in conjunction with iloc for referencing column names by
    # index number
    df.colmap = dict([(col, i) for i,col in enumerate(df.columns)])
    return df

def invest(df, i, amount):
    """
    Invest amount dollars evenly between ibm and ford
    starting at ordinal index i.
    This modifies df.
    """
    c = df.colmap
    halfvalue = amount/2
    df.iloc[i:, c['sh ibm']] = halfvalue / df.iloc[i, c['ibm']]
    df.iloc[i:, c['sh ford']] = halfvalue / df.iloc[i, c['ford']]

    df.iloc[i:, c['ibm value']] = (
        df.iloc[i:, c['ibm']] * df.iloc[i:, c['sh ibm']])
    df.iloc[i:, c['ford value']] = (
        df.iloc[i:, c['ford']] * df.iloc[i:, c['sh ford']])
    df.iloc[i:, c['ratio']] = (
        df.iloc[i:, c['ibm value']] / df.iloc[i:, c['ford value']])

def rebalance(df, tol):
    """
    Rebalance df whenever the ratio falls outside the tolerance range.
    This modifies df.
    """
    i = 0
    amount = 100
    c = df.colmap
    while True:
        invest(df, i, amount)
        mask = (df['ratio'] >= 1+tol) | (df['ratio'] <= 1-tol)
        # ignore prior locations where the ratio falls outside tol range
        mask[:i] = False
        try:
            # Move i one index past the first index where mask is True
            # Note that this means the ratio at i will remain outside tol range
            i = np.where(mask)[0][0] + 1
        except IndexError:
            break
        amount = (df.iloc[i, c['ibm value']] + df.iloc[i, c['ford value']])
    return df

df = setup_df()
tol = 0.05 #setting the bandwidth tolerance
rebalance(df, tol)

df['portfolio value'] = df['ibm value'] + df['ford value']
df["ibm_weight"] = df['ibm value']/df['portfolio value']
df["ford_weight"] = df['ford value']/df['portfolio value']

print(df['ibm_weight'].min())
print(df['ibm_weight'].max())
print(df['ford_weight'].min())
print(df['ford_weight'].max())

# This shows the rows which trigger rebalancing
mask = (df['ratio'] >= 1+tol) | (df['ratio'] <= 1-tol)
print(df.loc[mask])

Recommended answer

The problem you encountered is due to a poor design decision on my part. colmap is an attribute defined on df in setup_df:

df.colmap = dict([(col, i) for i,col in enumerate(df.columns)])

It is not a standard attribute of a DataFrame.

df[500:] returns a new DataFrame which is generated by copying data from df into the new DataFrame. Since colmap is not a standard attribute, it is not copied into the new DataFrame.
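
The behavior can be reproduced with a minimal sketch on a small made-up DataFrame (the values below are placeholders, not the stock data from the question). Some pandas versions emit a UserWarning when a dict is assigned as an attribute, so the sketch silences warnings around that one line.

```python
import warnings
import pandas as pd

# Toy frame standing in for the one built by setup_df (values are made up).
df = pd.DataFrame({'ibm': [1.0, 2.0, 3.0, 4.0],
                   'ford': [4.0, 3.0, 2.0, 1.0]})

with warnings.catch_warnings():
    # Some pandas versions warn when a list-like is set as an attribute.
    warnings.simplefilter('ignore')
    df.colmap = {col: i for i, col in enumerate(df.columns)}

sliced = df[2:]  # a new DataFrame object

# The ad hoc attribute lives on the original object only.
has_on_original = hasattr(df, 'colmap')
has_on_slice = hasattr(sliced, 'colmap')
print(has_on_original, has_on_slice)
```

Accessing sliced.colmap here raises the same AttributeError as in the question.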

To call rebalance on a DataFrame other than the one returned by setup_df, replace c = df.colmap (in both rebalance and invest) with

c = dict([(col, j) for j,col in enumerate(df.columns)])

I have made this change in the original post as well.
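
As a sketch of the same idea, the mapping can be wrapped in a small helper so that both functions compute it from whatever DataFrame they receive. The name column_positions is hypothetical and not part of the original code; the data is made up.

```python
import pandas as pd

def column_positions(df):
    """Map each column name to its integer position (hypothetical helper)."""
    return {col: j for j, col in enumerate(df.columns)}

# Made-up data; any DataFrame works, including a slice of another one.
df = pd.DataFrame({'ibm': [10.0, 11.0, 12.0],
                   'ford': [5.0, 6.0, 7.0],
                   'ratio': [0.0, 0.0, 0.0]})
part = df[1:]

c = column_positions(part)
# Use the mapping with iloc, as invest and rebalance do.
first_ford = part.iloc[0, c['ford']]
print(c, first_ford)
```

Because the mapping is derived from df.columns on each call, it no longer matters which DataFrame is passed in.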

PS. In the other question, I had chosen to define colmap on df itself so that this dict would not have to be recomputed with every call to rebalance and invest.

Your question shows me that this minor optimization is not worth making these functions so dependent on the specialness of the DataFrame returned by setup_df.

There is a second problem you will encounter using rebalance(df[500:], tol):

Since df[500:] returns a copy of a portion of df, rebalance(df[500:], tol) will modify this copy and not the original df. If the object, df[500:], has no reference outside of rebalance(df[500:], tol), it will be garbage collected after the call to rebalance is completed. So the entire computation would be lost. Therefore rebalance(df[500:], tol) is not useful.
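
If you do want to work on a separate portion, one option is to bind it to a name so the result is not discarded. The sketch below uses a hypothetical stand-in function, set_ratio, in place of rebalance, and made-up data. Whether a bare slice behaves as a copy or as a view can depend on the pandas version and settings, so the sketch calls .copy() explicitly to make the separation unambiguous.

```python
import pandas as pd

df = pd.DataFrame({'ratio': [1.0, 1.0, 1.0, 1.0]})

def set_ratio(frame, value):
    """Stand-in for rebalance: modifies frame in place (hypothetical)."""
    frame.iloc[:, 0] = value
    return frame

# Explicit copy, bound to a name so the modified result is kept.
tail = df[2:].copy()
set_ratio(tail, 2.0)

original_values = df['ratio'].tolist()
tail_values = tail['ratio'].tolist()
print(original_values, tail_values)
```

The changes land in tail only; df itself is left untouched, which is why this does not help if the goal is to update the original DataFrame.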

Instead, you could modify rebalance to accept i as a parameter:

def rebalance(df, tol, i=0):
    """
    Rebalance df whenever the ratio falls outside the tolerance range.
    This modifies df.
    """
    c = dict([(col, j) for j, col in enumerate(df.columns)])
    while True:
        mask = (df['ratio'] >= 1+tol) | (df['ratio'] <= 1-tol)
        # ignore prior locations where the ratio falls outside tol range
        mask[:i] = False
        try:
            # Move i one index past the first index where mask is True
            # Note that this means the ratio at i will remain outside tol range
            i = np.where(mask)[0][0] + 1
        except IndexError:
            break
        amount = (df.iloc[i, c['ibm value']] + df.iloc[i, c['ford value']])
        invest(df, i, amount)
    return df

Then you can rebalance df starting at the 500th row using

rebalance(df, tol, i=500)

Note that this finds the first row on or after i=500 that needs rebalancing. It does not necessarily rebalance at i=500 itself. This allows you to call rebalance(df, tol, i) for arbitrary i without having to determine in advance if rebalancing is required on row i.
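
The search step can be looked at in isolation. The following sketch, with a hypothetical helper first_trigger and made-up ratios, returns the first position at or after start where the ratio leaves the tolerance band. The rebalance function above then adds 1 to that position before reinvesting.

```python
import numpy as np

def first_trigger(ratio, tol, start=0):
    """Return the first position >= start where ratio is outside
    [1 - tol, 1 + tol], or None if there is none (hypothetical helper)."""
    ratio = np.asarray(ratio, dtype=float)
    mask = (ratio >= 1 + tol) | (ratio <= 1 - tol)
    # Ignore positions before start, as rebalance does with mask[:i] = False.
    mask[:start] = False
    hits = np.where(mask)[0]
    return int(hits[0]) if hits.size else None

ratios = [1.0, 1.2, 1.0, 1.01, 0.9, 1.0]  # made-up values
print(first_trigger(ratios, 0.05, start=2))
print(first_trigger(ratios, 0.05, start=5))
```

With start=2 the out-of-range value at position 1 is skipped and the search continues to the next one; with start=5 nothing is found and the helper returns None, which corresponds to the IndexError branch that ends the loop in rebalance.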
