Scope gotcha when dynamically adding methods in a loop

Question

I have an API for analysing my exercise data (which I scrape from RunKeeper's website).

My main class is a subclass of a pandas.DataFrame, which is basically a container for tabular data. It supports indexing by column name, returning an array of the column values.

I would like to add some convenience properties based on the types of 'fitness activities' that are present in the data. So for example I'd like to add a property 'running':

@property
def running(self):
    return self[self['type'] == 'running']

Which would return all rows of the DataFrame which have 'running' in the 'type' column.

I tried to do this dynamically for all types present in the data. Here's what I naively did:

class Activities(pandas.DataFrame):
    def __init__(self,data):
        pandas.DataFrame.__init__(self,data)
        # The set of unique types in the 'type' column:
        types = set(self['type'])
        for type in types:
            method = property(lambda self: self[self['type'] == type])
            setattr(self.__class__,type,method)

The result was that all of these properties ended up returning tables of data for the same type of activity ('walking').

What's happening is that when the properties are accessed, the lambdas are called and they look in the scope they were defined in for the name 'type'. They find that it is bound to the string 'walking', since that was the last iteration of the for loop. Each iteration of the for loop doesn't have its own namespace, so all the lambdas see only the last iteration, rather than the value that 'type' had when they were actually defined.
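
The late-binding behaviour described here can be reproduced without pandas. This is a minimal sketch with made-up activity names; it uses a list rather than a set so that the last iteration is predictable, and the variable names are illustrative only:

```python
# Illustrative activity names; not taken from the real data.
activity_types = ['running', 'cycling', 'walking']

getters = []
for activity in activity_types:
    # The lambda does not store the current value of `activity`;
    # it looks the name up in the enclosing scope when it is called.
    getters.append(lambda: activity)

# By now the loop has finished and `activity` is bound to the last item,
# so every lambda returns the same string.
results = [getter() for getter in getters]
print(results)
```

All three lambdas report 'walking', mirroring what the properties do.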

Can anyone think of a way around this? I can think of two, but they don't seem particularly ideal:

  1. define __getattr__ to check that the attribute is an activity type and return the appropriate rows.

  2. use a recursive function call instead of a for loop, so that each level of recursion has its own namespace.

Both of these are a little too clever for my tastes, and pandas.DataFrame already has a __getattr__ that I'd have to cautiously interact with if I made one too. And recursion would work, but feels very wrong since the set of types doesn't have any intrinsic tree-like structure to it. It's flat, and should look flat in the code!

Recommended answer

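Modify your lambda to pull the value into a new scope. A default argument is evaluated when the lambda is defined, so each lambda keeps the value that type had in its own iteration: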

method = property(lambda self=self, type=type: self[self['type'] == type])
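
To try the idea without pandas, here is a self-contained sketch. The `Log` class, its list-of-dicts rows, and the `make_filter` helper are hypothetical stand-ins, not part of the original code. The loop variable is named `kind` so the built-in `type` stays usable. The sketch also shows a factory function as another way to give each property its own scope, which is an alternative to the default-argument trick rather than something the answer proposes:

```python
class Log:
    """Hypothetical stand-in for the DataFrame subclass: holds a list of dict rows."""

    def __init__(self, rows):
        self.rows = rows
        for kind in sorted({row['type'] for row in rows}):
            # `kind=kind` is evaluated now, freezing this iteration's value.
            prop = property(
                lambda self, kind=kind: [r for r in self.rows if r['type'] == kind]
            )
            setattr(type(self), kind, prop)


def make_filter(kind):
    """Alternative: each call creates a new scope that holds its own `kind`."""
    return property(lambda self: [r for r in self.rows if r['type'] == kind])


rows = [
    {'type': 'running', 'km': 5},
    {'type': 'walking', 'km': 2},
    {'type': 'running', 'km': 8},
]
log = Log(rows)
running_km = [r['km'] for r in log.running]
walking_km = [r['km'] for r in log.walking]

# The factory-built property behaves the same way.
Log.running_via_factory = make_filter('running')
factory_km = [r['km'] for r in log.running_via_factory]
```

Note that property passes the instance as the first positional argument, so the `self=self` default in the one-liner above is not needed for the fix to work; only the `type=type` default matters. As in the original code, the properties are set on the class, so they are shared by every instance.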
