Creating a namedtuple object using only a subset of arguments passed

Question

I am pulling rows from a MySQL database as dictionaries (using SSDictCursor) and doing some processing, using the following approach:

from collections import namedtuple

class Foo(namedtuple('Foo', ['id', 'name', 'age'])):
    __slots__ = ()

    def __init__(self, *args):
        super(Foo, self).__init__(self, *args)

    # ...some class methods below here

class Bar(namedtuple('Bar', ['id', 'address', 'city', 'state'])):
    __slots__ = ()

    def __init__(self, *args):
        super(Bar, self).__init__(self, *args)

    # some class methods here...

# more classes for distinct processing tasks...

To use namedtuple, I have to know exactly the fields I want beforehand, which is fine. However, I would like to allow the user to feed a simple SELECT * statement into my program, which will then iterate through the rows of the result set, performing multiple tasks using these different classes. In order to make this work, my classes have to somehow examine the N fields coming in from the cursor and take only the particular subset M < N corresponding to the names expected by the namedtuple definition.

My first thought was to try writing a single decorator that I could apply to each of my classes, which would examine the class to see what fields it was expecting, and pass only the appropriate arguments to the new object. But I've just started reading about decorators in the past few days, and I'm not that confident yet with them.

So my question is in two parts:

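  1. Is it possible for a single decorator to figure out which fields are needed by the specific class being decorated?
  2. Is there an alternative with the same functionality that would be easier to use, modify, and understand?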

I have too many potential permutations of tables and fields, with millions of rows in each result set, to just write one all-purpose namedtuple subclass to deal with each different task. Query time and available memory have proven to be limiting factors.

In case it matters:

>>> sys.version
'2.7.5 (default, May 15 2013, 22:43:36) [MSC v.1500 32 bit (Intel)]'

Recommended answer

First, you have to override __new__ in order to customize namedtuple creation, because a namedtuple's __new__ method checks its arguments before you even get to __init__.
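To see why overriding only `__init__` is not enough, here is a minimal sketch. The class name `InitOnly`, the `init_calls` list, and the sample values are made up for illustration; the point is only that the generated `__new__` rejects the extra keyword before `__init__` ever runs.

```python
from collections import namedtuple

init_calls = []  # hypothetical log, used only to show whether __init__ ran


class InitOnly(namedtuple('InitOnly', ['id', 'name', 'age'])):
    __slots__ = ()

    def __init__(self, *args, **kwargs):
        # Accepts anything, but is never reached when __new__ fails first.
        init_calls.append(kwargs)


try:
    InitOnly(id=1, name='Ann', age=30, city='Oslo')  # 'city' is not a field
    error = None
except TypeError as exc:
    error = exc

print(type(error).__name__)  # TypeError
print(init_calls)            # []
```

The lenient `__init__` signature makes no difference here, because the namedtuple's own `__new__` is called with the same arguments first.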

Second, if your goal is to accept and filter keyword arguments, you need to take **kwargs and filter and pass that through, not just *args.

So, putting it together:

class Foo(namedtuple('Foo', ['id', 'name', 'age'])):
    __slots__ = ()

    def __new__(cls, *args, **kwargs):
        kwargs = {k: v for k, v in kwargs.items() if k in cls._fields}
        return super(Foo, cls).__new__(cls, *args, **kwargs)


You could replace that dict comprehension with itemgetter, but every time I use itemgetter with multiple keys, nobody understands what it means, so I've reluctantly stopped using it.
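For readers unfamiliar with multi-key `itemgetter`, the following is one possible reading of that remark, not necessarily the exact form the answer had in mind. The name `pick_foo_fields` and the sample row are assumptions for illustration.

```python
from collections import namedtuple
from operator import itemgetter

Foo = namedtuple('Foo', ['id', 'name', 'age'])

# With several keys, itemgetter returns a tuple of the values in key order.
pick_foo_fields = itemgetter(*Foo._fields)

row = {'id': 1, 'name': 'Ann', 'age': 30, 'city': 'Oslo'}  # made-up row
foo = Foo(*pick_foo_fields(row))
print(foo)  # Foo(id=1, name='Ann', age=30)
```

Two differences from the dict comprehension are worth keeping in mind: this version raises `KeyError` if the row lacks one of the fields, and `itemgetter` with a single key returns the bare value rather than a one-element tuple, so the `*` unpacking would need special handling for a one-field namedtuple.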

You can also override __init__ if you have a reason to do so, because it will be called as soon as __new__ returns a Foo instance.

But you don't need to override it just for this, because the namedtuple's __init__ doesn't take any arguments or do anything; the values have already been set in __new__ (just as with tuple and other immutable types). It looks like with CPython 2.7 you can actually call super(Foo, self).__init__(*args, **kwargs) and the arguments will just be ignored, but with PyPy 1.9 and CPython 3.3 you get a TypeError. At any rate, there's no reason to pass them, and nothing says it should work, so don't do it even in CPython 2.7.

Note that your __init__ will get the unfiltered kwargs. If you want to change that, you could mutate kwargs in-place in __new__, instead of making a new dictionary. But I believe that still isn't guaranteed to do anything; it just makes it implementation-defined whether you get the filtered args or unfiltered, instead of guaranteeing the unfiltered.
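A small sketch of both points, assuming a current CPython: an `__init__` override that does not forward its arguments, and a check of what it actually receives. The `seen_by_init` list is a hypothetical helper added only to make the behaviour visible.

```python
from collections import namedtuple

seen_by_init = []  # hypothetical log, for demonstration only


class Foo(namedtuple('Foo', ['id', 'name', 'age'])):
    __slots__ = ()

    def __new__(cls, *args, **kwargs):
        # Rebinding kwargs here creates a new dict; the caller's is untouched.
        kwargs = {k: v for k, v in kwargs.items() if k in cls._fields}
        return super(Foo, cls).__new__(cls, *args, **kwargs)

    def __init__(self, *args, **kwargs):
        # Field values are already set by __new__; nothing is forwarded.
        seen_by_init.append(sorted(kwargs))


foo = Foo(id=1, name='Ann', age=30, spam=4)
print(foo)           # Foo(id=1, name='Ann', age=30)
print(seen_by_init)  # [['age', 'id', 'name', 'spam']]
```

The `spam` key shows up in `__init__` even though `__new__` dropped it, so `__init__` has to accept `**kwargs` itself if extra keywords may be passed.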

So, can you wrap this up? Sure!

def LenientNamedTuple(name, fields):
    class Wrapper(namedtuple(name, fields)):
        __slots__ = ()
        def __new__(cls, *args, **kwargs):
            args = args[:len(fields)]
            kwargs = {k: v for k, v in kwargs.items() if k in fields}
            return super(Wrapper, cls).__new__(cls, *args, **kwargs)
    return Wrapper

Note that this has the advantage of not having to use the quasi-private/semi-documented _fields class attribute, because we already have fields as a parameter.

Also, while we're at it, I added a line to toss away any excess positional arguments, as suggested in a comment.

Now you just use it as you'd use namedtuple, and it automatically ignores any excess arguments:

class Foo(LenientNamedTuple('Foo', ['id', 'name', 'age'])):
    pass

print(Foo(id=1, name=2, age=3, spam=4))

print(Foo(1, 2, 3, 4, 5))
print(Foo(1, age=3, name=2, eggs=4))

I've uploaded a test, replacing the dict comprehension with dict() on a genexpr for 2.6 compatibility (2.6 is the earliest version with namedtuple), but without the args truncating. It works with positional, keyword, and mixed args, including out-of-order keywords, in CPython 2.6.7, 2.7.2, 2.7.5, 3.2.3, 3.3.0, and 3.3.1, PyPy 1.9.0 and 2.0b1, and Jython 2.7b.
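Tying this back to the original use case, here is a rough sketch of feeding dictionary rows into several lenient classes. The `rows` list stands in for the rows a dict cursor would yield from `SELECT *`, and its values are invented; `__slots__ = ()` is added to the subclasses on the assumption that per-instance memory matters, as the question suggests.

```python
from collections import namedtuple


def LenientNamedTuple(name, fields):
    class Wrapper(namedtuple(name, fields)):
        __slots__ = ()

        def __new__(cls, *args, **kwargs):
            args = args[:len(fields)]
            kwargs = {k: v for k, v in kwargs.items() if k in fields}
            return super(Wrapper, cls).__new__(cls, *args, **kwargs)
    return Wrapper


class Foo(LenientNamedTuple('Foo', ['id', 'name', 'age'])):
    __slots__ = ()  # avoid a per-instance __dict__


class Bar(LenientNamedTuple('Bar', ['id', 'address', 'city', 'state'])):
    __slots__ = ()


# Stand-in for the dict rows a cursor would return; values are made up.
rows = [
    {'id': 1, 'name': 'Ann', 'age': 30,
     'address': '1 Main St', 'city': 'Springfield', 'state': 'IL'},
]

results = []
for row in rows:
    # Each class picks out only the columns it declares.
    results.append((Foo(**row), Bar(**row)))

print(results[0][0])  # Foo(id=1, name='Ann', age=30)
print(results[0][1])  # Bar(id=1, address='1 Main St', city='Springfield', state='IL')
```

A row that lacks one of a class's fields still raises `TypeError`, since the filtering only drops extras and does not supply defaults.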
