数据类与输入.NamedTuple主要用例 [英] Data Classes vs typing.NamedTuple primary use cases

查看:106
本文介绍了数据类与输入.NamedTuple主要用例的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

PEP-557 将数据类引入了Python标准库,基本上可以扮演与collections.namedtupletyping.NamedTuple相同的角色.现在,我想知道如何将用例命名为仍然更好的解决方案分开.

PEP-557 introduced data classes into Python standard library, that basically can fill the same role as collections.namedtuple and typing.NamedTuple. And now I'm wondering how to separate the use cases in which namedtuple is still a better solution.

当然,如果需要的话,所有功劳归于dataclass:

Of course, all the credit goes to dataclass if we need:

  • 可变对象
  • 继承支持
  • property装饰器,可管理的属性
  • 开箱即用生成的方法定义或可自定义的方法定义
  • mutable objects
  • inheritance support
  • property decorators, manageable attributes
  • generated method definitions out of the box or customizable method definitions

在同一PEP中简要说明了数据类的优点:为什么不只使用namedtuple .

Data classes advantages are briefly explained in the same PEP: Why not just use namedtuple.

但是对namedtuples却提出了一个相反的问题:为什么不只使用数据类呢? 从性能的角度来看,我认为namedtuple可能更好,但尚未对此进行确认.

But how about an opposite question for namedtuples: why not just use dataclass? I guess probably namedtuple is better from the performance standpoint but found no confirmation on that yet.

让我们考虑以下情况:

我们将页面尺寸存储在一个小型容器中,该容器具有静态定义的字段,类型提示和命名访问权限.无需进一步的哈希,比较等.

We are going to store pages dimensions in a small container with statically defined fields, type hinting and named access. No further hashing, comparing and so on are needed.

NamedTuple方法:

from typing import NamedTuple

PageDimensions = NamedTuple("PageDimensions", [('width', int), ('height', int)])

DataClass方法:

from dataclasses import dataclass

@dataclass
class PageDimensions:
    width: int
    height: int

哪种解决方案更可取,为什么?

Which solution is preferable and why?

P.S.该问题绝不是以任何方式重复那个人,因为在这里我要问的是案件 strong>其中namedtuple更好,而不是区别(我问过之前已经检查过文档和来源)

P.S. the question isn't a duplicate of that one in any way, because here I'm asking about the cases in which namedtuple is better, not about the difference (I've checked docs and sources before asking)

推荐答案

这取决于您的需求.他们每个人都有自己的利益.

It depends on your needs. Each of them has own benefits.

这是对PyCon 2018上的数据类的很好的解释 Raymond Hettinger-数据类:结束所有代码生成器的代码生成器

Here is a good explanation of Dataclasses on PyCon 2018 Raymond Hettinger - Dataclasses: The code generator to end all code generators

Dataclass中,所有实现都是用Python编写的,与NamedTuple中一样,所有这些行为都是免费的,因为NamedTuple是从tuple继承的.并且 tuple结构是用C编写的,这就是为什么NamedTuple中的标准方法(哈希,比较等)更快的原因.

In Dataclass all implementation is written in Python, as in NamedTuple, all of these behaviors come for free because NamedTuple is inherited from tuple. And tuple structure is written in C, that's why standard methods are faster in NamedTuple (hash, comparing and etc).

但是 Dataclass基于dict ,而 NamedTuple基于tuple .据此,使用这些结构具有优缺点.例如,NamedTuple中的空间使用量较小,而Dataclass中的时间访问速度更快.

But Dataclass is based on dict as NamedTuple based on tuple. According to this, you have advantages and disadvantages of using these structures. For example, space usage is smaller in NamedTuple, but time access is faster in Dataclass.

请看我的实验

In [33]: a = PageDimensionsDC(width=10, height=10)

In [34]: sys.getsizeof(a) + sys.getsizeof(vars(a))
Out[34]: 168

In [35]: %timeit a.width
43.2 ns ± 1.05 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [36]: a = PageDimensionsNT(width=10, height=10)

In [37]: sys.getsizeof(a)
Out[37]: 64

In [38]: %timeit a.width
63.6 ns ± 1.33 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

但是随着NamedTuple属性数量的增加,访问时间仍然很小,因为它会为每个属性创建一个带有该属性名称的属性.例如,对于我们来说,新类的名称空间部分将如下所示:

But with increasing the number of attributes of NamedTuple access time remains the same small, because for each attribute it creates a property with the name of the attribute. For example, for our case the part of the namespace of the new class will look like:

from operator import itemgetter

class_namespace = {
...
    'width': property(itemgetter(0, doc="Alias for field number 0")),
    'height': property(itemgetter(0, doc="Alias for field number 1"))**
}

在哪些情况下,namedtuple仍然是更好的选择?

In which cases namedtuple is still a better choice?

当您的数据结构需要/可以不可变,可哈希化,可迭代,不可打包,可比较时,可以使用NamedTuple .如果您的数据结构需要更复杂的东西(例如,继承的可能性),请使用Dataclass.

When your data structure needs to/can be immutable, hashable, iterable, unpackable, comparable then you can use NamedTuple. If you need something more complicated, for example, a possibility of inheritance for your data structure then use Dataclass.

这篇关于数据类与输入.NamedTuple主要用例的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆