NumPy "record array" or "structured array" or "recarray"
Question
What, if any, is the difference between a NumPy "structured array", a "record array" and a "recarray"?
The NumPy docs imply that the first two are the same. If they are, which is the preferred term for this object?
The same documentation says (at the bottom of the page): "You can find some more information on recarrays and structured arrays (including the difference between the two) here." Is there a simple explanation of this difference?
Recommended answer
Records and recarrays are implemented in
https://github.com/numpy/numpy/blob/master/numpy/core/records.py
Some relevant quotes from this file:
Record Arrays
Record arrays expose the fields of structured arrays as properties. The recarray is almost identical to a standard array (which supports named fields already). The biggest difference is that it can use attribute-lookup to find the fields and it is constructed using a record.
recarray is a subclass of ndarray (in the same way that matrix and masked arrays are). But note that its constructor is different from np.array. It is more like np.empty(size, dtype).
class recarray(ndarray):
    """Construct an ndarray that allows field access using attributes.

    This constructor can be compared to ``empty``: it creates a new record
    array but does not fill it with data.
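To make the contrast concrete, here is a minimal sketch of the two access styles and of the empty-like constructor. The field names (id, score) and the values are made up for illustration; they do not come from the NumPy source.

```python
import numpy as np

# Hypothetical dtype used only for this illustration.
dt = np.dtype([("id", "i4"), ("score", "f8")])

# Structured array: built from data, fields reached by indexing with the name.
people = np.array([(1, 1.5), (2, 2.5)], dtype=dt)
scores_by_index = people["score"]

# recarray: the constructor takes a shape and a dtype, like np.empty,
# so the contents are uninitialized until they are assigned.
rec = np.recarray((2,), dtype=dt)
rec[:] = people

# Same field, now also reachable as an attribute.
scores_by_attr = rec.score

print(isinstance(rec, np.ndarray))   # recarray is an ndarray subclass
print(hasattr(people, "score"))      # a plain structured array has no such attribute
```

Note that people.score would raise AttributeError, while people["score"] and rec["score"] both work.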
The key function for implementing the unique field-as-attribute behavior is __getattribute__ (__getitem__ implements indexing):
def __getattribute__(self, attr):
    # See if ndarray has this attr, and return it if so. (note that this
    # means a field with the same name as an ndarray attr cannot be
    # accessed by attribute).
    try:
        return object.__getattribute__(self, attr)
    except AttributeError:  # attr must be a fieldname
        pass

    # look for a field with this name
    fielddict = ndarray.__getattribute__(self, 'dtype').fields
    try:
        res = fielddict[attr][:2]
    except (TypeError, KeyError):
        raise AttributeError("recarray has no attribute %s" % attr)
    obj = self.getfield(*res)

    # At this point obj will always be a recarray, since (see
    # PyArray_GetField) the type of obj is inherited. Next, if obj.dtype is
    # non-structured, convert it to an ndarray. If obj is structured leave
    # it as a recarray, but make sure to convert to the same dtype.type (eg
    # to preserve numpy.record type if present), since nested structured
    # fields do not inherit type.
    if obj.dtype.fields:
        return obj.view(dtype=(self.dtype.type, obj.dtype.fields))
    else:
        return obj.view(ndarray)
It first tries to get a regular attribute: things like .shape, .strides, .data, as well as all the methods (.sum, .reshape, etc.). Failing that, it looks up the name in the dtype field names. So it is really just a structured array with some redefined access methods.
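The lookup order described above has a visible consequence: a field whose name collides with an ndarray attribute is shadowed when accessed as an attribute. A small sketch, assuming a made-up dtype with one field deliberately named shape:

```python
import numpy as np

# Hypothetical dtype: the field "shape" deliberately collides with ndarray.shape.
dt = np.dtype([("shape", "i4"), ("weight", "f8")])
rec = np.zeros(3, dtype=dt).view(np.recarray)
rec["shape"] = [10, 20, 30]

regular_attr = rec.shape       # the ndarray attribute wins, so this is the array shape
field_by_attr = rec.weight     # no such ndarray attribute, so the field lookup is used
field_by_index = rec["shape"]  # indexing still reaches the shadowed field

print(regular_attr)
print(field_by_index)
```

A name that is neither an ndarray attribute nor a field raises AttributeError, as the except branch in the quoted source suggests.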
As best I can tell, record array and recarray are the same.
Another file shows something of the history:
https://github.com/numpy/numpy/blob/master/numpy/lib/recfunctions.py
Collection of utilities to manipulate structured arrays. Most of these functions were initially implemented by John Hunter for matplotlib. They have been rewritten and extended for convenience.
Many of the functions in this file end with:
if asrecarray:
    output = output.view(recarray)
The fact that you can return an array as a recarray view shows how 'thin' this layer is.
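As a rough illustration of how thin the layer is, the sketch below views a structured array as a recarray and back again without copying any data. The field names a and b are arbitrary.

```python
import numpy as np

# Arbitrary example data; only the view calls matter here.
arr = np.array([(1, 2.0), (3, 4.0)], dtype=[("a", "i4"), ("b", "f8")])

rec = arr.view(np.recarray)    # same buffer, attribute access added
rec.a[0] = 99                  # writing through the view changes arr as well

back = rec.view(np.ndarray)    # and back to a plain ndarray, again without a copy

print(np.shares_memory(arr, rec))
print(arr["a"])
```

Since both directions are just views, switching between the two interfaces costs essentially nothing beyond creating a new array object.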
numpy has a long history and merges several independent projects. My impression is that recarray is an older idea, and that structured arrays are the current implementation, built on a generalized dtype. recarrays seem to be kept more for convenience and backward compatibility than for any new development. But I'd have to study the github file history, and any recent issues/pull requests, to be sure.