What is the difference between a list and a pairlist in R?


Problem description

In reading the documentation for lists, I found references to pairlists, but it wasn't clear to me how they differ from lists.

Recommended answer

Pairlists in day-to-day R

Pairlists commonly show up in two places in day-to-day R. One is as function formals:

str(formals(var))
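
As a quick check of the claim above, the following sketch inspects the formals of `var` (the variance function from the stats package, which is attached by default). The variable name `fm` is only illustrative.

```r
# formals() returns the argument list of a closure as a pairlist
fm <- formals(var)

typeof(fm)           # "pairlist"
is.pairlist(fm)      # TRUE
names(fm)            # the argument names of var

# as.list() coerces the pairlist into an ordinary list
typeof(as.list(fm))  # "list"
```

The coercion with `as.list()` is what most code does before working with formals, which is part of why the pairlist nature usually goes unnoticed.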

The other is as language objects. For example:

quote(1 + 1)

produces a pairlist of type language (LANGSXP internally). The main reason to be aware of this is that operations such as length(<language object>) or language_object[[x]] can be slow because of how pairlists are stored internally (though long pairlist language objects are somewhat rare; note that expressions are not pairlists).
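
A small sketch of what this looks like from the R prompt. The name `e` is illustrative; everything else is base R.

```r
e <- quote(1 + 1)

typeof(e)    # "language"
class(e)     # "call"
length(e)    # 3: the function `+` plus its two arguments
e[[1]]       # the symbol `+`

# as.list() turns the call into an ordinary list with the same elements
typeof(as.list(e))          # "list"

# expression() vectors are a different type and are not pairlists
typeof(expression(1 + 1))   # "expression"
```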

Note that empty elements (such as a formal argument with no default value) are just zero-length symbols, and you can actually store them in lists if you cheat a bit (though you probably shouldn't do this):

list(x=substitute(x, alist(x=)))  # hack alert

All that said, OP is for the most part correct: you don't need to worry about pairlists too much unless you are writing C code for use in R.

Pairlists and lists differ principally in their storage structure. Pairlists are stored as a chain of nodes, where each node points to the location of the next node in addition to the node's contents and the node's "name" (see the CAR/CDR wiki article for a generic discussion). Among other things, this means you can't know how many elements a pairlist has unless you know which element is the first one and then traverse the entire chain.
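
The node chain can be imitated in plain R to make the cost of traversal concrete. This is only a toy emulation: R's real pairlists are implemented in C, and the helpers `cons`, `chain_length`, and `chain_nth` below are hypothetical names invented for this sketch.

```r
# Hypothetical cons cell: a value (car), a name (tag),
# and a link to the next cell (cdr); NULL marks the end of the chain
cons <- function(car, cdr = NULL, tag = "") {
  list(car = car, tag = tag, cdr = cdr)
}

# Counting elements means walking from the head to the terminating NULL
chain_length <- function(node) {
  n <- 0L
  while (!is.null(node)) {
    n <- n + 1L
    node <- node$cdr
  }
  n
}

# Reaching the i-th element takes i - 1 hops along the chain
chain_nth <- function(node, i) {
  for (k in seq_len(i - 1L)) node <- node$cdr
  node$car
}

chain <- cons(1, cons("b", cons(TRUE, tag = "z"), tag = "y"), tag = "x")
chain_length(chain)  # 3
chain_nth(chain, 2)  # "b"
```

Both helpers need the head of the chain as their starting point, and both do work proportional to the position or length involved.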

Pairlists are used extensively in the R internals and do exist in normal R use, but most of the time they are disguised by the print or access methods and/or coerced to lists when accessed.

Lists are also a list of addresses, but unlike pairlists, all the addresses are stored in one contiguous memory location and the total length is tracked. This makes it easy to access any arbitrary member of the list by position, since you can just look up the address in the memory table. With a pairlist, you have to jump from node to node until you eventually reach the desired node. Names are also stored as an attribute of the list proper, instead of being attached to each node as in a pairlist.
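
From the R prompt the two types behave almost identically, which is why the difference is easy to miss. A minimal sketch, with `l` and `p` as illustrative names:

```r
l <- list(a = 1, b = "two", c = TRUE)
p <- as.pairlist(l)

typeof(l)   # "list"
typeof(p)   # "pairlist"

# Positional access has the same syntax for both; the difference
# described above is in how the element is located internally
l[[3]]
p[[3]]

# Both report the same names, although a list keeps them in a
# "names" attribute rather than on each node
names(p)
names(attributes(l))
```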

One (generally small) benefit of pairlists is that you can add to them with minimal overhead, since you need to modify at most two nodes (the node ahead of the new node, and the new node itself), whereas with a list you may need to re-allocate the entire address table as it grows (this is typically not much of an issue, since the address table is usually very small compared to the data it points to). There are also many algorithms that specialize in pairlist manipulation (e.g. sorting, indexing, etc.), but those can be ported to normal lists as well.

From a programming perspective, it is also very easy to modify a list by changing what any arbitrary element points to, though this is less relevant for day-to-day use since you can only do it in the internals.

Loosely related to the above, pairlists are likely to be more efficient when you have highly nested objects. Lists can easily replicate this structure, but each list and nested list will be saddled with its own extra memory address table. This is likely the reason pairlists are used for language objects, which very likely have a high nesting-to-element ratio.

For more details see R Internals (look for LISTSXP and VECSXP, which are pairlists and lists respectively).

Edit: interestingly, an experiment comparing the memory footprint of a list to that of a pairlist shows the pairlist to be larger, so the storage efficiency argument may be incorrect (I am not sure whether object.size can be trusted here):

> plist_to_list <- function(x) {
+   if(is.call(x)) x <- as.list(x)
+   if(length(x) > 1) for(i in 2:length(x)) x[[i]] <- Recall(x[[i]])
+   x
+ }
> add_quote <- function(x, y) call("+", x, y)
> x <- Reduce(add_quote, lapply(letters, as.name))
> object.size(x)
7056 bytes
> y <- plist_to_list(x)
> object.size(y)
4656 bytes
