Copy-on-modify semantics on a vector do not seem to apply in a loop. Why?

Question

This question seems to be partially answered here, but that answer is not specific enough for me. I would like to better understand when an object is updated by reference and when it is copied.

The simplest example is growing a vector. The following code is extremely inefficient in R because no memory is allocated before the loop, so a copy is made at each iteration.

  x = runif(10)
  y = c() 

  for(i in 2:length(x))
    y = c(y, x[i] - x[i-1])

Preallocating reserves the memory once, so nothing has to be reallocated at each iteration. This code is therefore drastically faster, especially with long vectors.

  x = runif(10)
  y = numeric(length(x))

  for(i in 2:length(x))
    y[i] = x[i] - x[i-1]
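
To make the comparison concrete, here is a minimal sketch that wraps both approaches in functions and times them. The names `grow_diff` and `prealloc_diff` are hypothetical and not part of the original question, and the preallocated version is sized to `length(x) - 1` (unlike the snippet above, which leaves `y[1]` at 0) so that both results can be checked against base R's `diff()`. It assumes `x` has at least two elements.

```r
# Hypothetical helper: grows y by one element per iteration (copies each time).
grow_diff <- function(x) {
  y <- c()
  for (i in 2:length(x)) y <- c(y, x[i] - x[i - 1])
  y
}

# Hypothetical helper: allocates y once, then fills it in.
prealloc_diff <- function(x) {
  y <- numeric(length(x) - 1)
  for (i in 2:length(x)) y[i - 1] <- x[i] - x[i - 1]
  y
}

set.seed(1)
x <- runif(1e4)

# Timings depend on the machine, so no expected values are shown here.
print(system.time(grown <- grow_diff(x)))
print(system.time(filled <- prealloc_diff(x)))

# Both loops compute the same thing as the vectorised diff().
stopifnot(isTRUE(all.equal(grown, filled)),
          isTRUE(all.equal(filled, diff(x))))
```

For this particular computation, `diff(x)` avoids the loop altogether; the loops are kept here only to illustrate the allocation pattern.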

And here comes my question. When a vector is updated, it actually does move: a copy is made, as shown below.

a = 1:10
pryr::tracemem(a)
[1] "<0xf34a268>"
a[1] <- 0L
tracemem[0xf34a268 -> 0x4ab0c3f8]:
a[3] <-0L
tracemem[0x4ab0c3f8 -> 0xf2b0a48]:  
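
For comparison, here is a minimal sketch of a case where a copy is expected whatever the front end: a second name refers to the same vector when it is modified. It uses base R's `tracemem()`, which only works when R was built with memory profiling, hence the `capabilities("profmem")` guard. The variable names are illustrative only.

```r
a <- c(1, 2, 3)
if (capabilities("profmem")) tracemem(a)  # report any duplication of a

b <- a       # a and b now refer to the same underlying vector
a[1] <- 0    # modifying a must not affect b, so R duplicates the vector first

print(a)
print(b)     # b still holds the original values

if (capabilities("profmem")) untracemem(a)
```

The point of the sketch is that the copy is triggered by the existence of another reference to the object, not by the assignment statement itself.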

But in a loop this copy does not occur:

library(pryr)  # provides address()

y = numeric(length(x))
for(i in 2:length(x))
{
   y[i] = x[i] - x[i-1]
   print(address(y))  # address of y after each update
}

This gives:

[1] "0xe849dc0"
[1] "0xe849dc0"
[1] "0xe849dc0"
[1] "0xe849dc0"
[1] "0xe849dc0"
[1] "0xe849dc0"
[1] "0xe849dc0"
[1] "0xe849dc0"
[1] "0xe849dc0" 

I understand why code is slow or fast depending on memory allocation, but I don't understand R's logic. Why and how, for the same statement, is the update made by reference in one case and by copy in the other? In the general case, how can we know what will happen?

Recommended answer

I am completing @MikeH.'s answer with this code:

library(pryr)

x = runif(10)
y = numeric(length(x))
print(c(address(y), refs(y)))

for(i in 2:length(x))
{
  y[i] = x[i] - x[i-1]
  print(c(address(y), refs(y)))
}

print(c(address(y), refs(y)))

The output clearly shows what happened:

[1] "0x7872180" "2"        
[1] "0x765b860" "1"        
[1] "0x765b860" "1"        
[1] "0x765b860" "1"        
[1] "0x765b860" "1"        
[1] "0x765b860" "1"        
[1] "0x765b860" "1"        
[1] "0x765b860" "1"        
[1] "0x765b860" "1"        
[1] "0x765b860" "1" 
[1] "0x765b860" "2"  

A copy is made at the first iteration because RStudio holds a second reference to y (refs is 2). After this first copy, y belongs to the loop and is not available in the global environment, so RStudio does not create any additional reference and no copy is made during the following updates: y is updated by reference. On loop exit, y becomes available in the global environment again. RStudio creates an extra reference (refs goes back to 2), but this obviously does not change the address.
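
One way to check this explanation without the IDE's influence is to run the loop inside a function, where y is a local variable that nothing else refers to. The sketch below is an assumption-laden illustration, not part of the original answer: `loop_addresses` is a hypothetical helper, and it relies on base R's `tracemem()` returning the object's current address as a string, which requires an R build with memory profiling.

```r
# Hypothetical helper: records the address of y after each update in the loop.
loop_addresses <- function(n = 10) {
  x <- runif(n)
  y <- numeric(n)                 # y is local to this call
  addrs <- character(n - 1)
  for (i in 2:n) {
    y[i] <- x[i] - x[i - 1]
    addrs[i - 1] <- tracemem(y)   # current address, formatted like "<0x...>"
  }
  untracemem(y)
  addrs
}

if (capabilities("profmem")) {
  addrs <- loop_addresses()
  # A single distinct address would mean y never moved during the loop.
  print(length(unique(addrs)))
}
```

If the count of distinct addresses is 1, every update happened in place, which is consistent with the idea that copies only occur while something else holds a reference to y.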
