How to avoid loops

Question

Hi all, I'm new to R.

I have two panel data files, each with the columns "id", "date" and "ret".

File A has a lot more data than file B, but I'm primarily working with file B.

The combination of "id" and "date" is a unique identifier.

Is there an elegant way to look up, for each (id, date) in B, the past 10 days of ret from file A and store the result back in B?

My naive way of doing it is to loop over all rows in B:

for (i in seq_len(nrow(B))) {
    B$past10d[i] <- prod(1+A$ret[which(A$id == B$id[i] & A$date > B$date[i]-10 & A$date < B$date[i])])-1
}

but the loop takes forever.

Really appreciate your thoughts.

Thank you very much.

Solution

I think the key is to vectorize and use the %in% operator to subset data frame A. I know prices are not random numbers, but I didn't want to code a random walk. I created a stock-date index using paste, but I'm sure you could use the index from pdata.frame in the plm library, which is the best I've found for panel data.

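# Example data: 10 stocks x 100 days; prices are random placeholders
A <- data.frame(stock=rep(1:10, each=100), date=rep(Sys.Date()-99:0, 10), price=rnorm(1000))
# B holds the last observation of each stock
B <- A[seq(from=100, to=1000, by=100), ]
# Build a "stock-date" key as the first column of both data frames
A <- cbind(paste(A$stock, A$date, sep="-"), A)
B <- cbind(paste(B$stock, B$date, sep="-"), B)
colnames(A) <- colnames(B) <- c("index", "stock", "date", "price")
# Rows of A whose key also appears in B
index <- which(A[, 1] %in% B[, 1])
# Return over the previous 10 rows; assumes A is sorted by stock and date
# and that each matched row has at least 10 earlier rows for the same stock
returns <- (A$price[index] - A$price[index-10]) / A$price[index-10]
B <- cbind(B, returns)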
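
For the id/date/ret layout in the question, one loop-free possibility is a merge followed by a grouped product. This is only a sketch in base R, not the answerer's method: the toy data, the names `m` and `past`, and the choice of merge plus aggregate are assumptions made for illustration. The window keeps the strict inequalities from the question's loop.

```r
# Toy data with the question's columns (id, date, ret); values are made up.
set.seed(1)
A <- data.frame(
  id   = rep(1:3, each = 30),
  date = rep(as.Date("2020-01-01") + 0:29, times = 3),
  ret  = rnorm(90, mean = 0, sd = 0.01)
)
B <- A[A$date %in% as.Date(c("2020-01-15", "2020-01-30")), c("id", "date")]

# Pair every B row with all A rows of the same id (one merge, no loop).
# B's date keeps its name; A's date becomes date.A.
m <- merge(B, A, by = "id", suffixes = c("", ".A"))

# Keep only A rows strictly between date - 10 and date, as in the loop.
m <- m[m$date.A > m$date - 10 & m$date.A < m$date, ]

# Compound the returns within each (id, date) group.
past <- aggregate(ret ~ id + date, data = m,
                  FUN = function(r) prod(1 + r) - 1)
names(past)[names(past) == "ret"] <- "past10d"

# Attach the result to B; B rows with no matching history in A get NA.
B <- merge(B, past, by = c("id", "date"), all.x = TRUE)
```

Note that the first merge creates one row per pair of a B row and a same-id A row, so it can use a lot of memory when A is large, and the final merge may reorder the rows of B.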
