Using a list to store results of a double loop (for-loop) in R

Question

I want to make calculations for elements of individual rows using a for-loop. I have two data.frames

  1. df: contains the data for the stocks on all trading days
  2. events: contains the data for the stocks on event days only

Even though there might be a much easier approach for this specific example, I’d like to know how to do such a task with a loop in a loop (for-loops).

First, my data.frames:

comp1 <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3)
date1 <- c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5)
ret <- c(1.2,2.2,-0.5,0.98,0.73,-1.3,-0.02,0.3,1.1,2.0,1.9,-0.98,1.45,1.71,0.03)
df <- data.frame(comp1,date1,ret)
comp2 <- c(1,1,2,2,2,3,3)
date2 <- c(2,4,1,2,5,4,5)
q <- paste("")
events <- data.frame(comp2,date2,q)

df

#    comp1 date1   ret
# 1      1     1  1.20
# 2      1     2  2.20
# 3      1     3 -0.50
# 4      1     4  0.98
# 5      1     5  0.73
# 6      2     1 -1.30
# 7      2     2 -0.02
# 8      2     3  0.30
# 9      2     4  1.10
# 10     2     5  2.00
# 11     3     1  1.90
# 12     3     2 -0.98
# 13     3     3  1.45
# 14     3     4  1.71
# 15     3     5  0.03

events

#   comp2 date2 q
# 1     1     2  
# 2     1     4  
# 3     2     1  
# 4     2     2  
# 5     2     5  
# 6     3     4  
# 7     3     5  

I want to make calculations of df$ret. As an example let's just take 2 * df$ret. The results for each event-day should be stored in mylist. The final output should be the data.frame "events" with a column "q" where I want the results of the calculation to be stored.

# important objects:
companies <- as.vector(unique(df$comp1)) # all the companies (here: 1, 2, 3)
days <- as.vector(unique(df$date1)) # all the trading-days (here: 1, 2, 3, 4, 5)
mylist <- vector('list', length(companies)) # a list where the results should be stored for each company

I came up with some piece of code which doesn't work. But I still think it should look something like this:

for(i in 1:nrow(events)) {
  events_k <- events[which(comp1==companies[i]),] # data of all event days of company i
  df_k <- df[which(comp2==companies[i]),] # data of all trading days of company i

  for(j in 1:nrow(df_k)) {
    events_k[j, "q"] <- df_k[which(days==events_k[j,"date2"]), "ret"] * 2


  }
  mylist[i] <- events_k   
}

I don't understand how to set up the loop inside the other loop and how to store the results in mylist. Any help appreciated!!

Thanks!

Recommended answer

Don't feel bad. All of your problems are common R gotchas. First, try changing

events <- data.frame(comp2,date2,q,stringsAsFactors=FALSE)

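earlier instead. Your column q is being converted to a factor implicitly (the default for string columns in R versions before 4.0.0), which prevents the numeric result of the * 2 calculation from being stored in it later.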
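To see the difference, here is a minimal sketch that sets stringsAsFactors explicitly both ways, so it behaves the same regardless of the R version's default. The names as_factor and as_char are made up for illustration. The last lines also show that the character column stores the number as text, so starting q as a numeric NA (an assumption, not part of the answer above) may be the cleaner choice.

```r
comp2 <- c(1, 1, 2)
date2 <- c(2, 4, 1)
q <- ""

# Same data, built with the two possible settings (names are illustrative)
as_factor <- data.frame(comp2, date2, q, stringsAsFactors = TRUE)
as_char   <- data.frame(comp2, date2, q, stringsAsFactors = FALSE)

class(as_factor$q)  # "factor"
class(as_char$q)    # "character"

# Writing a number into the factor column does not store it:
# R warns about an invalid factor level and generates NA instead.
suppressWarnings(as_factor[1, "q"] <- 4.4)
is.na(as_factor$q[1])  # TRUE

# Writing into the character column works, but the value is kept as text.
as_char[1, "q"] <- 4.4
as_char$q[1]  # "4.4"
```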

Next, let's consider the fixed loop

for(i in 1:nrow(events)) {
  events_k <- events[which(comp1==companies[i]),] # data of all event days of company i
  df_k <- df[which(comp2==companies[i]),] # data of all trading days of company i

  for(j in 1:nrow(df_k)) {
    events_k[j, "q"] <-
      if (0 == length(tmp <- df_k[which(days==events_k[j,"date2"]), "ret"] * 2)) NA
      else tmp
  }
  mylist[[i]] <- events_k
}

Your first problem was on the last line, where you used [ instead of [[. In R, [ always returns a list (a sub-list of the original), whereas [[ accesses the element stored in the list.
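A small sketch of the difference, using a toy data frame called piece (a name made up for this example). Because a data frame is itself a list of columns, assigning it with single brackets into one slot only keeps its first column and raises a warning, which is suppressed here.

```r
mylist <- vector("list", 2)
piece <- data.frame(comp2 = c(1, 1), date2 = c(2, 4))  # illustrative data

# [[ stores the data frame itself in slot 1
mylist[[1]] <- piece

# [ treats the data frame as a list of columns; only the first column
# fits into the single slot, and R warns about the length mismatch
suppressWarnings(mylist[2] <- piece)

is.data.frame(mylist[[1]])  # TRUE
is.data.frame(mylist[[2]])  # FALSE
class(mylist[1])            # "list"

# With [, the value has to be wrapped in a list to be stored whole
mylist[2] <- list(piece)
is.data.frame(mylist[[2]])  # TRUE
```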

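Your second problem is that sometimes which(days==events_k[j,"date2"]) is empty, i.e. integer(0), because there is no matching event date. The right-hand side is then numeric(0), which cannot be assigned to a single cell. With the length check in the loop above the code will run, but you'll still have a lot of data frames containing NAs. To remove those, you could do something like: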

mylist <- Filter(function(df) nrow(df) > 0,
  lapply(mylist, function(df) df[apply(df, 1, function(row) !all(is.na(row))), ]))

which will filter out list elements with empty dataframes, and rows in dataframes with all NA.
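For comparison, here is one possible end-to-end sketch built on the data from the question. It rests on a few assumptions that go beyond the answer: the outer loop runs over the companies rather than over the rows of events, events is filtered by comp2 and df by comp1, q starts as a numeric NA column instead of an empty string, and each company has at most one row per date. The name result is made up for the combined output.

```r
# Data from the question; q is initialised as numeric NA (an assumption)
df <- data.frame(
  comp1 = rep(1:3, each = 5),
  date1 = rep(1:5, times = 3),
  ret   = c(1.2, 2.2, -0.5, 0.98, 0.73, -1.3, -0.02, 0.3, 1.1, 2.0,
            1.9, -0.98, 1.45, 1.71, 0.03)
)
events <- data.frame(
  comp2 = c(1, 1, 2, 2, 2, 3, 3),
  date2 = c(2, 4, 1, 2, 5, 4, 5),
  q     = NA_real_
)

companies <- unique(df$comp1)
mylist <- vector("list", length(companies))  # one slot per company

for (i in seq_along(companies)) {
  events_k <- events[events$comp2 == companies[i], ]  # event days of company i
  df_k     <- df[df$comp1 == companies[i], ]          # trading days of company i

  for (j in seq_len(nrow(events_k))) {
    # return on the event day, doubled; empty if the day is not in df_k
    tmp <- df_k[df_k$date1 == events_k[j, "date2"], "ret"] * 2
    events_k[j, "q"] <- if (length(tmp) == 0) NA_real_ else tmp
  }
  mylist[[i]] <- events_k  # [[ stores the whole data frame
}

# Stack the per-company pieces back into one data frame
result <- do.call(rbind, mylist)
result$q  # 4.40 1.96 -2.60 -0.04 4.00 3.42 0.06
```

With this indexing every event day has a matching trading day, so no NA rows arise for this data; the length check is kept only as a safeguard.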
