How to use the apply family instead of a nested for loop for my problem
Question
I want to fill a new data frame called hd5 based on conditions from an existing data frame called dfnew1.
Can I do it without a nested for loop?
for (j in 2:length(hd6)) {
  for (i in 1:length(hd5$DATE)) {
    # Rows matching this date, strike price, and option type, nearest expiry first
    abcd <- dfnew1 %>%
      filter(Date == hd5$DATE[i], StrikePrice == hd6[j], OptionType == "CE") %>%
      arrange(dte)
    hd5[i, j] <- abcd[1, 9]
  }
}
hd6 = [13900,14000,14100,14200]
dfnew1 looks like this:
Date expiry optiontype strikeprice closeprice dte
1/1/2019 31/1/2019 ce 13900 700 30
1/1/2019 31/1/2019 ce 14000 650 30
1/1/2019 31/1/2019 ce 14100 600 30
1/1/2019 31/2/2019 ce 14100 900 58
1/2/2019 31/1/2019 ce 13900 800 29
1/2/2019 31/1/2019 ce 14000 750 29
1/2/2019 31/1/2019 ce 14100 700 29
I want to fill my new data frame hd5 from dfnew1 by matching on date, strike price, and option type.
hd5 should look like this:
Date 13900 14000 14100 14200
1/1/2019 700 650 600 550
1/2/2019 800 750 700 650
Recommended answer
Here's a tidyverse option:
library(dplyr)
# library(tidyr)
dat %>%
group_by(Date, strikeprice) %>%
summarize(closeprice = min(closeprice)) %>%
ungroup() %>%
tidyr::pivot_wider(names_from = "strikeprice", values_from = "closeprice")
# # A tibble: 2 x 4
# Date `13900` `14000` `14100`
# <chr> <int> <int> <int>
# 1 1/1/2019 700 650 600
# 2 1/2/2019 800 750 700
(You might see online tutorials referencing tidyr::spread. It does effectively the same thing here, but it has been retired, along with tidyr::gather (source: https://tidyr.tidyverse.org/reference/spread.html), so it is generally recommended that new code use the pivot_* functions.)
Note: based on your expected output, it looks like you took the minimum closeprice for these two rows:
1/1/2019 31/1/2019 ce 14100 600 30
1/1/2019 31/2/2019 ce 14100 900 58
I might be more inclined (when "price" is involved) to use sum instead, but it depends heavily on your actual intent and use. Replace min with your aggregation of choice, be it max, sum, or something else.
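If the intent is what the loop in the question does (sort by dte and take the first row, i.e. the nearest expiry) rather than the smallest price, a sketch along the following lines might be closer. It assumes the lowercase column names and the "ce" value from the sample data, which differ from the StrikePrice, OptionType, and "CE" used in the question's loop, so adjust to your real data.

```r
library(dplyr)

# Sample data copied from the question
dat <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
Date expiry optiontype strikeprice closeprice dte
1/1/2019 31/1/2019 ce 13900 700 30
1/1/2019 31/1/2019 ce 14000 650 30
1/1/2019 31/1/2019 ce 14100 600 30
1/1/2019 31/2/2019 ce 14100 900 58
1/2/2019 31/1/2019 ce 13900 800 29
1/2/2019 31/1/2019 ce 14000 750 29
1/2/2019 31/1/2019 ce 14100 700 29")

nearest <- dat %>%
  filter(optiontype == "ce") %>%        # keep call options only
  arrange(dte) %>%                      # nearest expiry first
  group_by(Date, strikeprice) %>%
  slice(1) %>%                          # first row per date and strike
  ungroup() %>%
  select(Date, strikeprice, closeprice) %>%
  tidyr::pivot_wider(names_from = strikeprice, values_from = closeprice)

nearest
```

On the sample data this selects the same rows as min(closeprice), because the row with the smaller dte also happens to have the lower price; on real data the two approaches can differ.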
I'll note that having numeric column names is a little non-standard and can cause confusion: dat[,14100] will fail because the number is read as a column position, whereas dat$`14100` or dat[,"14100"] should generally work.
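As a small illustration of the point about numeric column names, here is a base R sketch. The data frame wide is a hypothetical stand-in for the pivoted result, built by hand from two of the sample columns.

```r
# Hypothetical wide table; check.names = FALSE keeps the numeric names as-is
wide <- data.frame(Date = c("1/1/2019", "1/2/2019"),
                   `13900` = c(700, 800),
                   `14100` = c(600, 700),
                   check.names = FALSE)

by_string   <- wide[, "14100"]   # column name as a string
by_backtick <- wide$`14100`      # column name quoted with backticks
by_double   <- wide[["14100"]]   # double-bracket access by name

# A bare number is treated as a column position, so this raises an error
res <- tryCatch(wide[, 14100], error = function(e) e)
```

The three name-based forms return the same vector, while the positional form fails because the table has only three columns.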
You may find that numeric column headers make sense for some comparisons and for presenting a table, but if you plan on plotting (e.g., using ggplot2), a longer format (your original layout, after summarizing) is often preferable.
Data:
dat <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
Date expiry optiontype strikeprice closeprice dte
1/1/2019 31/1/2019 ce 13900 700 30
1/1/2019 31/1/2019 ce 14000 650 30
1/1/2019 31/1/2019 ce 14100 600 30
1/1/2019 31/2/2019 ce 14100 900 58
1/2/2019 31/1/2019 ce 13900 800 29
1/2/2019 31/1/2019 ce 14000 750 29
1/2/2019 31/1/2019 ce 14100 700 29")