How to use the apply family instead of a nested for loop for my problem
Problem description
I want to fill a new data frame called hd5 based on conditions from an old data frame called dfnew1.
Can I do it without a nested for loop?
for (j in 2:length(hd6)) {
  for (i in 1:length(hd5$DATE)) {
    abcd = dfnew1 %>%
      filter((Date == hd5$DATE[i]), (StrikePrice == hd6[j]), (OptionType == "CE")) %>%
      arrange(dte)
    hd5[i, j] = abcd[1, 9]
  }
}
hd6 = c(13900, 14000, 14100, 14200)
dfnew1 looks like this:
Date expiry optiontype strikeprice closeprice dte
1/1/2019 31/1/2019 ce 13900 700 30
1/1/2019 31/1/2019 ce 14000 650 30
1/1/2019 31/1/2019 ce 14100 600 30
1/1/2019 31/2/2019 ce 14100 900 58
1/2/2019 31/1/2019 ce 13900 800 29
1/2/2019 31/1/2019 ce 14000 750 29
1/2/2019 31/1/2019 ce 14100 700 29
I want to fill my new data frame hd5 from dfnew1 by matching the date, strikeprice, and optiontype.
The hd5 I want to fill should look like this:
Date 13900 14000 14100 14200
1/1/2019 700 650 600 550
1/2/2019 800 750 700 650
Recommended answer
Here's a tidyverse option:
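library(dplyr)
# tidyr is called with the tidyr:: prefix below, so attaching it is optional
# library(tidyr)
dat %>%
  # one group per Date/strikeprice pair
  group_by(Date, strikeprice) %>%
  # keep the lowest closeprice when a pair has several rows
  summarize(closeprice = min(closeprice)) %>%
  ungroup() %>%
  # spread the strike prices out into columns
  tidyr::pivot_wider(names_from = "strikeprice", values_from = "closeprice")
# # A tibble: 2 x 4
#   Date     `13900` `14000` `14100`
#   <chr>      <int>   <int>   <int>
# 1 1/1/2019     700     650     600
# 2 1/2/2019     800     750     700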
(You might see online tutorials referencing tidyr::spread. It does effectively the same thing here, but it has been retired, along with tidyr::gather (source: https://tidyr.tidyverse.org/reference/spread.html), so new code should generally use the pivot_* functions.)
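Since the question asks about the apply family, the same table can also be sketched in base R with tapply. This is only an illustration on the sample data from this answer; the name wide is made up for the example, and the result is a matrix rather than a data frame.

```r
# Sample data copied from the "Data" section of this answer.
dat <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
Date expiry optiontype strikeprice closeprice dte
1/1/2019 31/1/2019 ce 13900 700 30
1/1/2019 31/1/2019 ce 14000 650 30
1/1/2019 31/1/2019 ce 14100 600 30
1/1/2019 31/2/2019 ce 14100 900 58
1/2/2019 31/1/2019 ce 13900 800 29
1/2/2019 31/1/2019 ce 14000 750 29
1/2/2019 31/1/2019 ce 14100 700 29")

# tapply() splits closeprice by every Date/strikeprice pair and applies min()
# to each piece. Pairs that have no rows would come back as NA.
wide <- with(dat, tapply(closeprice, list(Date = Date, strike = strikeprice), min))
wide
```

Rows are dates and columns are strike prices, so wide["1/1/2019", "14100"] looks up a single cell. A 14200 column does not appear because the sample data has no rows for that strike.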
Note: based on your expected output, it looks like you took the minimum for
1/1/2019 31/1/2019 ce 14100 600 30
1/1/2019 31/2/2019 ce 14100 900 58
I might be more inclined (when "price" is involved) to use sum instead, but it depends heavily on your actual intent and use. Replace min with your aggregation of choice, be it max, sum, or something else.
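As one example of "something else": the loop in the question sorts by dte and keeps the first row, so the intent may be "the closeprice of the nearest expiry" rather than a true minimum. The following is a sketch under that assumption; the name nearest is made up for the example, and slice_min requires dplyr 1.0.0 or later.

```r
library(dplyr)

# Sample data copied from the "Data" section of this answer.
dat <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
Date expiry optiontype strikeprice closeprice dte
1/1/2019 31/1/2019 ce 13900 700 30
1/1/2019 31/1/2019 ce 14000 650 30
1/1/2019 31/1/2019 ce 14100 600 30
1/1/2019 31/2/2019 ce 14100 900 58
1/2/2019 31/1/2019 ce 13900 800 29
1/2/2019 31/1/2019 ce 14000 750 29
1/2/2019 31/1/2019 ce 14100 700 29")

nearest <- dat %>%
  filter(optiontype == "ce") %>%
  group_by(Date, strikeprice) %>%
  # Assumption: keep the single row with the smallest dte in each group.
  slice_min(dte, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  select(Date, strikeprice, closeprice) %>%
  tidyr::pivot_wider(names_from = strikeprice, values_from = closeprice)
nearest
```

On this sample the result matches the min version, because the row with the smaller dte also happens to have the lower closeprice; on real data the two can differ.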
I'll note that having numeric column names is a little non-standard and can cause confusion: dat[, 14100] will fail because a bare number is read as a column position, whereas dat$`14100` or dat[, "14100"] should generally work.
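A small base R sketch of those access patterns, using a hand-built two-row data frame (the names wide, by_string, by_dollar, and by_number are made up for the example):

```r
# check.names = FALSE keeps "14100" as the column name instead of "X14100".
wide <- data.frame(Date = c("1/1/2019", "1/2/2019"),
                   `14100` = c(600, 700),
                   check.names = FALSE)

# Quoting the name as a string works.
by_string <- wide[, "14100"]

# Backticks work with $, where a column name is expected.
by_dollar <- wide$`14100`

# A bare number is treated as a column position, and there is no column 14100.
by_number <- tryCatch(wide[, 14100], error = function(e) "error")
```

Here by_string and by_dollar both hold the two prices, while by_number holds the string "error" because the positional lookup fails.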
You may find that numeric column headers make sense for some comparisons and for depicting a table, but if you plan on plotting things (e.g., using ggplot2), a longer format (your original layout, albeit summarized) is often preferred.
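A sketch of that longer, summarized layout: it is the same pipeline as the answer's, stopped before the pivot. The name long is made up for the example, and the .groups argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Sample data copied from the "Data" section of this answer.
dat <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
Date expiry optiontype strikeprice closeprice dte
1/1/2019 31/1/2019 ce 13900 700 30
1/1/2019 31/1/2019 ce 14000 650 30
1/1/2019 31/1/2019 ce 14100 600 30
1/1/2019 31/2/2019 ce 14100 900 58
1/2/2019 31/1/2019 ce 13900 800 29
1/2/2019 31/1/2019 ce 14000 750 29
1/2/2019 31/1/2019 ce 14100 700 29")

# One row per Date/strikeprice pair; strikeprice stays an ordinary numeric
# column, so it can be mapped to an axis or a colour when plotting.
long <- dat %>%
  group_by(Date, strikeprice) %>%
  summarize(closeprice = min(closeprice), .groups = "drop")
long
```

In this shape, a plotting call can refer to strikeprice and closeprice by name instead of juggling one column per strike.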
Data:
dat <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
Date expiry optiontype strikeprice closeprice dte
1/1/2019 31/1/2019 ce 13900 700 30
1/1/2019 31/1/2019 ce 14000 650 30
1/1/2019 31/1/2019 ce 14100 600 30
1/1/2019 31/2/2019 ce 14100 900 58
1/2/2019 31/1/2019 ce 13900 800 29
1/2/2019 31/1/2019 ce 14000 750 29
1/2/2019 31/1/2019 ce 14100 700 29")