Problems splitting data frame into a nested list


Question

I am new to R and I have a problem splitting a very large data frame into a nested list. I looked for help on the internet, but without success.

Here is a simplified example of how my data are organized:

The headers are:

1. "station" (number)
2. "date.str" (date string)
3. "member"
4. "forecast.time"
5. "data1"

I am not sure my data example will display correctly, but if it does, it looks like this:

station  date.str  member  forecast.time  data1
6019     20110805  mbr000  06             77
6031     20110805  mbr000  06             28
6071     20110805  mbr000  06             45
6019     20110805  mbr001  12             22
6019     20110806  mbr024  18             66
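
For anyone who wants to reproduce the problem, the five sample rows can be rebuilt as a data frame roughly as follows. This is only a sketch: the column types are an assumption (date.str is kept as a character string, forecast.time and data1 as numbers), since the original file and the way it was read in are not shown.

```r
# Rebuild the five sample rows from the question.
# Column types are an assumption; the real data may have been read differently.
mydata <- data.frame(
  station       = c(6019, 6031, 6071, 6019, 6019),
  date.str      = c("20110805", "20110805", "20110805", "20110805", "20110806"),
  member        = c("mbr000", "mbr000", "mbr000", "mbr001", "mbr024"),
  forecast.time = c(6, 6, 6, 12, 18),
  data1         = c(77, 28, 45, 22, 66),
  stringsAsFactors = FALSE
)

# Inspect the structure to confirm the columns and their types.
str(mydata)
```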

I want to split the large data frame into a nested list by "station", "member", "date.str" and "forecast.time", so that mylist[[c(s,m,d,t)]] contains a data frame with the data for station "s", member "m", date.str "d" and forecast time "t", preserving the values of s, m, d and t.

data.st <- list()
data.st.member <- list()
data.st.member.dato <- list()

data.st <- split(mydata, mydata$station)
data.st.member <- lapply(data.st, FUN = fsplit.member)

(I created a function, fsplit.member, to split by "member".)

# Loop over station number:
for (s in 1:S) {

  # Loop over members:
  for (m in 1:length(members)) {
    tmp <- split(data.st.member[[s]][[m]], data.st.member[[s]][[m]]$date.str)

    # Loop over number of different "date.str"s
    for (t in 1:length(no.date.str)) {
      data.st.member.dato[[s]][[m]][[t]] <- tmp
    }
  } # end m loop
} # end s loop

I would also like to split by the forecast time (forecast.time), but I didn't get that far.

I have tried a couple of different configurations within the loops, so at the moment I don't have a consistent error message. I can't figure out what I am doing or thinking wrong.

Any help is much appreciated!

Regards, Sisse

Recommended answer

I also want to echo the others: this recursive data structure is going to be difficult to work with, and there are probably better ways. Do look at the split-apply-combine approach, as Richie suggested. However, the constraints may be external, so here is an answer using the plyr library.

library(plyr)
mylist <- dlply(mydata, .(station), dlply, .(member), dlply, .(date.str), dlply, .(forecast.time), identity)

Using the snippet of data you gave for mydata,

> mylist[[c("6019","mbr000","20110805","6")]]
  station date.str member forecast.time data1
1    6019 20110805 mbr000             6    77
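
If plyr is not available, the same nesting can probably be built with base R alone by applying split() recursively. The sketch below is not part of the original answer: nested_split is a hypothetical helper name, and mydata is rebuilt from the question's sample rows with assumed column types. It also shows the flat alternative, where a single split() over all four columns gives one list level with combined names, which is usually easier to loop over than a four-level list.

```r
# Sample rows from the question; column types are an assumption.
mydata <- data.frame(
  station       = c(6019, 6031, 6071, 6019, 6019),
  date.str      = c("20110805", "20110805", "20110805", "20110805", "20110806"),
  member        = c("mbr000", "mbr000", "mbr000", "mbr001", "mbr024"),
  forecast.time = c(6, 6, 6, 12, 18),
  data1         = c(77, 28, 45, 22, 66),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not from base R or plyr): split `df` by the first
# column named in `keys`, then recurse into each piece with the remaining keys.
nested_split <- function(df, keys) {
  if (length(keys) == 0) {
    return(df)  # no keys left: this piece is a leaf data frame
  }
  pieces <- split(df, df[[keys[1]]], drop = TRUE)
  lapply(pieces, nested_split, keys = keys[-1])
}

keys <- c("station", "member", "date.str", "forecast.time")

# Nested list: station -> member -> date.str -> forecast.time -> data frame.
mylist2 <- nested_split(mydata, keys)
mylist2[[c("6019", "mbr000", "20110805", "6")]]

# Flat alternative: one level, with names joined by "." across the four keys.
flat <- split(mydata, mydata[keys], drop = TRUE)
names(flat)
```

The list names are the key values converted to character, so a numeric forecast.time of 6 is looked up as "6", matching the lookup in the plyr example.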
