Problems splitting data frame into a nested list

Question

I am new to R and am having trouble splitting a very large data frame into a nested list. I looked for help online, but without success.

Here is a simplified example of how my data are organized.

The column headers are:

1. "station" (number)
2. "date.str" (date string)
3. "member"
4. "forecast.time"
5. "data1"

I am not sure my data example will display correctly, but it should look like this:

station  date.str  member  forecast.time  data1
6019     20110805  mbr000  06             77
6031     20110805  mbr000  06             28
6071     20110805  mbr000  06             45
6019     20110805  mbr001  12             22
6019     20110806  mbr024  18             66
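For anyone who wants to reproduce the example, the rows above can be read into a data frame roughly as follows. This is only a sketch: the name mydata is taken from the code later in the question, and the column types are whatever read.table infers by default.

```r
# Build the example data frame from the rows shown in the question.
mydata <- read.table(header = TRUE, text = "
station date.str member forecast.time data1
6019 20110805 mbr000 06 77
6031 20110805 mbr000 06 28
6071 20110805 mbr000 06 45
6019 20110805 mbr001 12 22
6019 20110806 mbr024 18 66
")

# Inspect the inferred column types.
str(mydata)
```

With the defaults, forecast.time is read as a number, so "06" becomes 6. If the leading zero matters, passing colClasses = c(forecast.time = "character") to read.table should keep that column as text.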

I want to split the large data frame into a nested list by "station", "member", "date.str" and "forecast.time", so that mylist[[c(s,m,d,t)]] contains a data frame with the data for station "s", member "m", date.str "d" and forecast time "t", preserving the values of s, m, d and t.

data.st <- list()
data.st.member <- list()
data.st.member.dato <- list()

# Split by station, then split each station's data by member
data.st <- split(mydata, mydata$station)
data.st.member <- lapply(data.st, FUN = fsplit.member)

(fsplit.member is a function I wrote to split by "member".)
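The question does not show fsplit.member itself. One guess at what such a helper might look like, purely as an assumption for illustration, is a thin wrapper around split:

```r
# Example data from the question.
mydata <- read.table(header = TRUE, text = "
station date.str member forecast.time data1
6019 20110805 mbr000 06 77
6031 20110805 mbr000 06 28
6071 20110805 mbr000 06 45
6019 20110805 mbr001 12 22
6019 20110806 mbr024 18 66
")

# Hypothetical definition: the asker's real fsplit.member is not shown.
# drop = TRUE omits members that do not occur for a given station.
fsplit.member <- function(df) split(df, df$member, drop = TRUE)

data.st <- split(mydata, mydata$station)
data.st.member <- lapply(data.st, FUN = fsplit.member)

# Members available for station 6019.
names(data.st.member[["6019"]])
```

Because split names each piece after its grouping value, the result can be indexed by name (for example data.st.member[["6019"]][["mbr001"]]) rather than by position.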

# Loop over station number:
for (s in 1:S) {

  # Loop over members:
  for (m in 1:length(members)) {
    tmp <- split(data.st.member[[s]][[m]], data.st.member[[s]][[m]]$date.str)

    # Loop over the number of different "date.str" values
    for (t in 1:length(no.date.str)) {
      data.st.member.dato[[s]][[m]][[t]] <- tmp
    }
  } # end m loop
} # end s loop

I would also like to split by forecast time (forecast.time), but I have not got that far.

I have tried a couple of different configurations within the loops, so at the moment I don't have a consistent error message. I can't figure out what I am doing or thinking wrong.

Thank you very much for your help!

Regards,
Sisse

Recommended answer

I want to echo the others: this recursive data structure is going to be difficult to work with, and there are probably better ways. Do look at the split-apply-combine approach, as Richie suggested. However, the constraints may be external, so here is an answer using the plyr library.
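As a rough illustration of the flatter split-apply-combine style mentioned above (not the code from Richie's answer, which is not reproduced here), base R can split on all four columns at once and then apply a function to each piece. The choice of mean as the summary is just an assumption for the example.

```r
# Example data from the question.
mydata <- read.table(header = TRUE, text = "
station date.str member forecast.time data1
6019 20110805 mbr000 06 77
6031 20110805 mbr000 06 28
6071 20110805 mbr000 06 45
6019 20110805 mbr001 12 22
6019 20110806 mbr024 18 66
")

# Split: one flat list entry per observed combination of the four keys.
# drop = TRUE discards combinations that never occur in the data.
flat <- split(
  mydata,
  list(mydata$station, mydata$member, mydata$date.str, mydata$forecast.time),
  drop = TRUE
)

# Apply and combine: one summary value per piece (mean chosen for illustration).
means <- sapply(flat, function(piece) mean(piece$data1))

# Pieces are named by joining the key values with ".".
flat[["6019.mbr000.20110805.6"]]
```

A flat list keyed by the combined name avoids four levels of nesting while still giving direct access to each station/member/date/time subset.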

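library(plyr)

# Each dlply call splits by one column and passes the remaining arguments on to the next dlply
mylist <- dlply(mydata, .(station), dlply, .(member), dlply, .(date.str), dlply, .(forecast.time), identity)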

Using the snippet of data you gave for mydata:

> mylist[[c("6019","mbr000","20110805","6")]]
  station date.str member forecast.time data1
1    6019 20110805 mbr000             6    77
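If plyr is not available, a similar nested list can be sketched in base R with a small recursive helper. The function name nested_split is made up for this example, and the snippet assumes forecast.time was read as a number, so the key is "6" rather than "06".

```r
# Example data from the question.
mydata <- read.table(header = TRUE, text = "
station date.str member forecast.time data1
6019 20110805 mbr000 06 77
6031 20110805 mbr000 06 28
6071 20110805 mbr000 06 45
6019 20110805 mbr001 12 22
6019 20110806 mbr024 18 66
")

# Hypothetical helper: split by the first column in vars, then recurse
# into each piece with the remaining columns.
nested_split <- function(df, vars) {
  if (length(vars) == 0) return(df)
  pieces <- split(df, df[[vars[1]]], drop = TRUE)
  lapply(pieces, nested_split, vars = vars[-1])
}

mylist <- nested_split(mydata, c("station", "member", "date.str", "forecast.time"))

# Recursive indexing with a character vector walks down one level per element.
mylist[[c("6019", "mbr000", "20110805", "6")]]
```

Each leaf is an ordinary data frame that still contains the station, member, date.str and forecast.time columns, so the key values are preserved alongside the data.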
