Unpacking a list in an R dataframe


Question

I have a dataframe in which one field holds lists of varying lengths. I would like to extract each element of the list in this field into its own field, so that I can gather the results into a long dataframe with one row per list element per id.

Here is a sample dataframe:

dat <- structure(list(id = c("509935", "727889", "864607", "1234243", 
        "1020959", "221975"), some_date = c("2/09/1967", "28/04/1976", 
        "22/12/2017", "7/02/2006", "10/03/2019", "21/10/1935"), df_list = list(
            "018084131", c("062197171", "062171593"), c("064601923", 
            "068994009", "069831651"), c("071141584", "073129537"), c("061498574", 
            "065859718", "067251995", "069447806"), "064623976")), class = c("tbl_df", 
        "tbl", "data.frame"), row.names = c(NA, -6L))

I have come up with code that produces the final result I want; however, I have not done it the DRY way. Here is what I have tried.

res_n is the following function:

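library(dplyr)  # mutate(), select(), filter(), %>%
library(purrr)  # map()
library(tidyr)  # gather()

# Return the n-th element of a vector (NA if it has fewer than n elements)
res_n <- function(field, n) {
    field[n]
}

dat <- dat %>% mutate(res1 = map(df_list, res_n, 1))
dat <- dat %>% mutate(res2 = map(df_list, res_n, 2))
dat <- dat %>% mutate(res3 = map(df_list, res_n, 3))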

This returns a data frame with the first three list elements of df_list, each in its own column.

From this I can achieve what I set out to do and produce a final dataframe of results, as follows:

dat_final <- gather(dat, test, labno, -df_list, -some_date, -id) %>% 
    select(-df_list) %>% 
    mutate(labno = as.integer(labno)) %>% 
    filter(!is.na(labno))

To avoid the repetitive (non-DRY) approach above, I resorted to a for loop to try to eliminate the duplicated code. I'm struggling to get it to work in a way that produces the final result. This is the for loop I tried:

 for (i in 3) {
     dat %>% mutate(paste(res, i, sep = '_') = map(results, res_n, i)) }

Can someone help me refine the code to eliminate the repetitive lines that produce the result?

Recommended answer

If the final goal is to get the data in long format, we can use unnest from tidyr:

tidyr::unnest(dat, cols = df_list)

#   id      some_date  df_list  
#   <chr>   <chr>      <chr>    
# 1 509935  2/09/1967  018084131
# 2 727889  28/04/1976 062197171
# 3 727889  28/04/1976 062171593
# 4 864607  22/12/2017 064601923
# 5 864607  22/12/2017 068994009
# 6 864607  22/12/2017 069831651
# 7 1234243 7/02/2006  071141584
# 8 1234243 7/02/2006  073129537
# 9 1020959 10/03/2019 061498574
#10 1020959 10/03/2019 065859718
#11 1020959 10/03/2019 067251995
#12 1020959 10/03/2019 069447806
#13 221975  21/10/1935 064623976
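
To connect the unnest call back to the question, here is a minimal sketch of two things: getting a table shaped like dat_final directly from unnest, and one way the for loop could be written so that the column name is computed. It assumes dplyr, purrr and tidyr are installed; the names dat_long, dat_wide and col are made up for illustration, and the data is the question's sample trimmed to three rows. It is one possible approach, not the only one.

```r
library(dplyr)
library(purrr)
library(tidyr)

# First three rows of the question's sample data (trimmed for brevity)
dat <- tibble(
  id = c("509935", "727889", "864607"),
  some_date = c("2/09/1967", "28/04/1976", "22/12/2017"),
  df_list = list(
    "018084131",
    c("062197171", "062171593"),
    c("064601923", "068994009", "069831651")
  )
)

# Long format in one step: one row per list element.
# as.integer() drops the leading zero, as in the question's own code.
dat_long <- dat %>%
  unnest(cols = df_list) %>%
  mutate(labno = as.integer(df_list)) %>%
  select(id, some_date, labno)

# Loop version of the res1..res3 columns from the question.
# `!!col :=` lets mutate() use a column name held in a string;
# `name = value` only accepts a literal name on the left-hand side.
dat_wide <- dat
for (i in 1:3) {
  col <- paste0("res", i)  # hypothetical column naming: res1, res2, res3
  dat_wide <- dat_wide %>%
    mutate(!!col := map_chr(df_list, function(x) x[i]))
}
```

Note that the loop has to iterate over 1:3 (for (i in 3) runs only once, with i = 3) and has to assign the result back on each pass. Unlike dat_final, dat_long has no test column and keeps the rows grouped by id rather than by element position. The wide columns are only needed if the wide layout is itself wanted; for the long result, unnest alone is enough.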
