Using tidyr unnest with NULL values


Problem description

I converted a JSON file into a data.frame with a nested list structure, which I would like to unnest and flatten. Some of the values in the list are NULL, which unnest does not accept. If I replace the NULL values with a data.frame structure that has only NA values, I get the desired result.

Below is a simplified example of my problem. I tried to replace the NULL values with the NA data.frame but did not manage to, because of the nested structure. How can I achieve the desired result?

Example

library(tidyr)
input1 <- data.frame(id = c("c", "d", "e"), value = c(7, 8, 9))
input2 <- NULL
input3 <- data.frame(id = c(NA), value = c(NA))

df <- dplyr::tibble(
  a = c(1, 2),
  b = list(a = input1, c = input2))
unnest(df)

gives the error "Error: Each column must either be a list of vectors or a list of data frames [b]"

df2 <- dplyr::tibble(
  a = c(1, 2),
  b = list(a = input1, c = input3))
unnest(df2)

gives the desired output.

Recommended answer

We can use map_lgl from purrr here. If you don't care about the rows that contain a NULL, you can simply remove them with filter and then unnest:

library(tidyverse)

df %>% 
  filter(!map_lgl(b, is.null)) %>% 
  unnest(b)
#> # A tibble: 3 x 3
#>       a     id value
#>   <dbl> <fctr> <dbl>
#> 1     1      c     7
#> 2     1      d     8
#> 3     1      e     9

If you want to keep those rows, you can bring them back with right_join after unnesting:

df %>% 
  filter(!map_lgl(b, is.null)) %>% 
  unnest(b) %>% 
  right_join(select(df, a))
#> Joining, by = "a"
#> # A tibble: 4 x 3
#>       a     id value
#>   <dbl> <fctr> <dbl>
#> 1     1      c     7
#> 2     1      d     8
#> 3     1      e     9
#> 4     2   <NA>    NA
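
On newer tidyr releases the filter and join steps may not be needed at all. The sketch below assumes tidyr 1.0.0 or later, where unnest() takes a cols argument, drops rows whose list element is NULL by default, and has a keep_empty argument that keeps them as a row of NA. It rebuilds a simplified copy of the example data with an unnamed list column (an assumption made here for brevity), so it is not the exact setup from the question.

```r
library(tidyr)

# Simplified copy of the example data; the list column is left unnamed
# and strings are kept as character (both are assumptions of this sketch).
input1 <- data.frame(id = c("c", "d", "e"), value = c(7, 8, 9),
                     stringsAsFactors = FALSE)
df <- dplyr::tibble(
  a = c(1, 2),
  b = list(input1, NULL)
)

# Default behaviour in tidyr >= 1.0.0: the row holding NULL is dropped.
dropped <- unnest(df, cols = b)

# keep_empty = TRUE keeps that row and fills id and value with NA.
kept <- unnest(df, cols = b, keep_empty = TRUE)

print(dropped)
print(kept)
```

If the installed tidyr is older than 1.0.0, keep_empty is not available and the filter plus right_join approach above still applies.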

Data

input1 <- data.frame(id = c("c", "d", "e"), value = c(7, 8, 9))
input2 <- NULL
input3 <- data.frame(id = c(NA), value = c(NA))

df <- dplyr::tibble(
  a = c(1, 2),
  b = list(a = input1, c = input2)
)  
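
The replacement the question describes, swapping each NULL for a one-row NA data frame, can also be done inside the list column with purrr::map before unnesting. This is a minimal sketch rather than part of the original answer: na_row is a hypothetical helper name, the list column is left unnamed, and stringsAsFactors = FALSE is set so that id is a character column regardless of R version.

```r
library(tidyr)
library(purrr)

input1 <- data.frame(id = c("c", "d", "e"), value = c(7, 8, 9),
                     stringsAsFactors = FALSE)
df <- dplyr::tibble(
  a = c(1, 2),
  b = list(input1, NULL)
)

# Hypothetical one-row placeholder with the same columns as input1.
na_row <- data.frame(id = NA_character_, value = NA_real_,
                     stringsAsFactors = FALSE)

result <- df %>%
  # Replace every NULL element of the list column with the placeholder.
  dplyr::mutate(b = map(b, function(x) if (is.null(x)) na_row else x)) %>%
  unnest(b)

# 4 rows: the NULL entry becomes a single row of NA for a = 2.
print(result)
```

The placeholder has to carry the same column names and compatible types as the real data frames, otherwise the pieces cannot be combined into one set of columns.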
