Construct JSON column from R dataframe columns


Question

Here is my data frame:

df <- data.frame(
  item = c("Box 1", "Tape", "Roll 1"),
  Length = c(2, 10, 6), Width = c(4, 3, 3), Height = c(6, NA, NA),
  Length_units = c("in", "ft", "yd"),
  option_1 = c("item_length", "item_length", "item_length"),
  option_2 = c("item_width", "item_width", "item_width"),
  option_3 = c("item_height", "color", NA),
  option_4 = c("thickness", NA, NA),
  width_units = c("in", "in", "in"),
  height_units = c("in", "in", NA),
  color = c(NA, "clear", NA),
  thickness = c("200#", NA, NA)
)
    item Length Width Height Length_units    option_1   option_2    option_3  option_4 width_units height_units color thickness
1  Box 1      2     4      6           in item_length item_width item_height thickness          in           in  <NA>      200# 
2   Tape     10     3     NA           ft item_length item_width       color      <NA>          in           in clear      <NA>
3 Roll 1      6     3     NA           yd item_length item_width        <NA>      <NA>          in         <NA>  <NA>      <NA>
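I would like to append the JSON column below to the end of df. Note that the option labels in the JSON follow the option columns of df, so the JSON is dynamic: which keys appear in each row depends on the option columns.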

{"dimensions":"2 in x 4 in x 6 in","thickness":"200#"}
{"dimensions":"10 ft x 3 in","color":"clear"}
{"dimensions":"6 yd x 3 in"}

I would greatly appreciate a dplyr solution.

Edited: I would like my data frame to look like this

    item Length Width Height Length_units    option_1   option_2    option_3  option_4 width_units height_units color thickness options_json
1  Box 1      2     4      6           in item_length item_width item_height thickness          in           in  <NA>      200# {"dimensions":"2 in x 4 in x 6 in","thickness":"200#"} 
2   Tape     10     3     NA           ft item_length item_width       color      <NA>          in           in clear      <NA> {"dimensions":"10 ft x 3 in","color":"clear"}
3 Roll 1      6     3     NA           yd item_length item_width        <NA>      <NA>          in         <NA>  <NA>      <NA> {"dimensions":"6 yd x 3 in"}

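Recommended answer

We could paste/unite the columns and then convert them to JSON with toJSON. Because the column names differ in case, it may be better to standardize them first (with tolower, which makes it easier to get the matching value from the corresponding _units column). Loop across the columns (length:height), paste (str_c) each value with the corresponding '_units' column value when the value is non-NA (using case_when), then unite those columns into a single column, select the columns of interest, and convert to JSON (toJSON).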

library(dplyr)
library(stringr)
library(jsonlite)
library(tidyr)
out <- df %>% 
 rename_with(tolower, everything()) %>%
  mutate(across(length:height, 
   ~ case_when(!is.na(.) ~ str_c(.x, get(str_c(cur_column(), "_units")), 
      sep = " ")))) %>%
  unite(dimensions, length, width, height, sep = " x ", na.rm = TRUE) %>% 
  select(dimensions, color, thickness) %>%
  toJSON(pretty = TRUE)

-output

out
[
  {
    "dimensions": "2 in x 4 in x 6 in",
    "thickness": "200#"
  },
  {
    "dimensions": "10 ft x 3 in",
    "color": "clear"
  },
  {
    "dimensions": "6 yd x 3 in"
  }
] 
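The keys differ from row to row in the output above because jsonlite, when it encodes a data frame by rows, leaves out fields whose value is NA. A minimal sketch of that behaviour follows; the data frame `toy` and its values are made up for illustration and are not part of the question's data.

```r
library(jsonlite)

# Hypothetical toy data: each row has one missing value
toy <- data.frame(a = c("x", NA), b = c(NA, "y"))

# Row-wise encoding of a data frame omits the NA fields from each record
toy_json <- toJSON(toy)
print(toy_json)
# [{"a":"x"},{"b":"y"}]
```

This is why no explicit filtering on the option columns is needed in the pipeline: selecting dimensions, color and thickness is enough, and the missing ones drop out on their own.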

If you want to create a column instead, use rowwise with mutate/transmute

df <- df %>% 
 rename_with(tolower, everything()) %>%
  mutate(across(length:height, 
   ~ case_when(!is.na(.) ~ str_c(.x, get(str_c(cur_column(), "_units")), 
      sep = " ")))) %>%
  unite(dimensions, length, width, height, sep = " x ", na.rm = TRUE) %>% 
  select(dimensions, color, thickness) %>% 
  rowwise %>%
  transmute(options_json = toJSON(cur_data())) %>%
  ungroup %>%
  bind_cols(df, .)

-output

df
  item Length Width Height Length_units    option_1   option_2    option_3  option_4 width_units height_units color thickness
1  Box 1      2     4      6           in item_length item_width item_height thickness          in           in  <NA>      200#
2   Tape     10     3     NA           ft item_length item_width       color      <NA>          in           in clear      <NA>
3 Roll 1      6     3     NA           yd item_length item_width        <NA>      <NA>          in         <NA>  <NA>      <NA>
                                              options_json
1 [{"dimensions":"2 in x 4 in x 6 in","thickness":"200#"}]
2          [{"dimensions":"10 ft x 3 in","color":"clear"}]
3                           [{"dimensions":"6 yd x 3 in"}]

The JSON string produced for each row includes the opening/closing square brackets. We can remove them with str_remove_all

df <- df %>% 
 rename_with(tolower, everything()) %>%
  mutate(across(length:height, 
   ~ case_when(!is.na(.) ~ str_c(.x, get(str_c(cur_column(), "_units")), 
      sep = " ")))) %>%
  unite(dimensions, length, width, height, sep = " x ", na.rm = TRUE) %>% 
  select(dimensions, color, thickness) %>%
 rowwise %>% 
 transmute(options_json = str_remove_all(toJSON(cur_data()), "\\[|\\]")) %>% 
 ungroup %>%
  bind_cols(df, .)

-output

  item Length Width Height Length_units    option_1   option_2    option_3  option_4 width_units height_units color thickness
1  Box 1      2     4      6           in item_length item_width item_height thickness          in           in  <NA>      200#
2   Tape     10     3     NA           ft item_length item_width       color      <NA>          in           in clear      <NA>
3 Roll 1      6     3     NA           yd item_length item_width        <NA>      <NA>          in         <NA>  <NA>      <NA>
                                            options_json
1 {"dimensions":"2 in x 4 in x 6 in","thickness":"200#"}
2          {"dimensions":"10 ft x 3 in","color":"clear"}
3                           {"dimensions":"6 yd x 3 in"}
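One caveat with the pattern that matches every bracket: it would also strip square brackets that occur inside a value. If that matters, anchoring the pattern to the start and end of the string is one possible variation. The sketch below uses a made-up string `x` with a hypothetical "note" field to show the difference; it is not taken from the question's data.

```r
library(stringr)

# Hypothetical JSON string whose value itself contains square brackets
x <- '[{"dimensions":"2 in x 4 in","note":"[fragile]"}]'

# Unanchored pattern: removes every bracket, including those inside "note"
all_removed <- str_remove_all(x, "\\[|\\]")
print(all_removed)
# {"dimensions":"2 in x 4 in","note":"fragile"}

# Anchored pattern: removes only the leading "[" and the trailing "]"
outer_removed <- str_remove_all(x, "^\\[|\\]$")
print(outer_removed)
# {"dimensions":"2 in x 4 in","note":"[fragile]"}
```

For the data in the question both patterns give the same result, since none of the values contain brackets.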

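Or perhaps unclass the row after rowwise, keep only the non-NA elements, and let auto_unbox produce a plain JSON object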

library(purrr)  # keep() comes from purrr
df <- df %>% 
 rename_with(tolower, everything()) %>%
  mutate(across(length:height, 
   ~ case_when(!is.na(.) ~ str_c(.x, get(str_c(cur_column(), "_units")), 
      sep = " ")))) %>%
  unite(dimensions, length, width, height, sep = " x ", na.rm = TRUE) %>% 
  select(dimensions, color, thickness) %>% 
  rowwise %>%
  transmute(options_json = toJSON(keep(unclass(cur_data()), 
     complete.cases), auto_unbox = TRUE)) %>% 
  ungroup %>% 
  bind_cols(df, .)

-output

> df
  item Length Width Height Length_units    option_1   option_2    option_3  option_4 width_units height_units color thickness
1  Box 1      2     4      6           in item_length item_width item_height thickness          in           in  <NA>      200#
2   Tape     10     3     NA           ft item_length item_width       color      <NA>          in           in clear      <NA>
3 Roll 1      6     3     NA           yd item_length item_width        <NA>      <NA>          in         <NA>  <NA>      <NA>
                                            options_json
1 {"dimensions":"2 in x 4 in x 6 in","thickness":"200#"}
2          {"dimensions":"10 ft x 3 in","color":"clear"}
3                           {"dimensions":"6 yd x 3 in"}

> str(df)
'data.frame':   3 obs. of  14 variables:
 $ item        : chr  "Box 1" "Tape" "Roll 1"
 $ Length      : num  2 10 6
 $ Width       : num  4 3 3
 $ Height      : num  6 NA NA
 $ Length_units: chr  "in" "ft" "yd"
 $ option_1    : chr  "item_length" "item_length" "item_length"
 $ option_2    : chr  "item_width" "item_width" "item_width"
 $ option_3    : chr  "item_height" "color" NA
 $ option_4    : chr  "thickness" NA NA
 $ width_units : chr  "in" "in" "in"
 $ height_units: chr  "in" "in" NA
 $ color       : chr  NA "clear" NA
 $ thickness   : chr  "200#" NA NA
 $ options_json: 'json' chr  "{\"dimensions\":\"2 in x 4 in x 6 in\",\"thickness\":\"200#\"}" "{\"dimensions\":\"10 ft x 3 in\",\"color\":\"clear\"}" "{\"dimensions\":\"6 yd x 3 in\"}"
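
In dplyr 1.1.0 and later, cur_data() is deprecated in favour of pick(). A minimal sketch of the same row-wise idea with pick() is given below, assuming dplyr >= 1.1.0 is installed. The small tibble `opts` is a hypothetical stand-in for the selected dimensions/color columns, and base R's Filter() replaces purrr::keep() only to keep the example short.

```r
library(dplyr)
library(jsonlite)

# Hypothetical stand-in for the already united/selected columns
opts <- tibble(
  dimensions = c("2 in x 4 in x 6 in", "10 ft x 3 in"),
  color      = c(NA, "clear")
)

res <- opts %>%
  rowwise() %>%
  mutate(options_json = as.character(
    toJSON(
      # current row as a named list, with the NA elements filtered out
      Filter(Negate(is.na), as.list(pick(everything()))),
      auto_unbox = TRUE
    )
  )) %>%
  ungroup()

print(res$options_json)
```

Wrapping the result in as.character() stores a plain character column rather than one carrying the 'json' class; whether that is preferable depends on how the column is used afterwards.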
