R SparkR-相当于融化功能 [英] R SparkR - equivalent to melt function

查看:103
本文介绍了R SparkR-相当于融化功能的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

SparkR库中是否有类似于 melt 的功能?

Is there a function similar to melt in SparkR library?

将1行50列的数据转换为50行3列的数据吗?

Transform data with 1 row and 50 columns to 50 rows and 3 columns?

推荐答案

SparkR中没有提供类似功能的内置函数.您可以使用 explode

There is no built-in function that provides a similar functionality in SparkR. You can built your own with explode

library(magrittr)

df <- createDataFrame(data.frame(
  A = c('a', 'b', 'c'),
  B = c(1, 3, 5),
  C = c(2, 4, 6)
))

melt <- function(df, id.vars, measure.vars, 
                 variable.name = "key", value.name = "value") {

   measure.vars.exploded <- purrr::map(
       measure.vars, function(c) list(lit(c), column(c))) %>% 
     purrr::flatten() %>% 
     (function(x) do.call(create_map, x)) %>% 
     explode()
   id.vars <- id.vars %>% purrr::map(column)

   do.call(select, c(df, id.vars, measure.vars.exploded)) %>%
     withColumnRenamed("key", variable.name) %>%
     withColumnRenamed("value", value.name)
}

melt(df, c("A"), c("B", "C")) %>% head()

  A key value                                                                   
1 a   B     1
2 a   C     2
3 b   B     3
4 b   C     4
5 c   B     5
6 c   C     6

或在Hive的 stack UDF中使用SQL:

or use SQL with Hive's stack UDF:

stack <- function(df, id.vars, measure.vars, 
                  variable.name = "key", value.name = "value") { 
  measure.vars.exploded <- glue::glue('"{measure.vars}", `{measure.vars}`') %>%  
    glue::glue_collapse(" , ") %>%
    (function(x) glue::glue(
      "stack({length(measure.vars)}, {x}) as ({variable.name}, {value.name})"
    )) %>%
    as.character()
    do.call(selectExpr, c(df, id.vars, measure.vars.exploded))
}

stack(df, c("A"), c("B", "C")) %>% head()

  A key value
1 a   B     1
2 a   C     2
3 b   B     3
4 b   C     4
5 c   B     5
6 c   C     6

相关问题:

这篇关于R SparkR-相当于融化功能的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆