Gather in sparklyr

Question

I am using sparklyr to manipulate some data. Given a tibble a:

a<-tibble(id = rep(c(1,10), each = 10),
          attribute1 = rep(c("This", "That", 'These', 'Those', "The", "Other", "Test", "End", "Start", 'Beginning'), 2),
          value = rep(seq(10,100, by = 10),2),
          average = rep(c(50,100),each = 10),
          upper_bound = rep(c(80, 130), each =10),
          lower_bound = rep(c(20, 70), each =10))

I would like to use "gather" to reshape the data, like this:

b<- a %>% 
     gather(key = type_data, value = value_data, -c(id:attribute1))

However, "gather" is not available in sparklyr. I have seen some people use sdf_pivot to mimic "gather" (e.g. "How to use sdf_pivot() in sparklyr and concatenate strings?"), but I can't see how to use it in this case.

Does anyone have an idea?

Cheers!

Recommended answer

Here's a function to mimic gather in sparklyr. It gathers the given columns and leaves all other columns intact, and it can easily be extended if required.

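# Function
library(sparklyr)
library(dplyr)

# Stack the columns named in gather_cols into key/value pairs,
# keeping all other columns as they are.
sdf_gather <- function(tbl, gather_cols){

  other_cols <- colnames(tbl)[!colnames(tbl) %in% gather_cols]

  lapply(gather_cols, function(col_nm){
    tbl %>% 
      select(all_of(c(other_cols, col_nm))) %>%  # one gathered column at a time
      mutate(key = !!col_nm) %>%                 # name of the source column
      rename(value = all_of(col_nm))             # common name for the values
  }) %>% 
    sdf_bind_rows() %>% 
    select(all_of(c(other_cols, 'key', 'value')))
}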

# Example
spark_df %>% 
  select(col_1, col_2, col_3, col_4) %>% 
  sdf_gather(c('col_3', 'col_4'))
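
To see what sdf_gather does without a Spark connection, the same select / mutate / rename / bind pattern can be tried on a small local tibble. This is only a sketch: gather_local and a_small are hypothetical names, a_small is a shortened version of the question's data, and dplyr's bind_rows() stands in for sdf_bind_rows(). The final rename produces the type_data / value_data column names the question asked for.

```r
library(dplyr)

# Shortened version of the question's tibble (two rows per id).
a_small <- tibble(
  id          = rep(c(1, 10), each = 2),
  attribute1  = rep(c("This", "That"), 2),
  value       = rep(c(10, 20), 2),
  average     = rep(c(50, 100), each = 2),
  upper_bound = rep(c(80, 130), each = 2),
  lower_bound = rep(c(20, 70), each = 2)
)

# Hypothetical local counterpart of sdf_gather(): bind_rows() replaces
# sdf_bind_rows(), the other steps are the same.
gather_local <- function(tbl, gather_cols) {
  other_cols <- setdiff(colnames(tbl), gather_cols)

  lapply(gather_cols, function(col_nm) {
    tbl %>%
      select(all_of(c(other_cols, col_nm))) %>%  # kept columns + one gathered column
      mutate(key = !!col_nm) %>%                 # record the source column name
      rename(value = all_of(col_nm))             # common name for the gathered values
  }) %>%
    bind_rows() %>%
    select(all_of(c(other_cols, "key", "value")))
}

b_small <- a_small %>%
  gather_local(c("value", "average", "upper_bound", "lower_bound")) %>%
  rename(type_data = key, value_data = value)

print(b_small)
```

On a Spark table the same trailing rename(type_data = key, value_data = value) can follow the sdf_gather call. Note that the gathered columns are stacked into a single column, so they should share a compatible type.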
