How to pass variables to functions called in spark_apply()?

Question

I would like to be able to pass extra variables to functions that are called by spark_apply in sparklyr.

For example:

# setup
library(sparklyr)
sc <- spark_connect(master='local', packages=TRUE)
iris2 <- iris[,1:(ncol(iris) - 1)]
df1 <- sdf_copy_to(sc, iris2, repartition=5, overwrite=T)

# This works fine
res <- spark_apply(df1, function(x) kmeans(x, 3)$centers)

# This does not work: k is defined only in the local R session
# and is not available to the function running on the workers
k <- 3
res <- spark_apply(df1, function(x) kmeans(x, k)$centers)

As an ugly workaround, I can do what I want by saving values into an R package and then referencing them, i.e.:

> myPackage::k_equals_three == 3
[1] TRUE

# This also works
res <- spark_apply(df1, function(x) kmeans(x, myPackage::k_equals_three)$centers)

Is there a better way to do this?

Answer

I don't have Spark set up to test, but can you just create a closure?

kmeanswithk <- function(k) { force(k); function(x) kmeans(x, k)$centers }
k <- 3
res <- spark_apply(df1, kmeanswithk(k))

Basically, just create a function that returns a function, then use that. The force(k) call evaluates k immediately, so the returned function captures the value of k rather than a promise that would be looked up later.
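
A possible alternative, not part of the original answer and untested here: spark_apply() also accepts a context argument, an object that is serialized to the workers and passed to the applied function as its second parameter. A minimal sketch, assuming a sparklyr version that supports context:

# Sketch (assumes context support): ship k to the workers via
# spark_apply()'s context argument; the applied function then
# receives it as its second parameter.
res <- spark_apply(
  df1,
  function(df, ctx) kmeans(df, ctx$k)$centers,
  context = list(k = 3)
)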
