How to do cross validation in SparkR
Question
I am working with the MovieLens dataset. I have an m × n matrix with user IDs as rows and movie IDs as columns, and I have applied dimensionality reduction and matrix factorization to shrink my sparse matrix to m × k, where k < n. I want to evaluate performance using the k-nearest-neighbors algorithm (my own code, not a library). I am using SparkR 1.6.2, and I don't know how to split my dataset into training and test data in SparkR. I have tried native R functions (sample, subset, caret), but they are not compatible with Spark data frames. Kindly give some suggestions for performing cross-validation and training the classifier using my own function written in SparkR.
Answer
The sparklyr (https://spark.rstudio.com/) package provides simple functionality for partitioning data. For example, if we have a Spark DataFrame called df, we can create a copy of it with compute() and then partition it with sdf_partition().
library(sparklyr)
library(dplyr)

df_part <- df %>%
  compute("df_part") %>%   # cache a copy as the temporary table "df_part"
  sdf_partition(test = 0.2, train = 0.8, seed = 2017)
df_part will then be a named list of connections to Spark DataFrames, one per partition (df_part$train and df_part$test). We can use collect() to copy a Spark DataFrame into a local R data frame.
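Since the question asks about cross-validation specifically, the same sdf_partition() idea extends to k folds. The sketch below is an assumption-laden illustration, not code from the original answer: it assumes an existing Spark DataFrame df and a user-supplied scoring function my_knn() standing in for the asker's own kNN code, which is assumed to return an error rate.

```r
library(sparklyr)
library(dplyr)

k <- 5
# Split df into k roughly equal folds; sdf_partition() returns a named list
weights <- setNames(rep(1 / k, k), paste0("fold", seq_len(k)))
folds <- sdf_partition(df, weights = weights, seed = 2017)

errors <- vapply(seq_len(k), function(i) {
  test  <- folds[[i]]
  train <- Reduce(sdf_bind_rows, folds[-i])  # union the remaining k-1 folds
  # collect() pulls the (already reduced, m x k) data into local R memory,
  # where the custom kNN code can operate on ordinary data frames
  my_knn(collect(train), collect(test))      # assumed to return an error rate
}, numeric(1))

mean(errors)  # cross-validated error estimate
```

Because each fold is collected locally before scoring, this only makes sense when the reduced matrix fits in R's memory; otherwise the kNN step would have to be rewritten against the Spark DataFrames themselves.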