Non-integer ids in Spark MLlib ALS

Views: 157 Published: 2020/9/4 2:37:45 Tags: scala apache-spark apache-spark-mllib

Question

I want to use:

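import org.apache.spark.mllib.recommendation.{ALS, Rating}

// Parse each "user,item,rate" line into a Rating (user and item must be Int here)
val ratings = data.map(_.split(',') match {
  case Array(user, item, rate) =>
    Rating(user.toInt, item.toInt, rate.toDouble)
})
// The fourth argument of ALS.train is the regularization parameter (lambda)
val model = ALS.train(ratings, rank, numIterations, alpha)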

However, the user ids in my data are stored as Long. Converting them to Int may produce errors. How can I solve this problem?
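To make the failure mode concrete, here is a minimal sketch in plain Scala (no Spark required). The id value `3000000000` and the object name `LongIdDemo` are assumptions chosen only for illustration; any id above `Int.MaxValue` would behave the same way.

```scala
import scala.util.Try

object LongIdDemo {
  def main(args: Array[String]): Unit = {
    // Hypothetical user id, larger than Int.MaxValue (2147483647)
    val rawId = "3000000000"

    // Parsing straight to Int throws NumberFormatException, captured here as a Failure
    println(Try(rawId.toInt).isFailure) // prints true

    // Parsing to Long succeeds, but narrowing to Int silently wraps around
    println(rawId.toLong.toInt) // prints -1294967296
  }
}
```

So `user.toInt` on the raw string fails loudly, while converting an already parsed Long with `.toInt` can quietly map distinct users to wrong ids.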

Recommended answer

You can use one of the ML implementations that support Long labels. The RDD version is significantly less user-friendly than the other implementations:

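import org.apache.spark.ml.recommendation.ALS
import org.apache.spark.ml.recommendation.ALS.Rating

// Rating is generic in the id type here, so Long ids are accepted
val ratings = sc.parallelize(Seq(Rating(1L, 2L, 3.0f), Rating(2L, 3L, 5.0f)))

// Returns a pair of RDDs of (id, factor vector)
val (userFactors, itemFactors) = ALS.train(ratings)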

It returns only the factors, whereas the DataFrame version returns a model:

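// Outside spark-shell, toDF needs `import spark.implicits._` (assuming a SparkSession named spark)
// Note: the DataFrame API requires id values to lie within the Int range, even in Long columns
val ratingsDF = ratings.toDF

val alsModel = new ALS().fit(ratingsDF)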
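Because the RDD version hands back raw factors rather than a model, predictions have to be computed by hand. In ALS a predicted rating is the dot product of the user's and the item's factor vectors, and the following is a minimal sketch of that step in plain Scala. The object name `FactorScoring`, the helper `predict`, and the rank-2 factor values are all hypothetical, used only to illustrate the arithmetic; they stand in for entries of `userFactors` and `itemFactors`.

```scala
object FactorScoring {
  // Predicted rating = dot product of the user's and the item's latent factor vectors
  def predict(userFactor: Array[Float], itemFactor: Array[Float]): Float = {
    require(userFactor.length == itemFactor.length, "factor vectors must share the same rank")
    userFactor.zip(itemFactor).map { case (u, v) => u * v }.sum
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical rank-2 factors keyed by Long ids (illustrative values, not real output)
    val userFactors = Map(1L -> Array(0.5f, 1.0f))
    val itemFactors = Map(2L -> Array(2.0f, 0.25f))

    // 0.5 * 2.0 + 1.0 * 0.25 = 1.25
    println(predict(userFactors(1L), itemFactors(2L))) // prints 1.25
  }
}
```

With the actual RDDs, one way to fetch a single vector would be something like `userFactors.lookup(1L).head`; for scoring many pairs at once, joining the two factor RDDs is likely a better fit.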
