How to iterate a Scala WrappedArray? (Spark)

Problem description

I run the following:

val tempDict = sqlContext.sql("""select words.pName_token, collect_set(words.pID) as docids
                                 from words
                                 group by words.pName_token""").toDF()

val wordDocs = tempDict.filter(newDict("pName_token")===word)

val listDocs = wordDocs.map(t => t(1)).collect()

listDocs: Array[Any] = Array(WrappedArray(123, 234, 205876618, 456))

My question is how do I iterate over this wrapped array or convert this into a list?

The options I get for the listDocs are apply, asInstanceOf, clone, isInstanceOf, length, toString, and update.

How should I proceed?

Recommended answer

Here is one way to solve this.

import org.apache.spark.sql.Row
import org.apache.spark.sql.functions._
import scala.collection.mutable.WrappedArray

val data = Seq((Seq(1,2,3),Seq(4,5,6),Seq(7,8,9)))
val df = sqlContext.createDataFrame(data)
val first = df.first

// use getAs to read the first column with a concrete element type
val mapped = first.getAs[WrappedArray[Int]](0)

// now we can use it like a normal collection
mapped.mkString("\n")

// pattern-match each Row to pull out its array columns
val rows = df.collect.map {
    case Row(a: Seq[Any], b: Seq[Any], c: Seq[Any]) => 
        (a, b, c)
}
rows.mkString("\n")
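Applied to the listDocs value from the question, the same idea might look like the sketch below. It runs without Spark: the Array[Any] is built by hand as a stand-in for the collected result, and it assumes the ids are Int (if pID is a Long or String column, the inner type pattern would need to change). The name docIds is made up for illustration.

```scala
// Stand-in for the collected result in the question: an Array[Any] whose
// single element is a sequence of document ids (assumption: ids are Int).
val listDocs: Array[Any] = Array[Any](Array(123, 234, 205876618, 456).toSeq)

// Match on Seq[_] rather than WrappedArray so the code does not depend on
// the concrete wrapper class; collect then keeps only the Int elements.
val docIds: List[Int] = listDocs.toList.flatMap {
  case ids: Seq[_] => ids.collect { case i: Int => i }
  case _           => Nil
}

// Iterate over the result like any other collection.
docIds.foreach(id => println(id))
```

Matching on Seq rather than WrappedArray is a deliberate choice here: WrappedArray is a Seq, so the match still succeeds, and the code keeps working if the runtime hands back a different Seq implementation.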
