Spark's Column.isin function does not take List

Question

I am trying to filter out rows from my Spark DataFrame.

val sequence = Seq(1,2,3,4,5)
df.filter(df("column").isin(sequence))

Unfortunately, I get an unsupported literal type error:

java.lang.RuntimeException: Unsupported literal type class scala.collection.immutable.$colon$colon List(1,2,3,4,5)

According to the documentation, it takes a scala.collection.Seq list.

I guess I don't want a literal? Then what can I pass in, some sort of wrapper class?

Recommended answer

@JustinPihony's answer is correct but incomplete. The isin function takes a repeated parameter (varargs) as its argument, so you need to pass the sequence with a : _* ascription, like so:

scala> val df = sc.parallelize(Seq(1,2,3,4,5,6,7,8,9)).toDF("column")
// df: org.apache.spark.sql.DataFrame = [column: int]

scala> val sequence = Seq(1,2,3,4,5)
// sequence: Seq[Int] = List(1, 2, 3, 4, 5)

scala> val result = df.filter(df("column").isin(sequence : _*))
// result: org.apache.spark.sql.DataFrame = [column: int]

scala> result.show
// +------+
// |column|
// +------+
// |     1|
// |     2|
// |     3|
// |     4|
// |     5|
// +------+
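
To see why passing the Seq directly fails, here is a minimal sketch in plain Scala, with no Spark dependency. The object VarargsDemo and the method countArgs are hypothetical names made up for illustration; countArgs only mimics the Any* parameter shape of isin and is not part of Spark.

```scala
// Hypothetical stand-in with the same parameter shape as
// Column.isin(list: Any*). It is not part of Spark.
object VarargsDemo {
  // Returns how many arguments the repeated parameter received.
  def countArgs(values: Any*): Int = values.length

  def main(args: Array[String]): Unit = {
    val sequence = Seq(1, 2, 3, 4, 5)

    // Passed directly, the whole Seq is one argument of type Any.
    // This is the case where Spark tries to build a literal from a List.
    println(countArgs(sequence))     // prints 1

    // With `: _*`, each element becomes its own argument.
    println(countArgs(sequence: _*)) // prints 5
  }
}
```

In the first call the List itself arrives as a single value, which matches the "Unsupported literal type" error in the question. If you are on Spark 2.4 or later, df("column").isInCollection(sequence) should also work, since that method takes an Iterable directly.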
