Dropping columns by data type in Scala Spark
Question
df1.printSchema()

prints out the column names and the data types that they possess.
df1.drop($"colName")

will drop a column by name. Is there a way to adapt this command to drop columns by data type instead?
Answer
If you are looking to drop specific columns in the DataFrame based on their types, the snippet below will help. In this example, I have a DataFrame with two columns, of type String and Int respectively. I drop my String field from the schema based on its type (all fields of type String will be dropped).
import sqlContext.implicits._

val df = sc.parallelize(('a' to 'l').map(_.toString) zip (1 to 10)).toDF("c1", "c2")

// Collect the names of all String-typed columns, then drop each one in turn.
val newDf = df.schema.fields
  .collect { case f if f.dataType.typeName == "string" => f.name }
  .foldLeft(df)((dframe, field) => dframe.drop(field))
The schema of newDf is org.apache.spark.sql.DataFrame = [c2: int].
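As a variant, on Spark 2.x+ the foldLeft can be avoided: DataFrame.drop has a varargs overload that accepts several column names at once, and the type check can compare against StringType directly instead of the typeName string. A minimal sketch, assuming a local SparkSession (the session setup and column names here are illustrative, not from the original answer):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.StringType

val spark = SparkSession.builder().appName("dropByType").master("local[*]").getOrCreate()
import spark.implicits._

val df = Seq(("a", 1), ("b", 2)).toDF("c1", "c2")

// Gather the names of every String-typed column in the schema.
val stringCols = df.schema.fields
  .collect { case f if f.dataType == StringType => f.name }

// Drop them all in a single call via the varargs overload.
val newDf = df.drop(stringCols: _*)
newDf.printSchema()  // only c2 (int) remains
```

Comparing against StringType is slightly more robust than matching the "string" typeName, since it uses the typed schema API rather than a string comparison.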