Mapping a value into a specific column based on another column


Problem description

I have the following problem:

  • A DataFrame containing col1 with strings A, B, or C.
  • A second column, col2, with an Integer.
  • And three other columns col3, col4 and col5 (these columns are also named A, B, and C).

So

 col1 - col2 - A (col3) - B (col4) - C (col5)
|--------------------------------------------
   A      6
   B      5
   C      6

should become

 col1 - col2 - A (col3) - B (col4) - C (col5)
|--------------------------------------------
   A      6       6
   B      5                  5
   C      6                              6

Now I would like to go through each row and assign the integer in col2 to column A, B or C, based on the entry in col1.

How can I achieve this?

I cannot use df.withColumn() (or at least I do not know why it fails), and the same holds for val df2 = df.map(x => x).

Looking forward to your help and thanks in advance!

Best, Ken

Recommended answer

Create a mapping between key and target column:

val mapping = Seq(("A", "col3"), ("B", "col4"), ("C", "col5"))

Use it to generate a sequence of columns:

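import org.apache.spark.sql.functions.when
// $"..." needs spark.implicits._ (already in scope in spark-shell)

// One conditional column per mapping entry; non-matching rows yield null
val exprs = mapping.map { case (key, target) =>
  when($"col1" === key, $"col2").alias(target) }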

Prepend a star (all existing columns) and select:

val df = Seq(("A", 6), ("B", 5), ("C", 6)).toDF("col1", "col2")
df.select($"*" +: exprs: _*)

The result is:

+----+----+----+----+----+
|col1|col2|col3|col4|col5|
+----+----+----+----+----+
|   A|   6|   6|null|null|
|   B|   5|null|   5|null|
|   C|   6|null|null|   6|
+----+----+----+----+----+
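
Since the question asks about withColumn specifically, here is a minimal sketch of the same idea written as a fold over the mapping, adding one column per entry. It is an alternative phrasing, not part of the original answer. The object name MapIntoColumns, the method viaWithColumn, and the local SparkSession settings are assumptions made for illustration only.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, when}

// Hypothetical wrapper object, named only for this illustration.
object MapIntoColumns {
  // Key found in col1 -> name of the column that should receive col2's value.
  val mapping: Seq[(String, String)] =
    Seq(("A", "col3"), ("B", "col4"), ("C", "col5"))

  // Adds (or replaces) one column per mapping entry using withColumn.
  // Rows whose col1 does not equal the key get null, because
  // when(...) is used without otherwise(...).
  def viaWithColumn(df: DataFrame): DataFrame =
    mapping.foldLeft(df) { case (acc, (key, target)) =>
      acc.withColumn(target, when(col("col1") === key, col("col2")))
    }

  def main(args: Array[String]): Unit = {
    // Assumption: a local SparkSession, for demonstration only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("map-into-columns")
      .getOrCreate()
    import spark.implicits._ // enables toDF on local collections

    val df = Seq(("A", 6), ("B", 5), ("C", 6)).toDF("col1", "col2")
    viaWithColumn(df).show()

    spark.stop()
  }
}
```

With the sample data this should give the same five columns as the select-based version. One difference worth noting: withColumn replaces a column that already exists under the target name, which may be what is wanted when the empty target columns are already present as in the question, whereas select with $"*" would add a second column of the same name.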
