How to split a pipe-separated column into multiple rows?


Question

I have a dataframe that contains the following:

movieId / movieName / genre
1         example1    action|thriller|romance
2         example2    fantastic|action

I would like to obtain a second dataframe, derived from the first one, that contains the following:

movieId / movieName / genre
1         example1    action
1         example1    thriller
1         example1    romance
2         example2    fantastic
2         example2    action

How can I do this?

Recommended answer

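I would use the standard split function, together with explode to turn each element of the resulting array into its own row.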

scala> movies.show(truncate = false)
+-------+---------+-----------------------+
|movieId|movieName|genre                  |
+-------+---------+-----------------------+
|1      |example1 |action|thriller|romance|
|2      |example2 |fantastic|action       |
+-------+---------+-----------------------+

scala> movies.withColumn("genre", explode(split($"genre", "[|]"))).show
+-------+---------+---------+
|movieId|movieName|    genre|
+-------+---------+---------+
|      1| example1|   action|
|      1| example1| thriller|
|      1| example1|  romance|
|      2| example2|fantastic|
|      2| example2|   action|
+-------+---------+---------+

// You can use \\| for split instead
scala> movies.withColumn("genre", explode(split($"genre", "\\|"))).show
+-------+---------+---------+
|movieId|movieName|    genre|
+-------+---------+---------+
|      1| example1|   action|
|      1| example1| thriller|
|      1| example1|  romance|
|      2| example2|fantastic|
|      2| example2|   action|
+-------+---------+---------+
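
Both patterns above escape the pipe because split interprets its second argument as a Java regular expression, in which a bare | means alternation. The following is a minimal sketch on plain Scala strings (no Spark involved) to illustrate the escaping only; the object name PipeRegexDemo and the Pattern.quote variant are additions for illustration and are not part of the original answer.

```scala
import java.util.regex.Pattern

// Hypothetical demo object; it only illustrates regex escaping of the pipe.
object PipeRegexDemo {
  val genres = "action|thriller|romance"

  // Backslash-escaped pipe, as in the second spark-shell command.
  val escaped: Seq[String] = genres.split("\\|").toSeq

  // Character class containing only the pipe, as in the first command.
  val charClass: Seq[String] = genres.split("[|]").toSeq

  // Pattern.quote builds a literal-match pattern for any separator string.
  val quoted: Seq[String] = genres.split(Pattern.quote("|")).toSeq

  def main(args: Array[String]): Unit = {
    println(escaped.mkString(", "))
    println(charClass.mkString(", "))
    println(quoted.mkString(", "))
  }
}
```

All three variants split on the literal pipe character, so which one to use is mostly a matter of readability.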

P.S. You could use Dataset.flatMap to achieve the same result, which I'm sure Scala developers would enjoy more.
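
The answer does not show the flatMap variant, so here is one possible sketch of it. It assumes the data is loaded as a typed Dataset; the Movie case class, the SplitGenres object, the splitGenres helper, and the local SparkSession settings are all hypothetical names and choices made for this example, not taken from the original.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical case class mirroring the columns shown in the question.
case class Movie(movieId: Int, movieName: String, genre: String)

object SplitGenres {
  // Produces one Movie per genre. String.split takes a regex,
  // so the pipe is escaped here as well.
  def splitGenres(m: Movie): Seq[Movie] =
    m.genre.split("\\|").toSeq.map(g => m.copy(genre = g))

  def main(args: Array[String]): Unit = {
    // Assumption: a local session for experimentation only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("split-genres")
      .getOrCreate()
    import spark.implicits._

    val movies = Seq(
      Movie(1, "example1", "action|thriller|romance"),
      Movie(2, "example2", "fantastic|action")
    ).toDS()

    // flatMap emits zero or more output rows per input row.
    val exploded = movies.flatMap(m => splitGenres(m))

    exploded.show()
    spark.stop()
  }
}
```

Compared with explode(split(...)), this version works on typed objects rather than columns, which reads naturally in Scala but means the rows pass through user code instead of Spark's built-in column functions.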
