Add a new column to a DataFrame that holds a generated UUID


Problem description

I want to add a new column to a DataFrame that holds a generated UUID for each row.

A UUID value looks something like 21534cf7-cff9-482a-a3a8-9e7244240da7

My research:

I've tried the withColumn method in Spark.

val DF2 = DF1.withColumn("newcolname", DF1("existingcolname") + 1)

So DF2 has an additional column, newcolname, whose value in every row is existingcolname plus 1.

But my requirement is a new column that contains a generated UUID.

Solution

You should try something like this:

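import java.util.UUID

import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions.udf

val sc: SparkContext = ...
val sqlContext = new SQLContext(sc)

// Brings toDF and other implicit conversions into scope
import sqlContext.implicits._

// Zero-argument UDF that returns a fresh random UUID string on each call
val generateUUID = udf(() => UUID.randomUUID().toString)
val df1 = Seq(("id1", 1), ("id2", 4), ("id3", 5)).toDF("id", "value")
val df2 = df1.withColumn("UUID", generateUUID())

df1.show()
df2.show()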

Output will be (the UUID values are random, so yours will differ):

+---+-----+
| id|value|
+---+-----+
|id1|    1|
|id2|    4|
|id3|    5|
+---+-----+

+---+-----+--------------------+
| id|value|                UUID|
+---+-----+--------------------+
|id1|    1|f0cfd0e2-fbbe-40f...|
|id2|    4|ec8db8b9-70db-46f...|
|id3|    5|e0e91292-1d90-45a...|
+---+-----+--------------------+
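
If you are on a newer Spark release, two variations on the answer above may be worth trying. The sketch below assumes Spark 2.3 or later and a SparkSession instead of SQLContext; the object name UuidColumnSketch, the helper names withUdfUuid and withBuiltinUuid, and the local master setting are made up for illustration and are not part of the original answer. It shows the same UDF marked as nondeterministic, and the built-in SQL function uuid() used through expr.

```scala
import java.util.UUID

import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{expr, udf}

// Hypothetical helper object, for illustration only.
object UuidColumnSketch {
  // Same UDF as in the answer, flagged as nondeterministic so the optimizer
  // does not assume repeated calls return the same value (Spark 2.3+).
  private val generateUUID =
    udf(() => UUID.randomUUID().toString).asNondeterministic()

  // Adds a "UUID" column using the UDF.
  def withUdfUuid(df: DataFrame): DataFrame =
    df.withColumn("UUID", generateUUID())

  // Adds a "UUID" column using the built-in SQL function uuid() (Spark 2.3+),
  // so no UDF is needed.
  def withBuiltinUuid(df: DataFrame): DataFrame =
    df.withColumn("UUID", expr("uuid()"))

  def main(args: Array[String]): Unit = {
    // Local session for experimenting; master and app name are assumptions.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("uuid-column-sketch")
      .getOrCreate()
    import spark.implicits._

    val df1 = Seq(("id1", 1), ("id2", 4), ("id3", 5)).toDF("id", "value")

    // truncate = false prints the full 36-character UUID strings
    withUdfUuid(df1).show(truncate = false)
    withBuiltinUuid(df1).show(truncate = false)

    spark.stop()
  }
}
```

With either variant the UUIDs are computed lazily, so a later action on the same DataFrame can produce different values. If the IDs need to stay stable, it is probably safest to persist the DataFrame or write it out before reusing it.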
