How to change struct dataType to Integer in pyspark?
Question
I have a dataframe df, and one of its columns has the data type struct<long:bigint, string:string>.
Because the column is a struct, I cannot perform arithmetic on it, such as addition or subtraction.
How can I change struct<long:bigint, string:string> to IntegerType?
Recommended answer
You can use dot syntax to access the individual fields of a struct column.
For example, if you start with this dataframe:
df = spark.createDataFrame([(1,(3,'x')),(4,(8, 'y'))]).toDF("col1", "col2")
df.show()
df.printSchema()
+----+------+
|col1|  col2|
+----+------+
|   1|[3, x]|
|   4|[8, y]|
+----+------+

root
 |-- col1: long (nullable = true)
 |-- col2: struct (nullable = true)
 |    |-- _1: long (nullable = true)
 |    |-- _2: string (nullable = true)
you can select the first field of the struct column and either create a new column from it or replace an existing one:
df.withColumn('col2', df['col2._1']).show()
which prints:
+----+----+
|col1|col2|
+----+----+
|   1|   3|
|   4|   8|
+----+----+