How to split a list to multiple columns in Pyspark?
Problem description
I have:
key value
a [1,2,3]
b [2,3,4]
I want:
key value1 value2 value3
a 1 2 3
b 2 3 4
It seems that in scala I can write: df.select($"value._1", $"value._2", $"value._3"), but that is not possible in python.
So is there a good way to do this?
Recommended answer
It depends on the type of your "list":
If it is of type ArrayType():
df = hc.createDataFrame(sc.parallelize([['a', [1,2,3]], ['b', [2,3,4]]]), ["key", "value"])
df.printSchema()
df.show()
root
 |-- key: string (nullable = true)
 |-- value: array (nullable = true)
 |    |-- element: long (containsNull = true)

+---+-------+
|key|  value|
+---+-------+
|  a|[1,2,3]|
|  b|[2,3,4]|
+---+-------+

you can access the values like you would in python, using []:

df.select("key", df.value[0], df.value[1], df.value[2]).show()

+---+--------+--------+--------+
|key|value[0]|value[1]|value[2]|
+---+--------+--------+--------+
|  a|       1|       2|       3|
|  b|       2|       3|       4|
+---+--------+--------+--------+
If it is of type StructType(): (maybe you built your dataframe by reading a JSON)
import pyspark.sql.functions as psf

df2 = df.select("key", psf.struct(
    df.value[0].alias("value1"),
    df.value[1].alias("value2"),
    df.value[2].alias("value3")
).alias("value"))
df2.printSchema()
df2.show()
root
|-- key: string (nullable = true)
|-- value: struct (nullable = false)
| |-- value1: long (nullable = true)
| |-- value2: long (nullable = true)
| |-- value3: long (nullable = true)
+---+-------+
|key| value|
+---+-------+
| a|[1,2,3]|
| b|[2,3,4]|
+---+-------+
you can directly 'split' the column using *:
df2.select('key', 'value.*').show()
+---+------+------+------+
|key|value1|value2|value3|
+---+------+------+------+
| a| 1| 2| 3|
| b| 2| 3| 4|
+---+------+------+------+