How to split a column of vectors into two columns?

Views: 92  Published: 2021/4/8 19:50:43  Tags: apache-spark, pyspark, apache-spark-ml

Question

I use PySpark.

Spark ML's Random Forest output DataFrame has a column "probability", which is a vector with two values. I want to add two columns to the output DataFrame, "prob1" and "prob2", corresponding to the first and second values in the vector.

I tried the following:

output2 = output.withColumn('prob1', output.map(lambda r: r['probability'][0]))

But I get the error 'col should be Column'.

Any suggestions on how to transform a column of vectors into columns of its values?

Recommended answer

I had the same problem. Below is the code, adjusted for the case where the vector has length n.

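from pyspark.sql.functions import udf
from pyspark.sql.types import FloatType
splits = [udf(lambda value, i=i: float(value[i]), FloatType()) for i in range(n)]  # i=i fixes the index per UDF; n is the vector length
out = tstDF.select(*[s('features').alias("Column" + str(i)) for i, s in enumerate(splits)])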
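The answer builds one function per vector index inside a list comprehension. In plain Python, a lambda created this way looks up the loop variable when it is called, not when it is defined, so the functions can all end up reading the last index. The sketch below shows that behaviour and the default-argument workaround without Spark. The value of `n` and the sample `vector` are made up for illustration and do not come from the original post.

```python
# Pure-Python illustration; no Spark is needed.
# `n` and `vector` are hypothetical sample values.
n = 3
vector = [0.1, 0.2, 0.7]

# Late binding: each lambda looks up `i` when it is called,
# which is after the comprehension has finished.
late = [lambda value: value[i] for i in range(n)]

# Default argument: `i=i` stores the current index in each lambda
# at the moment the lambda is created.
bound = [lambda value, i=i: value[i] for i in range(n)]

late_results = [f(vector) for f in late]    # every entry is vector[n - 1]
bound_results = [f(vector) for f in bound]  # one entry per index

print(late_results)
print(bound_results)
```

If every generated column turns out to hold the same value, this late binding is a likely cause. On Spark 3.0 and later, `pyspark.ml.functions.vector_to_array` is an alternative that avoids a Python UDF for this task.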
