How to get specific values from an RDD in Spark with PySpark

Views: 52 Published: 2021/6/25 18:34:37 Tags: python apache-spark pyspark


Problem description

The following is my RDD. Each record has 5 fields:

[('sachin', 200, 10,4,True), ('Raju', 400, 40,4,True), ('Mike', 100, 50,4,False) ]

I only need to fetch the 1st, 3rd, and 5th fields of each record. How can I do this in PySpark? The expected result is shown below. I tried reduceByKey in several ways but couldn't achieve it.

Sachin,10,True
Raju,40,True
Mike,50,False

Recommended answer

Use a simple map?

rdd.map(lambda x: (x[0], x[2], x[4]))
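Selecting fields is a per-record projection, so map is enough. reduceByKey combines values that share a key, which is not what is asked for here. The following is a minimal sketch of the same idea as a complete script, assuming a local Spark installation. The names select_fields, to_csv_line, and demo, the app name, and the local[1] master are illustrative choices, not part of the original answer.

```python
# Records copied from the question; each tuple has 5 fields.
records = [
    ("sachin", 200, 10, 4, True),
    ("Raju", 400, 40, 4, True),
    ("Mike", 100, 50, 4, False),
]


def select_fields(record):
    """Keep the 1st, 3rd and 5th fields (indexes 0, 2 and 4)."""
    return (record[0], record[2], record[4])


def to_csv_line(fields):
    """Join the selected fields with commas, e.g. 'Raju,40,True'."""
    return ",".join(str(f) for f in fields)


def demo():
    """Run the projection on a local Spark session (requires pyspark)."""
    # Imported here so the helpers above can be used without Spark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[1]")          # assumption: single local worker
        .appName("select-fields")    # hypothetical app name
        .getOrCreate()
    )
    rdd = spark.sparkContext.parallelize(records)

    # map applies select_fields to every record independently;
    # the second map formats each selected tuple as a comma-separated line.
    for line in rdd.map(select_fields).map(to_csv_line).collect():
        print(line)

    spark.stop()
```

Calling demo() in an environment where PySpark is installed prints one comma-separated line per record. The second map is only needed if the result should be text lines rather than tuples.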
