How to add a suffix and prefix to all columns in a Python/PySpark DataFrame
Question
I have a DataFrame in PySpark with more than 100 columns. For every column, I would like to add a backtick (`) at the start and at the end of the column name.
For example:
The column name is testing user. I want `testing user`.
Is there a way to do this in PySpark/Python? Applying the code should return a DataFrame.
Recommended answer
You can use the DataFrame's withColumnRenamed method, which returns a new DataFrame with the column renamed:
df.withColumnRenamed('testing user', '`testing user`')
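Since withColumnRenamed renames one column per call, covering 100+ columns means repeating it for each name. A minimal sketch of one way to do that, assuming `df` is a PySpark DataFrame; the helper name `backtick_all` is hypothetical and not part of PySpark:

```python
from functools import reduce


def backtick_all(df):
    """Return a new DataFrame with every column name wrapped in backticks.

    `backtick_all` is an illustrative helper, not a PySpark API. It only
    relies on `df.columns` and `df.withColumnRenamed`.
    """
    # Each withColumnRenamed call returns a new DataFrame, so reduce chains
    # one rename per column, starting from the original df.
    return reduce(
        lambda acc, name: acc.withColumnRenamed(name, "`" + name + "`"),
        df.columns,
        df,
    )
```

Usage would look like `new_df = backtick_all(df)`. Because the names come from df.columns, a name that contains a space, such as testing user, is handled as a single name.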
Edit: if you have a list of column names, you can do the following:
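old = "First Last Age"
# Wrap each name in backticks
new = ["`" + field + "`" for field in old.split()]
# toDF takes the new names as separate arguments and returns a new DataFrame
df.toDF(*new)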
Output:
DataFrame[`First`: string, `Last`: string, `Age`: string]
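The same idea can be generalized to any prefix and suffix, reading the names from df.columns rather than splitting a string on whitespace, which would break a name such as testing user into two. This is only a sketch under the assumption that `df` is a PySpark DataFrame; `add_affixes` and `rename_with_affixes` are hypothetical helper names:

```python
def add_affixes(names, prefix="", suffix=""):
    """Return a new list with prefix and suffix added to every name."""
    return [prefix + name + suffix for name in names]


def rename_with_affixes(df, prefix="", suffix=""):
    """Return a new DataFrame whose column names carry the given affixes.

    Illustrative helper, not a PySpark API. It relies on `df.columns`
    and on `df.toDF(*names)`, which renames all columns positionally.
    """
    return df.toDF(*add_affixes(df.columns, prefix, suffix))
```

For the backtick case in the question, the call would be `rename_with_affixes(df, "`", "`")`.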