How to do SELECT with column name that appears twice in DataFrame?


Question

I have the following code:

DataFrame addressDF = sqlContext.read().parquet(addressParquetPath);
DataFrame propertyDF = sqlContext.read().parquet(propertyParquetPath);

DataFrame joinedFrame = addressDF.join(propertyDF, propertyDF.col("LOCID").equalTo(addressDF.col("locid")), "left");

joinedFrame.registerTempTable("joinedFrame");
DataFrame joinedFrameSelect = sqlContext.sql("SELECT LOCID,AddressID FROM joinedFrame");

In the SELECT, LOCID is listed twice. How do I pick the LOCID from the address DataFrame instead of the one from the property DataFrame?

Can I execute a select on the DataFrame by column index?

Recommended answer

I usually rename the column on one side before joining. You can either try:

...join(propertyDF.withColumnRenamed("LocID", "LocID_R"), ...

Or, if you want to change all of the column names of a DataFrame in one go, such as adding an _R (for "right") to every name, you can try this (Scala):

df.toDF(df.columns.map(_ + "_R"):_*)

This is useful when you are joining a DataFrame back onto itself.
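
For readers using the Java API from the question, the bulk rename might be sketched roughly as follows. The class `ColumnSuffix` and the helper `suffixAll` are hypothetical names introduced here for illustration and are not part of Spark. The sketch only builds the suffixed column names in plain Java; the closing comment assumes the result is handed to `DataFrame.toDF(String...)` on the question's `propertyDF`.

```java
import java.util.Arrays;

// Hypothetical helper (not part of Spark): builds the renamed column list.
final class ColumnSuffix {

    // Returns a new array with `suffix` appended to every column name.
    static String[] suffixAll(String[] columns, String suffix) {
        return Arrays.stream(columns)
                .map(name -> name + suffix)
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        // Column names taken from the question's SELECT statement.
        String[] renamed = suffixAll(new String[] {"LOCID", "AddressID"}, "_R");
        System.out.println(Arrays.toString(renamed));

        // Assumed usage with the question's DataFrame (requires Spark on the classpath):
        //   DataFrame propertyRenamed =
        //       propertyDF.toDF(ColumnSuffix.suffixAll(propertyDF.columns(), "_R"));
    }
}
```

With the property side renamed this way, the join condition would refer to `LOCID_R`, and a plain `LOCID` in the SELECT should no longer be ambiguous.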
