How to do SELECT with column name that appears twice in DataFrame?
Problem description
I have the following code:
DataFrame addressDF = sqlContext.read().parquet(addressParquetPath);
DataFrame propertyDF = sqlContext.read().parquet(propertyParquetPath);
DataFrame joinedFrame = addressDF.join(propertyDF, propertyDF.col("LOCID").equalTo(addressDF.col("locid")), "left");
joinedFrame.registerTempTable("joinedFrame");
DataFrame joinedFrameSelect = sqlContext.sql("SELECT LOCID,AddressID FROM joinedFrame");
After the join, LOCID appears twice in joinedFrame, so the SELECT is ambiguous. How do I pick the LOCID of address instead of the one from property?
Can I execute a select on the DataFrame by column index?
Recommended answer
I usually rename the column on one side before the join. You can try:
...join(propertyDF.withColumnRenamed("LocID", "LocID_R"), ...
Or, if you want to change all of the column names of a DataFrame in one go, such as adding an _R (for "right") to every name, you can try this:
df.toDF(df.columns.map(_ + "_R"):_*)
This is useful when you are joining a DataFrame back onto itself.
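Since the question uses the Java API while the one-liner above is Scala, here is a minimal Java sketch of the same "suffix every column name" idea. It only builds the new names with the standard library. The class and method names (ColumnSuffixSketch, withSuffix) and the sample column "PropertyID" are hypothetical, and the commented-out last step assumes Spark is on the classpath and that the array is handed to DataFrame.toDF(String...).

```java
import java.util.Arrays;

// Hypothetical helper, not part of Spark: it only computes the new column names.
final class ColumnSuffixSketch {

    // Returns a copy of `columns` with `suffix` appended to every name.
    static String[] withSuffix(String[] columns, String suffix) {
        return Arrays.stream(columns)
                .map(name -> name + suffix)
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        // Stand-in for propertyDF.columns(); these names are illustrative only.
        String[] renamed = withSuffix(new String[] {"LOCID", "PropertyID"}, "_R");
        System.out.println(Arrays.toString(renamed));

        // Assumed usage with Spark available (not executed in this sketch):
        //   DataFrame propertyR = propertyDF.toDF(renamed);
    }
}
```

With the right-hand frame renamed this way, the join condition and the later SELECT could refer to LOCID and LOCID_R without ambiguity.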