Reference columns in dataframes throwing ambiguous error when joining two dataframes where one dataframe has an array of reference keys

Views: 23 | Published: 2021/11/14 22:53:45 | Tags: scala, dataframe, apache-spark, join, apache-spark-sql


Problem description

I have two dataframes, shown below.

Dataframe one

+--------------------------------------------
|______subject_______________|______marks___|
| Maths                      |    89        |
| English                    |    90        |
| Religion                   |    80        |
---------------------------------------------

Dataframe two

+-------------------------------------------------------------
|______name__________________|______subject__________________|
| Liza                       |   [Maths]                     |
| Inter                      |   [Religion, English]         |
| Ovin                       |   [Maths, Religion, English]  |
--------------------------------------------------------------

Expected output

+-------------------------------------------------------------
|______name__________________|______marks____________________|
| Liza                       |   [89]                        |
| Inter                      |   [80, 90]                    |
| Ovin                       |   [89, 80, 90]                |
--------------------------------------------------------------

To get the output above, I need to join dataframeOne and dataframeTwo. However, the subject column in dataframeTwo holds arrays, while the subject column in dataframeOne holds a single string value. I tried the code below, which fails with the error that follows.

val newDataframe = dataframeTwo.withColumn("myMarks", struct('marks))
val studentMarksDataframe = dataframeOne.join(newDataframe, array_contains(subject, subject)).agg(collect_list('myMarks))

Error

Exception in thread "main" org.apache.spark.sql.AnalysisException: Reference 'unicode' is ambiguous, could be: subject, subject

How can I resolve this problem?
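One possible way to avoid the ambiguity, offered as a sketch rather than a confirmed answer: both dataframes have a column named subject, so an unqualified reference cannot be resolved. Giving each dataframe an alias and qualifying the column through that alias removes the clash. The aliases `s` and `m`, the object name `JoinOnArrayColumn`, and the helper `marksPerStudent` are hypothetical names chosen for illustration, and the sample rows simply mirror the tables above. The sketch also groups by name before collecting, which the original attempt does not do, on the assumption that one row per student is wanted.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, collect_list, expr}

object JoinOnArrayColumn {
  // Hypothetical helper: joins every student to the marks of each subject in
  // their array, then collects the marks per student.
  def marksPerStudent(dataframeOne: DataFrame, dataframeTwo: DataFrame): DataFrame =
    dataframeTwo.as("s")                       // alias for the students dataframe
      .join(
        dataframeOne.as("m"),                  // alias for the marks dataframe
        // Qualified names make it explicit which "subject" is meant.
        expr("array_contains(s.subject, m.subject)"))
      .groupBy(col("s.name"))
      .agg(collect_list(col("m.marks")).as("marks"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("join-on-array-column")
      .getOrCreate()
    import spark.implicits._

    // Sample data mirroring the two tables in the question.
    val dataframeOne = Seq(("Maths", 89), ("English", 90), ("Religion", 80))
      .toDF("subject", "marks")
    val dataframeTwo = Seq(
      ("Liza", Seq("Maths")),
      ("Inter", Seq("Religion", "English")),
      ("Ovin", Seq("Maths", "Religion", "English"))
    ).toDF("name", "subject")

    marksPerStudent(dataframeOne, dataframeTwo).show(false)
    spark.stop()
  }
}
```

Note that `collect_list` does not guarantee the order of the collected values, so the marks within each array may not follow the order of the subjects in dataframeTwo.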
