Reference columns in dataframes throwing ambiguous error when joining two dataframes where one dataframe has an array of reference keys
Problem description
I have two dataframes as follows:
dataframeOne
+----------+-------+
| subject  | marks |
+----------+-------+
| Maths    | 89    |
| English  | 90    |
| Religion | 80    |
+----------+-------+
dataframeTwo
+-------+----------------------------+
| name  | subject                    |
+-------+----------------------------+
| Liza  | [Maths]                    |
| Inter | [Religion, English]        |
| Ovin  | [Maths, Religion, English] |
+-------+----------------------------+
Expected output
+-------+--------------+
| name  | marks        |
+-------+--------------+
| Liza  | [89]         |
| Inter | [80, 90]     |
| Ovin  | [89, 80, 90] |
+-------+--------------+
To get the above output I need to join dataframeOne and dataframeTwo. But in dataframeTwo the subject column holds arrays, while in dataframeOne it holds a single string value. I tried the code below and got the error that follows:
val newDataframe = dataframeTwo.withColumn("myMarks", struct('marks))
val studentMarksDataframe = dataframeOne.join(newDataframe, array_contains(subject, subject)).agg(collect_list('myMarks))
Error

Exception in thread "main" org.apache.spark.sql.AnalysisException: Reference 'subject' is ambiguous, could be: subject, subject
How can I solve the above issue?
Answer
You can try:
val studentMarksDataframe = dataframeOne.join(
  dataframeTwo,
  array_contains(dataframeTwo("subject"), dataframeOne("subject"))
).groupBy("name").agg(collect_list('marks))

Qualifying each column with its own dataframe, as in dataframeTwo("subject") and dataframeOne("subject"), removes the ambiguity, and groupBy("name") with collect_list gathers each student's marks into an array. Note that collect_list makes no ordering guarantee, so the marks may appear in a different order than shown above.
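The join-and-collect logic can be illustrated with plain Scala collections, using the sample data from the question (a minimal sketch, no Spark required; contains mirrors array_contains and the grouping mirrors collect_list):

```scala
// Sample data mirroring the two dataframes in the question.
val dataframeOne = Seq(("Maths", 89), ("English", 90), ("Religion", 80))
val dataframeTwo = Seq(
  ("Liza",  Seq("Maths")),
  ("Inter", Seq("Religion", "English")),
  ("Ovin",  Seq("Maths", "Religion", "English"))
)

// For each student, keep the (subject, marks) rows whose subject appears
// in the student's subject array -- the collection analogue of
// array_contains -- and collect the marks, as collect_list does.
val studentMarks = dataframeTwo.map { case (name, subjects) =>
  val marks = dataframeOne.collect {
    case (subject, mark) if subjects.contains(subject) => mark
  }
  (name, marks)
}

studentMarks.foreach(println)
```

As with collect_list in Spark, the order of marks within each list here simply follows iteration order and should not be relied upon.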