Cannot resolve 'column_name' given input columns: SparkSQL
Problem description
I have a simple piece of code here:
query = """
select id, date, type from schema.camps
"""
df = spark.sql(query)
I get an error message:
> > "cannot resolve '`id`' given input columns:
> > [ecs_snapshot, ecs_version, ecs_bundle_type]; line 2
> > File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/session.py", line 767, in sql
> > return DataFrame(self._jsparkSession.sql(sqlQuery), self._wrapped)
> > File "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
> > answer, self.gateway_client, self.target_id, self.name)
> > File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 69, in deco
> > raise AnalysisException(s.split(': ', 1)[1], stackTrace)
> > pyspark.sql.utils.AnalysisException: "cannot resolve '`id`' given input columns: [ecs_snapshot, ecs_version, ecs_bundle_type]; line 2 pos 11;"
I tried everything I could based on the solutions provided. The funny part is that I have another query on another table that works just fine. I would appreciate any help with this. Thanks in advance.
Here is the schema of the table:
camps(
id numeric(38,0) NOT NULL encode raw,
name varchar(765) NULL encode zstd,
type varchar(765) NULL encode zstd,
YYYY varchar(765) NULL encode zstd,
ZZZZ varchar(765) NULL encode zstd,
LLLL varchar(765) NULL encode zstd,
MMMM numeric(38,0) NULL encode zstd,
NNNN varchar(765) NULL encode zstd,
date timestamp without time zone NULL encode zstd,
PPPP numeric(38,0) NULL encode az64,
PRIMARY KEY (marketplace_id, campaign_id)
)
;
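The error says the columns the query asks for (`id`, `date`, `type`) are not among the columns Spark actually resolves for `schema.camps` (`ecs_snapshot`, `ecs_version`, `ecs_bundle_type`), so the session is most likely reading a different table or database than the one whose schema is shown above. A minimal sketch of that membership check follows; the `actual` set is hard-coded from the error message for illustration only, whereas in a live session you would build it from `set(spark.table("schema.camps").columns)`:

```python
# Columns the failing query requests vs. the columns Spark reported
# in the AnalysisException. `actual` is hard-coded here for illustration;
# in a live session use: actual = set(spark.table("schema.camps").columns)
requested = {"id", "date", "type"}
actual = {"ecs_snapshot", "ecs_version", "ecs_bundle_type"}

missing = sorted(requested - actual)
if missing:
    print(f"columns not found in table: {missing}")
```

If `missing` is non-empty, the table Spark resolved is not the one you expect; check the current database and the fully qualified table name before re-running the query.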
Recommended answer
Please try running this code and show the result.
import spark.implicits._
val df1 = spark.table("ads.dim_campaigns")
df1.printSchema()
// Please, show result
val df2 = df1.select(
'campaign_id,
'external_id,
'start_date,
'program_type,
'advertiser_id
)
df2.printSchema()
// please, show result