Does Apache Spark SQL support the MERGE clause?

Question

Does Apache Spark SQL support a MERGE clause similar to Oracle's MERGE SQL clause?

MERGE INTO <table> USING (
  SELECT * FROM <table1>
)
WHEN MATCHED THEN UPDATE ...
  DELETE WHERE ...
WHEN NOT MATCHED THEN INSERT ...

Recommended answer

Spark does support the MERGE operation when Delta Lake is used as the storage format. The first step is to save the table in the delta format, which adds transactional capabilities and support for DELETE/UPDATE/MERGE operations in Spark.

Python/Scala: df.write.format("delta").save("/data/events")

SQL: CREATE TABLE events (eventId long, ...) USING delta

Once the table exists, you can run your usual SQL MERGE command:

MERGE INTO events
USING updates
ON events.eventId = updates.eventId
WHEN MATCHED THEN
  UPDATE SET events.data = updates.data
WHEN NOT MATCHED
  THEN INSERT (date, eventId, data) VALUES (date, eventId, data)
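To make the two branches of the statement above concrete, here is a minimal plain-Python model of its row-level semantics. It is not Spark or Delta Lake code: the function name `merge_events`, the dictionary keyed by `eventId`, and the sample rows are all illustrative assumptions, and it assumes each `eventId` appears at most once in the updates.

```python
from typing import Any, Dict, Iterable

Row = Dict[str, Any]


def merge_events(events: Dict[int, Row], updates: Iterable[Row]) -> Dict[int, Row]:
    """Plain-Python model of the MERGE statement; not Spark or Delta code."""
    # Work on a copy so the input mapping is left untouched.
    merged = {event_id: dict(row) for event_id, row in events.items()}
    for update in updates:
        event_id = update["eventId"]
        if event_id in merged:
            # WHEN MATCHED THEN UPDATE SET events.data = updates.data
            merged[event_id]["data"] = update["data"]
        else:
            # WHEN NOT MATCHED THEN INSERT (date, eventId, data)
            merged[event_id] = {
                "date": update["date"],
                "eventId": event_id,
                "data": update["data"],
            }
    return merged


# Illustrative sample data (made up for this sketch).
events = {1: {"date": "2024-01-01", "eventId": 1, "data": "old"}}
updates = [
    {"date": "2024-01-02", "eventId": 1, "data": "new"},    # matches eventId 1
    {"date": "2024-01-02", "eventId": 2, "data": "fresh"},  # no match, inserted
]
result = merge_events(events, updates)
```

Note that the matched row keeps its original `date`, because the UPDATE branch only sets `data`.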

The command is also available through the Python/Scala API (Scala shown here):

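import io.delta.tables._

// updatesDF is a DataFrame holding the new and changed rows
DeltaTable.forPath(spark, "/data/events/")
  .as("events")
  .merge(
    updatesDF.as("updates"),
    "events.eventId = updates.eventId")
  // rows with a matching eventId: overwrite data
  .whenMatched
  .updateExpr(
    Map("data" -> "updates.data"))
  // rows without a match: insert as new events
  .whenNotMatched
  .insertExpr(
    Map(
      "date" -> "updates.date",
      "eventId" -> "updates.eventId",
      "data" -> "updates.data"))
  .execute()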
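The call chain above is Scala. A rough Python counterpart, assuming the `delta-spark` Python package is installed, could look like the sketch below. The helper name `upsert_events` is hypothetical; `target` is expected to be a `delta.tables.DeltaTable` and `updates_df` a Spark DataFrame.

```python
def upsert_events(target, updates_df):
    """Upsert updates_df into target (expected to be a delta.tables.DeltaTable)."""
    (
        target.alias("events")
        .merge(updates_df.alias("updates"), "events.eventId = updates.eventId")
        # Matched rows: overwrite the data column only.
        .whenMatchedUpdate(set={"data": "updates.data"})
        # Unmatched rows: insert them as new events.
        .whenNotMatchedInsert(
            values={
                "date": "updates.date",
                "eventId": "updates.eventId",
                "data": "updates.data",
            }
        )
        .execute()
    )


# Typical call site (needs an active SparkSession with Delta Lake enabled):
#   from delta.tables import DeltaTable
#   upsert_events(DeltaTable.forPath(spark, "/data/events/"), updatesDF)
```

Passing the table in as an argument keeps the helper independent of how the table is located (by path or by name).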

To support the Delta Lake format, you also need to add the delta package as a dependency of your Spark job:

<dependency>
  <groupId>io.delta</groupId>
  <artifactId>delta-core_x.xx</artifactId>
  <version>xxxx</version>
</dependency>
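Besides the dependency, the Delta Lake documentation also describes enabling the Delta SQL extension and catalog on the Spark session, which SQL commands such as MERGE INTO generally rely on. The sketch below shows one way to do that from Python. The helper names `delta_session_conf` and `build_session` are hypothetical, and the package coordinates are passed in by the caller, because the artifact name and version depend on your Spark and Scala versions (newer Delta Lake releases publish the artifact as `delta-spark`, so check the documentation for your version).

```python
from typing import Dict


def delta_session_conf(delta_package: str) -> Dict[str, str]:
    """Spark settings that pull in the Delta package and enable its SQL support."""
    return {
        # Maven coordinates in "groupId:artifactId:version" form, supplied by the caller.
        "spark.jars.packages": delta_package,
        "spark.sql.extensions": "io.delta.sql.DeltaSparkSessionExtension",
        "spark.sql.catalog.spark_catalog": "org.apache.spark.sql.delta.catalog.DeltaCatalog",
    }


def build_session(app_name: str, delta_package: str):
    """Create a SparkSession with the settings above applied."""
    # Imported lazily so delta_session_conf can be used without PySpark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in delta_session_conf(delta_package).items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

When submitting with `spark-submit`, the same three settings can instead be supplied as `--packages` and `--conf` options.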

See https://docs.delta.io/latest/delta-update.html#upsert-into-a-table-using-merge for more details.
