Delete rows from Azure Sql table using Azure Databricks with Scala


Question

I am using Azure Databricks with Scala and my goal is to delete some rows from the Azure SQL table.

To achieve this, I am using a pushdown query with JDBC as follows:

val pushdown_query = s"(DELETE FROM ${table_name} WHERE dump_date = '2020-01-07') temp"
val res = spark.read.jdbc(jdbcUrl, pushdown_query, connectionProperties)  
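The root cause is that spark.read.jdbc treats its second argument as a table expression and wraps it in a SELECT before sending it to SQL Server, which is why the DELETE ends up as a nested DML statement. A minimal sketch of the wrapping (the helper name is illustrative, not Spark's actual internal API):

```scala
// Spark's JDBC reader composes queries of roughly the form
// "SELECT * FROM <tableExpr>", so any DML passed as <tableExpr>
// becomes a nested statement inside a SELECT.
def wrappedQuery(tableExpr: String): String =
  s"SELECT * FROM $tableExpr"

val pushdown = "(DELETE FROM events WHERE dump_date = '2020-01-07') temp"
// -> SELECT * FROM (DELETE FROM events WHERE dump_date = '2020-01-07') temp
println(wrappedQuery(pushdown))
```

This is why the read path can only ever run statements that are valid inside a SELECT.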

However, I get the following error:

com.microsoft.sqlserver.jdbc.SQLServerException: A nested INSERT, UPDATE, DELETE, or MERGE statement must have an OUTPUT clause.

I added an OUTPUT clause to the pushdown query to solve this:

val pushdown_query = s"(DELETE FROM ${table_name} OUTPUT DELETED.dump_date WHERE dump_date = '2020-01-07') temp"

But now I am getting the following error:

com.microsoft.sqlserver.jdbc.SQLServerException: A nested INSERT, UPDATE, DELETE, or MERGE statement is not allowed in a SELECT statement that is not the immediate source of rows for an INSERT statement.

What am I doing wrong? How can I achieve this? Is there a better way?

Thanks.

Answer

I haven't found a way to use Spark to delete rows from Azure SQL, but I have implemented my own function in Scala using Java libraries:

import java.util.Properties
import java.sql.Connection
import java.sql.DatabaseMetaData
import java.sql.DriverManager
import java.sql.SQLException
import java.sql.Date
import java.time.LocalDate

  
// Set credentials
var jdbcUsername = "X"
var jdbcPassword = dbutils.secrets.get("X", "Y")

// Check that the JDBC driver is available
Class.forName("com.microsoft.sqlserver.jdbc.SQLServerDriver")

// Create the JDBC URL
var jdbcHostname = "X"
var jdbcPort = 1433
var jdbcDatabase = "X"
val jdbcUrl = s"jdbc:sqlserver://${jdbcHostname}:${jdbcPort};database=${jdbcDatabase};encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30"

def delete_dump_date(table_name: String, dump_date: String): Unit = {

  val driverClass = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
  var connObj: Connection = null
  var number_of_rows_deleted: Int = 0
  try {
      Class.forName(driverClass)
      connObj = DriverManager.getConnection(jdbcUrl, jdbcUsername, jdbcPassword)
      // Bind the date as a parameter; table names cannot be bound, so the name is interpolated.
      val statement = connObj.prepareStatement(String.format("DELETE FROM %s WHERE dump_date=?", table_name))
      try {
          statement.setDate(1, Date.valueOf(LocalDate.parse(dump_date)))
          number_of_rows_deleted = statement.executeUpdate()
          println(number_of_rows_deleted + " rows deleted.")
      }
      finally {
          statement.close()
      }
  }
  catch {
      case e: SQLException => e.printStackTrace()
  }
  finally {
      if (connObj != null) connObj.close()  // guard against a failed connection
  }
}
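Since the table name is interpolated directly into the SQL text (JDBC cannot bind identifiers as parameters), it is worth validating it before calling the function. A minimal sketch of such a guard; the regex and object name are assumptions, not part of the original answer:

```scala
object SqlIdentifier {
  // Accept only simple [schema.]table identifiers: letters, digits, underscores.
  private val Valid = "^[A-Za-z_][A-Za-z0-9_]*(\\.[A-Za-z_][A-Za-z0-9_]*)?$".r

  def isValidTableName(name: String): Boolean =
    Valid.pattern.matcher(name).matches()

  // Build the DELETE text only for names that pass the check.
  def deleteSql(tableName: String): String = {
    require(isValidTableName(tableName), s"Unsafe table name: $tableName")
    s"DELETE FROM $tableName WHERE dump_date=?"
  }
}
```

This keeps arbitrary strings (for example anything containing spaces or semicolons) out of the statement text.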

You can call the function like this:

delete_dump_date(table_name, "2020-01-07")
