Delete rows from Azure SQL table using Azure Databricks with Scala
Problem Description
I am using Azure Databricks with Scala, and my goal is to delete some rows from an Azure SQL table.
To achieve this, I am using a pushdown query with JDBC as follows:
val pushdown_query = s"(DELETE FROM ${table_name} WHERE dump_date = '2020-01-07') temp"
val res = spark.read.jdbc(jdbcUrl, pushdown_query, connectionProperties)
However, I get the following error:
com.microsoft.sqlserver.jdbc.SQLServerException: A nested INSERT, UPDATE, DELETE, or MERGE statement must have an OUTPUT clause.
I added an OUTPUT clause to the pushdown query to solve this:
val pushdown_query = s"(DELETE FROM ${table_name} OUTPUT DELETED.dump_date WHERE dump_date = '2020-01-07') temp"
But now I am getting the following error:
com.microsoft.sqlserver.jdbc.SQLServerException: A nested INSERT, UPDATE, DELETE, or MERGE statement is not allowed in a SELECT statement that is not the immediate source of rows for an INSERT statement.
What am I doing wrong? How can I achieve this? Is there a better way?
Thanks.
Recommended Answer
I haven't found a way to delete rows from Azure SQL through Spark itself (spark.read.jdbc wraps the query argument inside a SELECT, which is why DML statements such as DELETE are rejected with the errors above), but I have implemented my own function in Scala using the plain java.sql API:
import java.sql.Connection
import java.sql.DriverManager
import java.sql.SQLException
import java.sql.Date
import java.time.LocalDate

// Set credentials
var jdbcUsername = "X"
var jdbcPassword = dbutils.secrets.get("X", "Y")

// Check that the JDBC driver is available
Class.forName("com.microsoft.sqlserver.jdbc.SQLServerDriver")

// Create the JDBC URL
var jdbcHostname = "X"
var jdbcPort = 1433
var jdbcDatabase = "X"
val jdbcUrl = s"jdbc:sqlserver://${jdbcHostname}:${jdbcPort};database=${jdbcDatabase};encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30"

def delete_dump_date(table_name: String, dump_date: String): Unit = {
  val driverClass = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
  var connObj: Connection = null
  var number_of_rows_deleted: Int = 0
  try {
    Class.forName(driverClass)
    connObj = DriverManager.getConnection(jdbcUrl, jdbcUsername, jdbcPassword)
    // The dump_date is bound as a JDBC parameter; the table name cannot be
    // parameterized, so it is interpolated directly into the statement.
    val statement = connObj.prepareStatement(s"DELETE FROM ${table_name} WHERE dump_date = ?")
    try {
      statement.setDate(1, Date.valueOf(LocalDate.parse(dump_date)))
      number_of_rows_deleted = statement.executeUpdate()
    } finally {
      statement.close()
      println(number_of_rows_deleted + " rows deleted.")
    }
  } catch {
    case e: SQLException => e.printStackTrace()
  } finally {
    // Guard against an NPE when getConnection itself failed
    if (connObj != null) connObj.close()
  }
}
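One detail worth noting: LocalDate.parse expects the ISO-8601 yyyy-MM-dd format, so dump_date must be passed in that shape, otherwise the function throws a DateTimeParseException before any JDBC work happens. A small standalone sketch of the conversion the function performs:

```scala
import java.sql.Date
import java.time.LocalDate

object DumpDateFormat extends App {
  // LocalDate.parse uses ISO-8601 (yyyy-MM-dd) by default.
  val parsed = Date.valueOf(LocalDate.parse("2020-01-07"))
  println(parsed)  // java.sql.Date.toString is also yyyy-MM-dd

  // A non-ISO string fails fast, before the query would ever run.
  val bad = scala.util.Try(LocalDate.parse("07/01/2020"))
  println(bad.isFailure)  // true
}
```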
You can call the function like this:
delete_dump_date(table_name, "2020-01-07")
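If you want to confirm the delete from the notebook, a read-only pushdown query still works through spark.read.jdbc, since only DML statements trigger the nested-statement errors above. A sketch, assuming the jdbcUrl defined earlier and the connectionProperties object from the question are in scope, and table_name is the same placeholder:

```scala
// SELECT pushdown is supported: Spark wraps this subquery in its own
// SELECT, which is only a problem for DML like DELETE.
val check_query = s"(SELECT COUNT(*) AS remaining FROM ${table_name} WHERE dump_date = '2020-01-07') temp"
val df = spark.read.jdbc(jdbcUrl, check_query, connectionProperties)
df.show()  // remaining should be 0 after delete_dump_date has run
```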