PySpark connection to Microsoft SQL Server?
Question
I have a huge dataset in SQL Server. I want to connect to SQL Server from Python and then run queries with PySpark.
I've looked at the JDBC driver but haven't found a way to make it work; I managed it with pyodbc, but not with Spark.
Any help would be greatly appreciated.
Answer
Use the following to connect to Microsoft SQL Server:
def connect_to_sql(
    spark, jdbc_hostname, jdbc_port, database, data_table, username, password
):
    # SQL Server JDBC URLs take the database as ';databaseName=...',
    # not the MySQL-style '/<database>' suffix.
    jdbc_url = "jdbc:sqlserver://{0}:{1};databaseName={2}".format(
        jdbc_hostname, jdbc_port, database
    )
    connection_details = {
        "user": username,
        "password": password,
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    }
    df = spark.read.jdbc(url=jdbc_url, table=data_table, properties=connection_details)
    return df
Here spark is a SparkSession object, and the rest of the arguments are self-explanatory.
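As a quick sanity check of the URL format, the string assembly can be tested on its own (a minimal sketch; the hostname, port, and database name below are made up):

```python
def build_jdbc_url(hostname, port, database):
    # Assemble a SQL Server JDBC URL; note the ';databaseName=' separator.
    return "jdbc:sqlserver://{0}:{1};databaseName={2}".format(hostname, port, database)

print(build_jdbc_url("dbhost.example.com", 1433, "sales"))
# jdbc:sqlserver://dbhost.example.com:1433;databaseName=sales
```

Note that the Microsoft JDBC driver jar must also be on Spark's classpath (for example via spark-submit's --jars or --packages option) for the "com.microsoft.sqlserver.jdbc.SQLServerDriver" class to be found.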
You can also pass pushdown queries to read.jdbc.
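For a pushdown query, instead of a bare table name you pass a parenthesized subquery with an alias as the table argument, so the filtering runs on the SQL Server side (a sketch; dbo.orders and its columns are hypothetical):

```python
def as_pushdown(query, alias):
    # Wrap a SELECT so spark.read.jdbc treats it as a derived table;
    # SQL Server requires the alias on the subquery.
    return "({0}) AS {1}".format(query, alias)

subquery = as_pushdown("SELECT id, amount FROM dbo.orders WHERE amount > 100", "orders")
print(subquery)
# (SELECT id, amount FROM dbo.orders WHERE amount > 100) AS orders
# df = connect_to_sql(spark, hostname, port, database, subquery, username, password)
```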