Trying to insert a pandas DataFrame into a temporary table


Question

I'm looking to create a temp table and insert some data into it. I have used pyodbc extensively to pull data, but I am not familiar with writing data to SQL from a Python environment. I am doing this at work, so I don't have the ability to create tables, but I can create temp and global temp tables. My intent is to insert a relatively small dataframe (150 rows x 4 cols) into a temp table and reference it throughout my session; my program structure makes it so that a global variable in the session will not suffice. I am getting the following error when trying the piece below. What am I doing wrong?

pyodbc.ProgrammingError: ('42S02', "[42S02] [Microsoft][ODBC SQL Server Driver][SQL Server]Invalid object name 'sqlite_master'. (208) (SQLExecDirectW); [42S02] [Microsoft][ODBC SQL Server Driver][SQL Server]Statement(s) could not be prepared. (8180)")

import numpy as np
import pandas as pd
import pyodbc


conn = pyodbc.connect('Driver={SQL Server};'
                      'Server=SERVER;'
                      'Database=DATABASE;'
                      'Trusted_Connection=yes;')

cursor = conn.cursor()

temp_creator = '''CREATE TABLE #rankings (Col1 int, Col2 int)'''

cursor.execute(temp_creator)

df_insert = pd.DataFrame({'Col1' : [1, 2, 3], 'Col2':[4,5,6]})
df_insert.to_sql(r'#rankings', conn, if_exists='append')
read_query = '''SELECT * FROM #rankings'''
df_back = pd.read_sql(read_query,conn)

Recommended answer

DataFrame.to_sql fails there because pandas only accepts a SQLAlchemy connectable or a sqlite3 connection. Handed a raw pyodbc connection, it falls back to its SQLite code path and queries sqlite_master, which SQL Server does not have; that is the invalid object name in the error. But for SQL Server 2016+/Azure SQL Database there's a better way in any case. Instead of having pandas insert each row, send the whole dataframe to the server in JSON format and insert it in a single statement. Like this:

import pandas as pd
import pyodbc

conn = pyodbc.connect('Driver={Sql Server};'
                      'Server=localhost;'
                      'Database=tempdb;'
                      'Trusted_Connection=yes;')

cursor = conn.cursor()

# The temp table lives for as long as this connection stays open.
temp_creator = '''CREATE TABLE #rankings (Col1 int, Col2 int);'''
cursor.execute(temp_creator)

df_insert = pd.DataFrame({'Col1' : [1, 2, 3], 'Col2':[4,5,6]})

# Serialize the frame as a JSON array with one object per row.
df_json = df_insert.to_json(orient='records')
print(df_json)

# OPENJSON turns the JSON parameter back into typed rows on the server.
load_df = """\
insert into #rankings(Col1, Col2)
select Col1, Col2
from openjson(?)
with 
(
  Col1 int '$.Col1',
  Col2 int '$.Col2'
);
"""

cursor.execute(load_df,df_json)

# Read the rows back over the same connection to confirm the insert.
read_query = '''SELECT * FROM #rankings'''
df_back = pd.read_sql(read_query,conn)
print(df_back)

Which outputs:

[{"Col1":1,"Col2":4},{"Col1":2,"Col2":5},{"Col1":3,"Col2":6}]
   Col1  Col2
0     1     4
1     2     5
2     3     6
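
For a frame with more columns, such as the 150 x 4 one in the question, the OPENJSON statement can be generated rather than typed by hand. The following is only a minimal sketch of that idea, not part of the original answer: build_openjson_insert and rows_to_json are hypothetical helper names, the column-to-type mapping is something the caller has to supply, and the name check assumes plain identifiers (letters, digits, underscores, with an optional leading # or ##). It uses only the standard library; rows_to_json is meant to produce the same shape as df_insert.to_json(orient='records').

```python
import json
import re

# Hypothetical helpers; names and validation rules are assumptions of this sketch.
_IDENT = re.compile(r"^#{0,2}[A-Za-z_][A-Za-z0-9_]*$")
_SQL_TYPE = re.compile(r"^[A-Za-z0-9_(), ]+$")


def _check_identifier(name):
    """Reject anything that is not a plain table or column name."""
    if not _IDENT.match(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def build_openjson_insert(table, column_types):
    """Build an 'insert ... select ... from openjson(?)' statement.

    column_types maps column name -> SQL Server type, e.g. {"Col1": "int"}.
    The JSON text itself is passed later as the single '?' parameter.
    """
    _check_identifier(table)
    cols = [_check_identifier(c) for c in column_types]
    for sql_type in column_types.values():
        if not _SQL_TYPE.match(sql_type):
            raise ValueError(f"unsupported SQL type: {sql_type!r}")
    col_list = ", ".join(cols)
    with_lines = ",\n".join(f"  {c} {column_types[c]} '$.{c}'" for c in cols)
    return (
        f"insert into {table}({col_list})\n"
        f"select {col_list}\n"
        "from openjson(?)\n"
        "with\n(\n"
        f"{with_lines}\n"
        ");"
    )


def rows_to_json(rows):
    """Serialize a list of row dicts as a compact JSON array of objects."""
    return json.dumps(rows, separators=(",", ":"))


# Same table and data as in the answer above.
statement = build_openjson_insert("#rankings", {"Col1": "int", "Col2": "int"})
payload = rows_to_json([
    {"Col1": 1, "Col2": 4},
    {"Col1": 2, "Col2": 5},
    {"Col1": 3, "Col2": 6},
])
print(statement)
print(payload)
```

With an open pyodbc cursor, the pair would then be run as cursor.execute(statement, payload), exactly like load_df and df_json above. Only the JSON travels as a bound parameter; the table and column names are interpolated into the SQL text, which is why the sketch validates them first.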
