Generate SQL statements from a Pandas DataFrame
Question
I am loading data from various sources (CSV, XLS, JSON, etc.) into Pandas dataframes, and I would like to generate the statements that create a SQL database and fill it with this data. Does anyone know of a way to do this?
I know pandas has a to_sql function, but that only works with a database connection; it cannot generate a string.
What I would like is to take a dataframe like this:
import pandas as pd
import numpy as np
dates = pd.date_range('20130101',periods=6)
df = pd.DataFrame(np.random.randn(6,4),index=dates,columns=list('ABCD'))
And a function that would generate this (this example is PostgreSQL, but any dialect would be fine):
CREATE TABLE data
(
index timestamp with time zone,
"A" double precision,
"B" double precision,
"C" double precision,
"D" double precision
)
Recommended answer
If you only want the CREATE TABLE SQL code (and not the insertion of the data), you can use the get_schema function of the pandas.io.sql module:
In [10]: print(pd.io.sql.get_schema(df.reset_index(), 'data'))
CREATE TABLE "data" (
"index" TIMESTAMP,
"A" REAL,
"B" REAL,
"C" REAL,
"D" REAL
)
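To see what the reset_index() call contributes, the two variants can be compared side by side. This is a minimal sketch that rebuilds the dataframe from the question and calls get_schema without a connection, so it assumes the default type names pandas falls back to (the ones shown in the output above); the variable names are only illustrative.

```python
import numpy as np
import pandas as pd
from pandas.io.sql import get_schema

# Same dataframe as in the question: a date index and four float columns.
dates = pd.date_range("20130101", periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list("ABCD"))

# Without reset_index(), only the four data columns are described.
schema_without_index = get_schema(df, "data")

# reset_index() turns the unnamed index into a regular column called "index",
# so it shows up in the generated statement.
schema_with_index = get_schema(df.reset_index(), "data")

print(schema_without_index)
print(schema_with_index)
```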
Some notes:
- I had to use reset_index, because otherwise the index was not included.
- If you provide an SQLAlchemy engine for a particular database flavor, the result is adjusted to that flavor (e.g. the data type names).
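The answer covers only the CREATE TABLE part. For the data itself, one possible approach is to build the INSERT statements by hand from the rows. The sketch below is not a pandas feature: sql_literal and insert_statements are hypothetical helpers written for illustration, and the quoting is deliberately simple (NaN and None become NULL, single quotes are doubled, everything that is not a number or boolean is written as quoted text). It assumes finite numeric values and a dialect that accepts double-quoted identifiers, so check it against your target database before relying on it.

```python
import numbers

import pandas as pd


def sql_literal(value):
    """Render one Python value as a SQL literal (hypothetical helper)."""
    if pd.isna(value):
        return "NULL"
    if isinstance(value, bool):
        return "TRUE" if value else "FALSE"
    if isinstance(value, numbers.Integral):
        return str(int(value))
    if isinstance(value, numbers.Real):
        return repr(float(value))
    # Everything else (strings, timestamps, ...) is written as quoted text,
    # with single quotes doubled as standard SQL requires.
    return "'" + str(value).replace("'", "''") + "'"


def insert_statements(frame, table):
    """Yield one INSERT statement per row of frame (hypothetical helper)."""
    columns = ", ".join('"{}"'.format(c) for c in frame.columns)
    for row in frame.itertuples(index=False, name=None):
        values = ", ".join(sql_literal(v) for v in row)
        yield 'INSERT INTO "{}" ({}) VALUES ({});'.format(table, columns, values)


# Small illustrative dataframe; the None in column A is stored as NaN.
example = pd.DataFrame({"A": [1.5, None], "B": ["it's", "x"]})
statements = list(insert_statements(example, "data"))
for statement in statements:
    print(statement)
```

For the dataframe in the question, calling insert_statements(df.reset_index(), 'data') would include the index column as well. If a database connection is available, to_sql remains the safer route, since it leaves value quoting to the database driver.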