Generate SQL statements from a Pandas DataFrame


Question

I am loading data from various sources (csv, xls, json, etc.) into Pandas dataframes, and I would like to generate the statements that create a SQL database and fill it with this data. Does anyone know of a way to do this?

I know pandas has a to_sql function, but that only works on a database connection; it cannot generate a string.

What I would like is to take a dataframe like this:

import pandas as pd
import numpy as np

dates = pd.date_range('20130101',periods=6)
df = pd.DataFrame(np.random.randn(6,4),index=dates,columns=list('ABCD'))

And have a function that would generate this (this example is PostgreSQL, but any flavor would be fine):

CREATE TABLE data
(
  index timestamp with time zone,
  "A" double precision,
  "B" double precision,
  "C" double precision,
  "D" double precision
)

Recommended answer

If you only want the 'CREATE TABLE' SQL code (and not the insertion of the data), you can use the get_schema function of the pandas.io.sql module:

In [10]: print(pd.io.sql.get_schema(df.reset_index(), 'data'))
CREATE TABLE "data" (
  "index" TIMESTAMP,
  "A" REAL,
  "B" REAL,
  "C" REAL,
  "D" REAL
)

Some notes:

  • I had to use reset_index because otherwise the index was not included.
  • If you provide a SQLAlchemy engine for a particular database flavor, the result is adjusted to that flavor (e.g. the data type names).
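The answer covers only the CREATE TABLE part. For the "fill" part of the question, one possible approach is to build INSERT statements as strings by hand. The following is a minimal sketch, not a pandas feature: the helper names sql_literal and generate_sql are hypothetical, and the literal formatting is deliberately simplified (NULL for missing values, numbers unquoted, everything else single-quoted). It assumes get_schema is called without a connection, so the type names follow pandas' SQLite fallback, and it does not handle infinities, binary data, or flavor-specific quoting.

```python
import numpy as np
import pandas as pd


def sql_literal(value):
    """Render one scalar as a SQL literal (simplified, illustrative only)."""
    if pd.isna(value):
        return "NULL"
    if isinstance(value, (bool, np.bool_)):
        return "1" if value else "0"
    if isinstance(value, (float, np.floating)):
        # Convert to a plain Python float so repr() gives a bare number.
        return repr(float(value))
    if isinstance(value, (int, np.integer)):
        return str(int(value))
    # Timestamps, strings, etc. are quoted; embedded single quotes are doubled.
    return "'" + str(value).replace("'", "''") + "'"


def generate_sql(df, table_name):
    """Return a CREATE TABLE statement plus one INSERT per row, as one string.

    Hypothetical helper: the schema comes from pandas.io.sql.get_schema,
    the INSERT statements are assembled by hand.
    """
    # reset_index() turns the index into a regular column so it is included.
    frame = df.reset_index()
    statements = [pd.io.sql.get_schema(frame, table_name) + ";"]

    # Quote column names, doubling any embedded double quotes.
    columns = ", ".join(
        '"{}"'.format(str(col).replace('"', '""')) for col in frame.columns
    )
    for row in frame.itertuples(index=False, name=None):
        values = ", ".join(sql_literal(v) for v in row)
        statements.append(
            'INSERT INTO "{}" ({}) VALUES ({});'.format(table_name, columns, values)
        )
    return "\n".join(statements)
```

Calling generate_sql(df, 'data') on the dataframe from the question returns a script that starts with the CREATE TABLE statement shown above, followed by six INSERT statements. Hand-built literals like these are only suitable for generating scripts from trusted data; when a live connection is available, to_sql or parameterized queries are the safer choice.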
