Return Pandas dataframe from PostgreSQL query with sqlalchemy
I want to query a PostgreSQL database and return the output as a Pandas dataframe.
I created a connection to the database with SQLAlchemy:
from sqlalchemy import create_engine
engine = create_engine('postgresql://user@localhost:5432/mydb')
I write a Pandas dataframe to a database table:
import pandas as pd
i = pd.read_csv(path)
i.to_sql('Stat_Table', engine, if_exists='replace')
Based on the docs, it looks like pd.read_sql_query() should accept an SQLAlchemy engine:
a=pd.read_sql_query('select * from Stat_Table',con=engine)
But it throws an error:
ProgrammingError: (ProgrammingError) relation "stat_table" does not exist
I'm using Pandas version 0.14.1.
What's the right way to do this?
You are being bitten by PostgreSQL's case (in)sensitivity rules for identifiers. If you quote the table name in the query, it will work:
df = pd.read_sql_query('select * from "Stat_Table"',con=engine)
Personally, though, I would advise always using lower-case table names (and column names), including when writing the table to the database, to prevent such issues.
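One way to follow the lower-case advice is to normalize names before they ever reach the database. The helper below is only a sketch: `to_lower_snake` is a hypothetical name, not part of pandas or SQLAlchemy, and the rule it applies (lower-case, non-alphanumeric runs become underscores) is an assumption about what counts as a "safe" name.

```python
import re


def to_lower_snake(name: str) -> str:
    """Return a lower-case name that PostgreSQL accepts without quoting.

    Hypothetical helper: it does not check for reserved words such as "user".
    """
    # Replace every run of characters other than letters, digits and "_".
    cleaned = re.sub(r"[^0-9a-zA-Z_]+", "_", name.strip())
    cleaned = cleaned.strip("_").lower()
    if not cleaned:
        raise ValueError(f"cannot derive an identifier from {name!r}")
    # An unquoted identifier must not start with a digit.
    if cleaned[0].isdigit():
        cleaned = "_" + cleaned
    return cleaned


print(to_lower_snake("Stat_Table"))         # stat_table
print(to_lower_snake("Total Sales (USD)"))  # total_sales_usd
```

With the dataframe from the question, this could be applied as `i.columns = [to_lower_snake(c) for c in i.columns]` followed by `i.to_sql(to_lower_snake('Stat_Table'), engine, if_exists='replace')`, after which an unquoted `select * from stat_table` should find the table.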
From the PostgreSQL docs (http://www.postgresql.org/docs/8.0/static/sql-syntax.html#SQL-SYNTAX-IDENTIFIERS):
Quoting an identifier also makes it case-sensitive, whereas unquoted names are always folded to lower case
To explain a bit more: you have written a table with the name Stat_Table to the database (and sqlalchemy will quote this name, so it will be written as "Stat_Table" in the postgres database). When doing the query 'select * from Stat_Table', the unquoted table name is converted to lower case, stat_table, and so you get the message that this table is not found.
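The folding rule can be illustrated without a database. The sketch below is a simplified model, not PostgreSQL's actual parser: `fold_identifier` and `quote_identifier` are hypothetical helpers, and the model ignores details such as `U&"..."` escapes and identifier length limits.

```python
def fold_identifier(identifier: str) -> str:
    """Approximate the name PostgreSQL looks up for a given identifier."""
    if len(identifier) >= 2 and identifier.startswith('"') and identifier.endswith('"'):
        # Quoted: case is preserved; a doubled quote stands for one quote.
        return identifier[1:-1].replace('""', '"')
    # Unquoted: folded to lower case.
    return identifier.lower()


def quote_identifier(name: str) -> str:
    """Wrap a name in double quotes, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


stored = "Stat_Table"  # the name to_sql created (written quoted)

print(fold_identifier("Stat_Table"))                        # stat_table
print(fold_identifier("Stat_Table") == stored)              # False
print(fold_identifier(quote_identifier(stored)) == stored)  # True
```

The unquoted reference resolves to stat_table, which does not match the stored name, while the quoted form matches exactly; this mirrors why only the quoted query succeeds.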
See also, e.g., the question "Are PostgreSQL column names case-sensitive?"