Writing JSON column to Postgres using Pandas .to_sql
Problem description
During an ETL process I needed to extract a JSON column from one Postgres database and load it into another. We use Pandas for this because it offers many ways to read and write data from different sources and destinations, and all the transformations can be written in Python and Pandas. We're quite happy with the approach, to be honest, but we hit a problem.
Usually it's quite easy to read and write the data: you use pandas.read_sql_table to read from the source and DataFrame.to_sql to write to the destination. But because one of the source tables had a column of type JSON (from Postgres), the to_sql function crashed with the following error message.
df.to_sql(table_name, analytics_db)
File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/pandas/core/generic.py", line 1201, in to_sql
chunksize=chunksize, dtype=dtype)
File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/pandas/io/sql.py", line 470, in to_sql
chunksize=chunksize, dtype=dtype)
File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/pandas/io/sql.py", line 1147, in to_sql
table.insert(chunksize)
File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/pandas/io/sql.py", line 663, in insert
self._execute_insert(conn, keys, chunk_iter)
File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/pandas/io/sql.py", line 638, in _execute_insert
conn.execute(self.insert_statement(), data)
File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 945, in execute
return meth(self, multiparams, params)
File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/sqlalchemy/sql/elements.py", line 263, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1053, in _execute_clauseelement
compiled_sql, distilled_params
File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1189, in _execute_context
context)
File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1393, in _handle_dbapi_exception
exc_info
File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/sqlalchemy/util/compat.py", line 202, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb, cause=cause)
File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1159, in _execute_context
context)
File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/sqlalchemy/engine/default.py", line 459, in do_executemany
cursor.executemany(statement, parameters)
sqlalchemy.exc.ProgrammingError: (psycopg2.ProgrammingError) can't adapt type 'dict'
Recommended answer
I searched the web for a solution but couldn't find one, so here is what we came up with (there may be better ways, but this is at least a start if someone else runs into the same problem).
Specify the dtype parameter in to_sql.
We went from:
df.to_sql(table_name, analytics_db)
to:
df.to_sql(table_name, analytics_db, dtype={'name_of_json_column_in_source_table': sqlalchemy.types.JSON})
and it just works. With the explicit type, SQLAlchemy serializes the Python dict values to JSON before they reach psycopg2, which cannot adapt a dict on its own.
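For anyone who wants to try the dtype approach end to end, here is a minimal sketch. It is not the original ETL code: the helper json_dtype_map, the table name events, and the column names are hypothetical, and an in-memory SQLite engine stands in for the Postgres destination so the snippet runs without a server. It also assumes a reasonably recent pandas and SQLAlchemy rather than the Python 2.7 setup in the traceback.

```python
import pandas as pd
import sqlalchemy
from sqlalchemy import create_engine


def json_dtype_map(df):
    """Hypothetical helper: map every column that holds dict or list
    values to sqlalchemy.types.JSON, for use as the dtype argument."""
    return {
        col: sqlalchemy.types.JSON
        for col in df.columns
        if df[col].map(lambda v: isinstance(v, (dict, list))).any()
    }


# Assumption: in-memory SQLite stands in for the destination database.
engine = create_engine("sqlite://")

# A frame shaped like the result of reading a table with a JSON column:
# the JSON values arrive as Python dicts.
df = pd.DataFrame({
    "id": [1, 2],
    "payload": [{"a": 1}, {"b": [1, 2]}],
})

# Build the mapping instead of naming the JSON column by hand.
dtype = json_dtype_map(df)

# The explicit dtype lets SQLAlchemy serialize the dicts on insert.
df.to_sql("events", engine, index=False, dtype=dtype)
```

Against Postgres the same dtype mapping should apply; only the engine changes. Building the mapping from the data is a convenience for tables with several JSON columns. Passing the column name directly, as in the answer above, works just as well.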