使用IDX AutoIncrement将数据框插入到PostgreSQL SQLAlchemy中 [英] Insert dataframe into postgresql sqlalchemy with idx autoincrement

查看:69
本文介绍了使用IDX AutoIncrement将数据框插入到PostgreSQL SQLAlchemy中的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我要requests.get()来获取一些json.之后,我想将数据插入到postgresql中.发生了一件非常有趣的事情,如果使用df.to_sql(index=False),则数据毫无问题地追加到了postgresql中,但是postgresql中的ID并没有创建autoincrement值.该列完全为空.如果消除df.to_sql()中的参数,则会出现以下错误... IntegrityError: (psycopg2.IntegrityError) duplicate key value violates unique constraint.这是我的代码...

I'm requests.get() to get some json. After that, I want to insert the data into postgresql. Something very interesting is happening, if I use the df.to_sql(index=False), the data gets appended into postgresql with no problem, but the Id in postgresql is not creating the autoincrement value; the column is totally empty. If I eliminate the parameter in df.to_sql() then I get the following error... IntegrityError: (psycopg2.IntegrityError) duplicate key value violates unique constraint. Here is my code...

import requests
import pandas as pd
import sqlalchemy

urls = ['https://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20yahoo.finance.historicaldata%20where%20symbol%20%3D%20%22DIA%22%20and%20startDate%20%3D%20%222015-01-01%22%20and%20endDate%20%3D%20%222015-12-31%22&format=json&diagnostics=true&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys&callback=',
    'https://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20yahoo.finance.historicaldata%20where%20symbol%20%3D%20%22DIA%22%20and%20startDate%20%3D%20%222016-01-01%22%20and%20endDate%20%3D%20%222016-11-08%22&format=json&diagnostics=true&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys&callback=',
    'https://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20yahoo.finance.historicaldata%20where%20symbol%20%3D%20%22SPY%22%20and%20startDate%20%3D%20%222015-01-01%22%20and%20endDate%20%3D%20%222015-12-31%22&format=json&diagnostics=true&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys&callback=',
    'https://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20yahoo.finance.historicaldata%20where%20symbol%20%3D%20%22SPY%22%20and%20startDate%20%3D%20%222016-01-01%22%20and%20endDate%20%3D%20%222016-11-08%22&format=json&diagnostics=true&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys&callback=',
    'https://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20yahoo.finance.historicaldata%20where%20symbol%20%3D%20%22IWN%22%20and%20startDate%20%3D%20%222015-01-01%22%20and%20endDate%20%3D%20%222015-12-31%22&format=json&diagnostics=true&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys&callback=',
    'https://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20yahoo.finance.historicaldata%20where%20symbol%20%3D%20%22IWN%22%20and%20startDate%20%3D%20%222016-01-01%22%20and%20endDate%20%3D%20%222016-11-08%22&format=json&diagnostics=true&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys&callback=']
df_list = []
for url in urls:
    data = requests.get(url)
    data_json = data.json()
    df = pd.DataFrame(data_json['query']['results']['quote'])
    df_list.append(df)


quote_df = pd.concat(df_list)
engine = sqlalchemy.create_engine('postgresql://postgres:wpc,.2016@localhost:5432/stocks')
quote_df.to_sql('quotes', engine, if_exists='append')

我想用postgresql自动增量索引将df插入到postgresql中. 我该如何修复我的代码.

I would like to insert the df into postgresql with the postgresql autoincrement index. How can I fix my code to do so.

我添加以下代码来修复数据框中的索引...

I add the following code to fix the indexing in the data frame...

quote_df = pd.concat(df_list)
quote_df.index.name = 'Index'
quote_df = quote_df.reset_index()
quote_df['Index'] = quote_df.index

engine = create_engine('postgresql://postgres:wpc,.2016@localhost:5432/stocks')

quote_df.to_sql('quotes',engine,if_exists ='append',index = False) engine.dispose()

quote_df.to_sql('quotes', engine, if_exists = 'append', index=False) engine.dispose()

现在追加到postgresql时出现以下错误...

Now I'm having the following error when appending to postgresql...

ProgrammingError: (psycopg2.ProgrammingError) column "Index" of relation "quotes" does not exist LINE 1: INSERT INTO quotes ("Index", "Adj_Close", "Close", "Date", "... 

该列确实存在于数据库中.

The column does exists in the database.

推荐答案

回答我自己的问题11NOV2016 1112

我发现在df.reset_index()之后,我可以删除创建的多余列pandas并且原始索引列保持重置状态.现在,如果执行不带index=False的代码,则sqlalchemy会将索引插入到postgres中.这是解决我的问题的代码...

Answer to my own question 11NOV2016 1112

I figure out that after df.reset_index(), I can delete the extra column pandas create and the original index column stays reset. Now if I execute the code without index=False, sqlalchemy will insert the index into postgres. Here is the code that solve my problem...

import requests
import pandas as pd
from sqlalchemy import create_engine

urls = ['https://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20yahoo.finance.historicaldata%20where%20symbol%20%3D%20%22DIA%22%20and%20startDate%20%3D%20%222015-01-01%22%20and%20endDate%20%3D%20%222015-12-31%22&format=json&diagnostics=true&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys&callback=',
    'https://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20yahoo.finance.historicaldata%20where%20symbol%20%3D%20%22DIA%22%20and%20startDate%20%3D%20%222016-01-01%22%20and%20endDate%20%3D%20%222016-11-11%22&format=json&diagnostics=true&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys&callback=',
    'https://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20yahoo.finance.historicaldata%20where%20symbol%20%3D%20%22SPY%22%20and%20startDate%20%3D%20%222015-01-01%22%20and%20endDate%20%3D%20%222015-12-31%22&format=json&diagnostics=true&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys&callback=',
    'https://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20yahoo.finance.historicaldata%20where%20symbol%20%3D%20%22SPY%22%20and%20startDate%20%3D%20%222016-01-01%22%20and%20endDate%20%3D%20%222016-11-11%22&format=json&diagnostics=true&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys&callback=',
    'https://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20yahoo.finance.historicaldata%20where%20symbol%20%3D%20%22IWN%22%20and%20startDate%20%3D%20%222015-01-01%22%20and%20endDate%20%3D%20%222015-12-31%22&format=json&diagnostics=true&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys&callback=',
    'https://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20yahoo.finance.historicaldata%20where%20symbol%20%3D%20%22IWN%22%20and%20startDate%20%3D%20%222016-01-01%22%20and%20endDate%20%3D%20%222016-11-11%22&format=json&diagnostics=true&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys&callback=']

df_list = []
for url in urls:
    data = requests.get(url)
    data_json = data.json()
    df = pd.DataFrame(data_json['query']['results']['quote'])
    df_list.append(df)

quote_df = pd.concat(df_list)
quote_df = quote_df.reset_index()
quote_df = quote_df.drop('index', 1)


engine = create_engine('postgresql://postgres:wpc,.2016@localhost:5432/stocks')
quote_df.to_sql('quotes', engine, if_exists='append')

这篇关于使用IDX AutoIncrement将数据框插入到PostgreSQL SQLAlchemy中的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆