How to get autoincrement values for a column after uploading a Pandas dataframe to a MySQL database

Views: 185 | Published: 2020/8/11 1:20:36 | Tags: python, mysql, pandas, sqlalchemy


Problem description

I have a Pandas DataFrame (called df) that I would like to upload to a MySQL database. The dataframe has columns [A, B, C], and the table in the database has columns [ID, A, B, C]. The ID column in the database is the auto-incrementing primary key.

I can upload the dataframe to the database using the df.to_sql('table_name', engine) command. However, this does not tell me which values the database assigned to the ID column of the incoming rows. The only way I have of getting this information is to query the database using the values of columns A, B, C:

select 
ID, A, B, C 
from db_table 
where (A, B, C) in ((x1, y1, z1), (x2, y2, z2), ...)
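
For illustration, the lookup above could be built with placeholders rather than by pasting values into the SQL text. This is only a sketch: the helper name build_lookup_query is hypothetical, the table name db_table is taken from the query above, and the %s placeholder style assumes a MySQL driver such as PyMySQL or mysqlclient.

```python
def build_lookup_query(rows, table="db_table"):
    """Return (sql, params) for a row-constructor IN lookup on columns A, B, C.

    Hypothetical helper for illustration; not part of pandas or SQLAlchemy.
    """
    rows = [tuple(r) for r in rows]
    if not rows:
        raise ValueError("rows must not be empty")
    # One "(%s, %s, %s)" group per row; values travel separately as parameters.
    groups = ", ".join(["(%s, %s, %s)"] * len(rows))
    sql = f"select ID, A, B, C from {table} where (A, B, C) in ({groups})"
    # Flatten [(a1, b1, c1), (a2, b2, c2)] into [a1, b1, c1, a2, b2, c2].
    params = [value for row in rows for value in row]
    return sql, params


sql, params = build_lookup_query([(1, "x", 2.5), (2, "y", 3.5)])
print(sql)
print(params)
```

The resulting pair can be handed to a DB-API cursor as cursor.execute(sql, params). This keeps the query safe to build, but it does not by itself make the lookup faster for large uploads.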

However, this query takes a very long time when I am inserting a lot of data.

Is there a simpler and quicker way of getting the values that the database assigned to the ID column of the incoming data?

Edit 1: I can assign the ID column myself, as per user3364098's answer below. However, my job is part of a pipeline that is run in parallel. If I assign the ID column myself, I may assign the same ID values to different dataframes that are uploaded at the same time. This is why I would like to delegate the ID assignment to the database.
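
The collision risk can be seen in a small sketch with no database involved. The function name assign_ids and the numbers are made up for illustration; the sketch only assumes that each worker reads max(ID) and then numbers its own rows from there.

```python
def assign_ids(max_id, n_rows):
    """IDs a worker would assign to n_rows rows after reading max_id."""
    return list(range(max_id + 1, max_id + n_rows + 1))


# Two workers read max(ID) before either one has inserted anything.
max_id_seen_by_a = 100
max_id_seen_by_b = 100

ids_a = assign_ids(max_id_seen_by_a, 3)  # [101, 102, 103]
ids_b = assign_ids(max_id_seen_by_b, 2)  # [101, 102]

# Both workers picked the same primary keys.
collisions = sorted(set(ids_a) & set(ids_b))
print(collisions)  # [101, 102]
```

Unless the read of max(ID) and the insert happen as one protected step, the second worker's insert would fail with a duplicate-key error.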

Solution: I ended up assigning the ID column myself and locking the table while uploading the data, which guarantees that no other process uploads data with the same ID values. Basically:

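from sqlalchemy import text

# LOCK TABLES is per session, so run every statement on one connection.
with engine.begin() as conn:
    try:
        conn.execute(text('lock tables `table_name` write'))
        # coalesce() covers an empty table, where max(ID) is NULL
        max_id_query = text('select coalesce(max(ID), 0) from `table_name`')
        max_id = int(conn.execute(max_id_query).scalar())
        df['ID'] = range(max_id + 1, max_id + len(df) + 1)
        df.to_sql('table_name', conn, if_exists='append', index=False)
    finally:
        conn.execute(text('unlock tables'))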
