Upsert / merge tables in SQLite
Problem description
I have created a database using sqlite3 in Python that has thousands of tables. Each of these tables contains thousands of rows and ten columns. One of the columns is the date and time of an event: it is a string formatted as YYYY-mm-dd HH:MM:SS, which I have defined as the primary key of each table. Every so often, I collect some new data (hundreds of rows) for each of these tables. Each new dataset is pulled from a server and is either loaded directly as a pandas data frame or stored as a CSV file. The new data contains the same ten columns as my original data. I need to update the tables in my database using this new data in the following way:
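- Given a table in my database, for each row in the new dataset, if the date and time of the row matches the date and time of an existing row in my database, update the remaining columns of that row using the values in the new dataset.
- If the date and time does not yet exist, create a new row and insert it into my database.
Here are my questions: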
- I've done some searching on Google and it looks like I should be using the UPSERT (merge) functionality of sqlite, but I can't seem to find any examples showing how to use it. Is there an actual UPSERT command, and if so, could someone please provide an example (preferably with sqlite3 in Python) or point me to a helpful resource?
- Also, is there a way to do this in bulk so that I can UPSERT each new dataset into my database without having to go row by row? (I found this link, which suggests that it is possible, but I'm new to using databases and am not sure how to actually run the UPSERT command.)
- Can UPSERT also be performed directly using pandas.DataFrame.to_sql?
My backup solution is to load the table to be UPSERTed using pd.read_sql_query("SELECT * from table", con), perform pandas.DataFrame.merge, delete that table from the database, and then add the updated table back to the database using pd.DataFrame.to_sql (but this would be inefficient).
Recommended answer
First, even though the questions are related, please ask them separately in the future.
The SQLite documentation covers UPSERT handling and how to use it, but it is a bit abstract. You can find examples and discussion here: SQLite - UPSERT *not* INSERT or REPLACE
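As a concrete illustration, here is a minimal sketch of the UPSERT syntax (INSERT ... ON CONFLICT ... DO UPDATE, available in SQLite 3.24.0 and later) run through sqlite3 in Python. The table name events and the columns value and label are made up for the example; only the timestamp primary key mirrors the question.

```python
import sqlite3

# Hypothetical table: "events", "value" and "label" are illustrative names.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE events (ts TEXT PRIMARY KEY, value REAL, label TEXT)")
con.execute("INSERT INTO events VALUES ('2020-01-01 00:00:00', 1.0, 'old')")

# "excluded" refers to the row that the INSERT tried to add.
upsert_sql = """
INSERT INTO events (ts, value, label)
VALUES (?, ?, ?)
ON CONFLICT(ts) DO UPDATE SET
    value = excluded.value,
    label = excluded.label
"""

# Timestamp already present: the existing row is updated.
con.execute(upsert_sql, ("2020-01-01 00:00:00", 2.0, "new"))
# Timestamp not present: a new row is inserted.
con.execute(upsert_sql, ("2020-01-02 00:00:00", 3.0, "added"))
con.commit()

rows = con.execute("SELECT ts, value, label FROM events ORDER BY ts").fetchall()
print(rows)
```

Unlike INSERT OR REPLACE, this form updates the conflicting row in place instead of deleting it and inserting a new one.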
Use a transaction: wrap the statements in a single transaction so that they are executed and committed in bulk rather than one row at a time.
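One way this could look with sqlite3 is sketched below: executemany sends every row through the same UPSERT statement, and using the connection as a context manager commits once at the end (or rolls back if something fails). The helper upsert_rows and the table layout are hypothetical, and the table and column names are interpolated into the SQL, so they have to come from trusted code rather than user input. For a DataFrame df, the rows could probably be produced with df.itertuples(index=False, name=None).

```python
import sqlite3


def upsert_rows(con, table, columns, key, rows):
    """Upsert an iterable of row tuples into `table` in one transaction.

    Hypothetical helper: `table`, `columns` and `key` must be trusted
    identifiers because they are formatted directly into the SQL text.
    """
    col_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    updates = ", ".join(f"{c} = excluded.{c}" for c in columns if c != key)
    sql = (
        f"INSERT INTO {table} ({col_list}) VALUES ({placeholders}) "
        f"ON CONFLICT({key}) DO UPDATE SET {updates}"
    )
    # "with con" commits on success and rolls back on an exception.
    with con:
        con.executemany(sql, rows)


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE events (ts TEXT PRIMARY KEY, value REAL)")
con.execute("INSERT INTO events VALUES ('2020-01-01 00:00:00', 1.0)")

new_rows = [
    ("2020-01-01 00:00:00", 10.0),  # existing timestamp, gets updated
    ("2020-01-01 00:01:00", 20.0),  # new timestamp, gets inserted
]
upsert_rows(con, "events", ["ts", "value"], "ts", new_rows)

result = con.execute("SELECT ts, value FROM events ORDER BY ts").fetchall()
print(result)
```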
As the presence of this library suggests, to_sql does not create UPSERT commands (only INSERT).
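A possible workaround, not part of the original answer, is to let to_sql do what it can (a plain INSERT into a staging table) and then merge the staging table into the real one with a single INSERT ... SELECT ... ON CONFLICT statement. The sketch below fills the staging table with plain sqlite3 so that it runs without pandas; the names events and events_staging are assumptions. With a DataFrame df, the staging step would be something like df.to_sql("events_staging", con, if_exists="replace", index=False).

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE events (ts TEXT PRIMARY KEY, value REAL)")
con.execute("INSERT INTO events VALUES ('2020-01-01 00:00:00', 1.0)")

# Stand-in for the pandas step (assumed usage):
#   df.to_sql("events_staging", con, if_exists="replace", index=False)
con.execute("CREATE TABLE events_staging (ts TEXT, value REAL)")
con.executemany(
    "INSERT INTO events_staging VALUES (?, ?)",
    [("2020-01-01 00:00:00", 5.0), ("2020-01-03 00:00:00", 7.0)],
)

with con:
    # "WHERE true" is needed so the parser does not read ON CONFLICT
    # as the ON clause of a join in the SELECT.
    con.execute(
        """
        INSERT INTO events (ts, value)
        SELECT ts, value FROM events_staging WHERE true
        ON CONFLICT(ts) DO UPDATE SET value = excluded.value
        """
    )
    con.execute("DROP TABLE events_staging")

final = con.execute("SELECT ts, value FROM events ORDER BY ts").fetchall()
print(final)
```

This keeps the whole merge inside the database, so the existing table never has to be loaded into pandas or dropped and rewritten.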