Replacing duplicates with pandas to_sql (sqlite)


Problem description

I am appending pandas dataframes to sqlite. My primary key is:

Datetime | UserID | CustomerID

My issue is that sometimes I get a new file that includes old data, and I want to append it to the existing sqlite table. I am not reading that table into memory, so I can't use drop_duplicates in pandas. (For example, one file always contains month-to-date data and is sent to me every day.)

How can I ensure that I am only appending unique values based on my primary key? Is there a pandas to_sql function to insert or replace when I append the new data?

Also, should I specify dtypes in pandas before writing to SQL? I had some error messages when I tried to write to SQLite and I had categorical dtypes.

Recommended answer

If the table was created with that primary key, attempting to insert duplicate data raises a sqlite3.IntegrityError exception. You can catch it and do nothing, for example:

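import sqlite3

try:
    # conn is a sqlite3 connection; table 't' already exists with its primary key.
    # The old flavor= argument is gone from current pandas, so it is omitted here.
    df.to_sql('t', conn, if_exists='append', index=False)
except sqlite3.IntegrityError:
    # A row in df collided with an existing primary key.
    pass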
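
One caveat with catching the exception: as far as I can tell, pandas rolls back the write that failed, so the genuinely new rows in that file are skipped along with the duplicates. If you want only the duplicates dropped, one option is to load the file into a scratch table and let SQLite do the filtering with INSERT OR IGNORE. The following is a minimal sketch, not a pandas feature: the helper name append_new_rows and the scratch table name _staging are made up for illustration, and it assumes the target table already exists with the primary key declared. It also casts categorical columns to plain objects first, since the question reports errors with them.

```python
import sqlite3
import pandas as pd


def append_new_rows(df, conn, table, staging="_staging"):
    """Append df to `table`, skipping rows whose primary key already exists.

    Hypothetical helper: assumes `table` was created with a PRIMARY KEY
    (or UNIQUE constraint) and that df's column names match the table's.
    Returns the number of rows actually added.
    """
    df = df.copy()
    # Cast categorical columns to plain Python objects before writing.
    for col in df.select_dtypes(include="category").columns:
        df[col] = df[col].astype(object)

    # Write the whole batch to a scratch table, replacing any old one.
    df.to_sql(staging, conn, if_exists="replace", index=False)

    cols = ", ".join(f'"{c}"' for c in df.columns)
    before = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]

    # OR IGNORE makes SQLite silently skip rows that violate the key.
    conn.execute(
        f'INSERT OR IGNORE INTO "{table}" ({cols}) '
        f'SELECT {cols} FROM "{staging}"'
    )
    after = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]

    conn.execute(f'DROP TABLE "{staging}"')
    conn.commit()
    return after - before
```

You would call it as append_new_rows(df, conn, 't') for each incoming file. Swapping INSERT OR IGNORE for INSERT OR REPLACE would overwrite the existing rows with the new file's values instead of keeping the old ones, which may be closer to what you want if the month-to-date figures can change.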
