Delete rows from SQL Server based on content in a dataframe


Question

I have an inventory table in SQL Server called dbo.inventory, which contains Year, Month, Material and Stock_quantity. I receive a new inventory count as a CSV file each day and need to load it into the dbo.inventory table. However, if the Year and Month from the CSV file already exist in the database, I need to delete those records first, in order to avoid loading multiple inventory counts for the same month.

In SQL I would do it like this:

Delete t1 
FROM dbo.inventory t1
JOIN csv t2 ON t1.Year = t2.Year and t1.Month = t2.Month

I don't know how to do this in a Python script. I want to avoid loading the CSV file into the data warehouse as a staging table; instead I just want to delete the existing rows matching Year and Month and then load the new ones.

I have used the following in another setup:

delete_date = sales.Date.max()
connection = engine.connect()
connection.execute(f"""delete from sales where Date = '{delete_date}'""")
connection.close()

But this doesn't work here, because the input for what should be deleted is a dataframe, which in theory could contain multiple years and months if it is a correction to earlier loaded figures.

Recommended answer

Pandas doesn't support deleting SQL rows based on specific conditions. You have to delete the rows yourself:

import sqlalchemy as sa

engine = sa.create_engine('mssql+pyodbc://...')
meta = sa.MetaData()

# Map the Inventory table in your database to a SQLAlchemy object.
# autoload_with alone triggers reflection; the separate autoload flag was removed in SQLAlchemy 2.0.
inventory = sa.Table('Inventory', meta, autoload_with=engine)

# Build the WHERE clause of your DELETE statement from the distinct (Year, Month) pairs in the dataframe.
# Equivalent T-SQL:
#      WHERE (Year = ... AND Month = ...) OR (Year = ... AND Month = ...) OR (Year = ... AND Month = ...)
periods = df[['Year', 'Month']].drop_duplicates()
cond = sa.or_(*[
    sa.and_(inventory.c['Year'] == year, inventory.c['Month'] == month)
    for year, month in periods.itertuples(index=False, name=None)
])

# Define and execute the DELETE.
# engine.begin() commits the transaction on success and rolls it back on error.
delete = inventory.delete().where(cond)
with engine.begin() as conn:
    conn.execute(delete)

# Now you can insert the new data
df.to_sql('Inventory', engine, if_exists='append', index=False)
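The same delete-then-load pattern can also be sketched without building one large OR clause, by running a parameterized DELETE once per distinct (Year, Month) pair. The example below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for SQL Server, plain tuples instead of a dataframe, and a hypothetical helper named replace_months. With pyodbc against SQL Server the `?` placeholders should work the same way, but that has not been verified here.

```python
import sqlite3


def replace_months(conn, rows):
    """Delete existing rows for every (Year, Month) found in `rows`, then insert `rows`.

    `rows` is a list of (year, month, material, stock_quantity) tuples.
    Hypothetical helper for illustration only.
    """
    # One entry per distinct (Year, Month); a correction file may span several months.
    periods = sorted({(year, month) for year, month, _, _ in rows})
    with conn:  # commits on success, rolls back if either statement fails
        conn.executemany(
            "DELETE FROM inventory WHERE Year = ? AND Month = ?", periods
        )
        conn.executemany(
            "INSERT INTO inventory (Year, Month, Material, Stock_quantity) "
            "VALUES (?, ?, ?, ?)",
            rows,
        )
    return periods


# In-memory SQLite stands in for SQL Server in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory "
    "(Year INTEGER, Month INTEGER, Material TEXT, Stock_quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?, ?, ?)",
    [(2023, 1, "A", 10), (2023, 2, "A", 20), (2023, 3, "A", 30)],
)
conn.commit()

# The new file corrects February and March; January must stay untouched.
new_rows = [(2023, 2, "A", 25), (2023, 2, "B", 5), (2023, 3, "A", 35)]
replaced = replace_months(conn, new_rows)
result = conn.execute(
    "SELECT Year, Month, Material, Stock_quantity "
    "FROM inventory ORDER BY Year, Month, Material"
).fetchall()
```

Running the delete and the insert inside one transaction means a failed load does not leave the month empty. Starting from a dataframe, the tuples could be produced with `list(df[['Year', 'Month', 'Material', 'Stock_quantity']].itertuples(index=False, name=None))`, assuming those four column names.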
