Python/SQL: replacing the empty strings of a DataFrame by a "Null" value to insert the data in a database


Problem description

Suppose I have this DataFrame:

import pandas as pd

REFERENCE = ["GZF882348G", "SFGUZBJLNJU", "FTLNGZ242112", "DFBHGVGHG543"]
IBAN = ["FR7343563", "FR4832545", "FR9858331", "FR2001045"]
DEBIT = [26, '', 856, '']
CREDIT = ['', 324, '', 876]
MONTANT = [641, 33, '', 968]

df = pd.DataFrame({'Référence' : REFERENCE, 'IBAN' : IBAN, 'Débit' : DEBIT, 'Crédit' : CREDIT, 'Montant' : MONTANT})

I have a formatting problem when inserting this kind of data into my database. The columns "Débit", "Crédit" and "Montant" are defined to hold floats. However, the data in these columns are not only integers: there are empty strings too, and that is my issue. I know that I have to write a condition that replaces an empty string with a "Null" value in the SQL format, but I do not know how to do that in Python or in SQL. I am discovering/learning the SQL environment.

Here is my code:

import pandas as pd
import pyodbc 

server = '...'
database = '...'
username = '...' 
password = '...'
driver = '...'

connection = pyodbc.connect('DRIVER='+driver+';SERVER='+server+';PORT=1433;DATABASE='+database+';UID='+username+';PWD='+password)
cursor = connection.cursor()

for i, row in df.iterrows():


    sql_exe = "INSERT INTO dbo.tbl_data_xml (Réference,IBAN,Débit,Crédit,Montant) VALUES (?,?,?,?,?)"
    cursor.execute(sql_exe, tuple(row))
    
    connection.commit()

Can anyone help me?

Thanks

Recommended answer

You appear to be mixing types in the Pandas data frame: the string '' is combined with integers in the same column, as evidenced by every column having the object dtype. In a relational database you cannot mix data types within a column. Converting '' to the string 'NULL' will not resolve your issue either, because in SQL, NULL <> 'NULL'.

df.dtypes

# Référence    object
# IBAN         object
# Débit        object
# Crédit       object
# Montant      object
# dtype: object
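
To illustrate why the string 'NULL' does not help, here is a minimal sketch. It assumes an in-memory SQLite database as a stand-in for the SQL Server table, and the table and column names (demo, label, amount) are hypothetical. It inserts Python's None next to the string 'NULL' and checks which row the database treats as NULL.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for SQL Server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE demo (label TEXT, amount REAL)")

# None is sent as a real SQL NULL; 'NULL' is just a four-character string.
cur.executemany(
    "INSERT INTO demo (label, amount) VALUES (?, ?)",
    [("real null", None), ("string", "NULL")],
)
conn.commit()

# Only the row inserted with None satisfies IS NULL.
null_labels = [
    row[0]
    for row in cur.execute("SELECT label FROM demo WHERE amount IS NULL")
]
print(null_labels)
conn.close()
```

SQLite is more permissive about column types than SQL Server, so this only shows the NULL versus 'NULL' distinction, not how SQL Server would react to a string in a float column.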

Therefore, convert the columns to numeric with pd.to_numeric, which turns the empty string '' into NaN. That value should then translate to SQL's NULL.

df[['Débit', 'Crédit', 'Montant']] = df[['Débit', 'Crédit', 'Montant']].apply(pd.to_numeric)

df.dtypes
# Référence     object
# IBAN          object
# Débit        float64
# Crédit       float64
# Montant      float64
# dtype: object

df
#       Référence       IBAN  Débit  Crédit  Montant
# 0    GZF882348G  FR7343563   26.0     NaN    641.0
# 1   SFGUZBJLNJU  FR4832545    NaN   324.0     33.0
# 2  FTLNGZ242112  FR9858331  856.0     NaN      NaN
# 3  DFBHGVGHG543  FR2001045    NaN   876.0    968.0

Then run your query. Better yet, avoid the slower for loop with iterrows and consider df.to_numpy combined with cursor.executemany.

# PREPARED STATEMENT
sql_exe = "INSERT INTO dbo.tbl_data_xml (Réference,IBAN,Débit,Crédit,Montant) VALUES (?,?,?,?,?)"

# CONVERT DATA TO LIST OF LISTS, REPLACING NaN WITH None
# (astype(object) lets the float columns hold None instead of NaN)
sql_data = df.astype(object).where(pd.notnull(df), None).to_numpy().tolist()

# EXECUTE ACTION QUERY
cursor.executemany(sql_exe, sql_data)
connection.commit()
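
The NaN-to-None step can also be checked without a SQL Server connection. The following is a rough sketch under stated assumptions: the helper nan_to_none is hypothetical, the rows are typed in by hand to mirror the converted data frame, and an in-memory SQLite table with made-up column names stands in for dbo.tbl_data_xml.

```python
import math
import sqlite3


def nan_to_none(rows):
    """Return a copy of rows with float NaN values replaced by None."""
    return [
        [None if isinstance(v, float) and math.isnan(v) else v for v in row]
        for row in rows
    ]


nan = float("nan")
# Rows written by hand to mirror the converted data frame shown earlier.
raw_rows = [
    ["GZF882348G", "FR7343563", 26.0, nan, 641.0],
    ["SFGUZBJLNJU", "FR4832545", nan, 324.0, 33.0],
    ["FTLNGZ242112", "FR9858331", 856.0, nan, nan],
    ["DFBHGVGHG543", "FR2001045", nan, 876.0, 968.0],
]
sql_data = nan_to_none(raw_rows)

# SQLite stand-in for the real table; column names are assumptions.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE tbl_data_xml "
    "(reference TEXT, iban TEXT, debit REAL, credit REAL, montant REAL)"
)
cur.executemany("INSERT INTO tbl_data_xml VALUES (?,?,?,?,?)", sql_data)
conn.commit()

# Count the rows whose debit was stored as a real SQL NULL.
null_debits = cur.execute(
    "SELECT COUNT(*) FROM tbl_data_xml WHERE debit IS NULL"
).fetchone()[0]
print(null_debits)
conn.close()
```

With pyodbc the same list of rows would be passed to cursor.executemany in the same way. Only the connection and the table definition differ.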
