Python/SQL: replacing the empty strings of a DataFrame with a "Null" value to insert the data in a database
Problem description
Let's say I have this data frame:
REFERENCE = ["GZF882348G", "SFGUZBJLNJU", "FTLNGZ242112", "DFBHGVGHG543"]
IBAN = ["FR7343563", "FR4832545", "FR9858331", "FR2001045"]
DEBIT = [26, '', 856, '']
CREDIT = ['', 324, '', 876]
MONTANT = [641, 33, '', 968]
df = pd.DataFrame({'Référence' : REFERENCE, 'IBAN' : IBAN, 'Débit' : DEBIT, 'Crédit' : CREDIT, 'Montant' : MONTANT})
I have a format problem when inserting this kind of data into my database. The columns "Débit", "Crédit" and "Montant" are defined to hold floats. However, the data in these columns are not only integers; I have empty strings too, and that is my issue. I know that I have to write a condition that replaces an empty string with a "Null" value in SQL format, but I do not know how to do that in Python or in SQL. I am just discovering/learning the SQL environment.
Here is my code:
import pandas as pd
import pyodbc
server = '...'
database = '...'
username = '...'
password = '...'
driver = '...'
connection = pyodbc.connect('DRIVER='+driver+';SERVER='+server+';PORT=1433;DATABASE='+database+';UID='+username+';PWD='+password)
cursor = connection.cursor()
for i, row in df.iterrows():
    sql_exe = "INSERT INTO dbo.tbl_data_xml (Réference,IBAN,Débit,Crédit,Montant) VALUES (?,?,?,?,?)"
    cursor.execute(sql_exe, tuple(row))
connection.commit()
Can anyone help me?
Thanks
Recommended answer
You appear to be mixing types in your Pandas data frame, where the string '' is combined with integers in the same column, as evidenced by every column having the object type. In relational databases you cannot mix data types within a column. And converting '' to the string 'NULL' will not resolve your issue. In SQL, NULL <> 'NULL'.
df.dtypes
# Référence object
# IBAN object
# Débit object
# Crédit object
# Montant object
# dtype: object
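To illustrate the difference between a real NULL and the string 'NULL', here is a minimal sketch. It uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in for the real server, and the table name demo and its columns are hypothetical. SQLite is lenient and stores the text in a REAL column; a stricter server may reject it instead.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in for the real server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE demo (label TEXT, amount REAL)")

# Python None is bound as SQL NULL; the string 'NULL' is bound as plain text.
cur.executemany(
    "INSERT INTO demo (label, amount) VALUES (?, ?)",
    [("real null", None), ("string", "NULL")],
)
conn.commit()

# typeof() reports how each value was actually stored.
types = cur.execute(
    "SELECT label, typeof(amount) FROM demo ORDER BY label"
).fetchall()

# Only the row inserted with None matches IS NULL.
null_count = cur.execute(
    "SELECT COUNT(*) FROM demo WHERE amount IS NULL"
).fetchone()[0]
conn.close()

print(types, null_count)
```

With a parameterized query, passing None from Python is what produces a true NULL in the table.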
Therefore, convert the columns to numeric with pd.to_numeric, where the empty string '' converts to NaN, which in turn should translate to SQL's NULL.
df[['Débit', 'Crédit', 'Montant']] = df[['Débit', 'Crédit', 'Montant']].apply(pd.to_numeric)
df.dtypes
# Référence object
# IBAN object
# Débit float64
# Crédit float64
# Montant float64
# dtype: object
df
# Référence IBAN Débit Crédit Montant
# 0 GZF882348G FR7343563 26.0 NaN 641.0
# 1 SFGUZBJLNJU FR4832545 NaN 324.0 33.0
# 2 FTLNGZ242112 FR9858331 856.0 NaN NaN
# 3 DFBHGVGHG543 FR2001045 NaN 876.0 968.0
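The data above contains only integers and empty strings. If other non-numeric text could appear in these columns (an assumption, since the question does not show any), pd.to_numeric would raise by default. A small sketch with a hypothetical series shows the errors="coerce" option, which turns such values into NaN as well:

```python
import pandas as pd

# Hypothetical column mixing integers, an empty string and a stray label.
raw = pd.Series([26, "", 856, "n/a"], name="Débit")

# errors="coerce" converts anything unparseable to NaN instead of raising.
clean = pd.to_numeric(raw, errors="coerce")

# Count how many values ended up missing after the conversion.
n_missing = int(clean.isna().sum())
print(clean.dtype, n_missing)
```

Coercion hides bad input silently, so it may be worth inspecting the rows that became NaN before loading them.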
Then run your query. In fact, avoid the slower for loop with iterrows and consider df.to_numpy + cursor.executemany.
# PREPARED STATEMENT
sql_exe = "INSERT INTO dbo.tbl_data_xml (Réference,IBAN,Débit,Crédit,Montant) VALUES (?,?,?,?,?)"
# CONVERT DATA TO A LIST OF ROW LISTS, REPLACING NaN WITH None
# (CAST TO object FIRST SO None IS NOT COERCED BACK TO NaN)
sql_data = df.astype(object).where(pd.notnull(df), None).to_numpy().tolist()
# EXECUTE ACTION QUERY
cursor.executemany(sql_exe, sql_data)
connection.commit()
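The whole flow can be checked end to end without a server. The sketch below is not the original setup: it swaps pyodbc for Python's built-in sqlite3 with an in-memory database, and the table tbl_demo and its unaccented column names are hypothetical. It only aims to show that NaN values converted to None arrive as NULL through executemany.

```python
import sqlite3
import pandas as pd

# Reduced, hypothetical version of the data frame from the question.
df = pd.DataFrame({
    "Reference": ["GZF882348G", "SFGUZBJLNJU"],
    "Debit": [26, ""],
    "Credit": ["", 324],
})
num_cols = ["Debit", "Credit"]
df[num_cols] = df[num_cols].apply(pd.to_numeric)

# Cast to object first so that None is kept rather than coerced back to NaN.
rows = df.astype(object).where(df.notna(), None).to_numpy().tolist()

# In-memory SQLite stands in for the real database connection.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE tbl_demo (Reference TEXT, Debit REAL, Credit REAL)")
cur.executemany("INSERT INTO tbl_demo VALUES (?, ?, ?)", rows)
conn.commit()

# One row has a missing Debit, so exactly one NULL is expected.
null_debits = cur.execute(
    "SELECT COUNT(*) FROM tbl_demo WHERE Debit IS NULL"
).fetchone()[0]
conn.close()
print(rows, null_debits)
```

Driver behaviour with NaN differs between databases, so converting to None explicitly before the insert is the more portable choice.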