Python Pandas write to sql with NaN values

Question

I'm trying to read a few hundred tables from ASCII files and then write them to MySQL. It seems easy to do with pandas, but I hit an error that doesn't make sense to me:

I have a data frame of 8 columns. Here is the column list/index:

metricDF.columns

Index([u'FID', u'TYPE', u'CO', u'CITY', u'LINENO', u'SUBLINE', u'VALUE_010', u'VALUE2_015'], dtype=object)

I then use to_sql to append the data to MySQL:

metricDF.to_sql(con=con, name=seqFile, if_exists='append', flavor='mysql')

I get a strange error about a column being "nan":

OperationalError: (1054, "Unknown column 'nan' in 'field list'")

As you can see, all my columns have names. I realize that MySQL/SQL write support in pandas appears to still be in development, so perhaps that's the reason? If so, is there a workaround? Any suggestions would be greatly appreciated.

Recommended answer

Update: starting with pandas 0.15, to_sql supports writing NaN values (they are written as NULL in the database), so the workaround described below should no longer be needed (see https://github.com/pydata/pandas/pull/8208).
pandas 0.15 is due for release this coming October, and the feature is already merged into the development version.
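
As a rough check of the behavior described in the update, the sketch below writes a frame containing a NaN and reads the stored rows back. It assumes pandas 0.15 or later; an in-memory SQLite database stands in for the MySQL server, and the table name `metrics` and the sample values are made up for illustration.

```python
import sqlite3

import numpy as np
import pandas as pd

# Small frame with one missing value; column names borrowed from the question.
df = pd.DataFrame({"FID": [1, 2], "VALUE_010": [0.5, np.nan]})

# Assumption: in-memory SQLite is used here instead of MySQL so the
# example runs without a server.
con = sqlite3.connect(":memory:")
df.to_sql("metrics", con, index=False, if_exists="append")

# Read the rows back through the plain DB-API cursor to see what was stored.
rows = con.execute(
    "SELECT FID, VALUE_010 FROM metrics ORDER BY FID"
).fetchall()
print(rows)
```

The second row should come back with a Python None in the VALUE_010 position, which is how the DB-API reports a SQL NULL.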

This is probably due to NaN values in your table. It is a known shortcoming at the moment that the pandas sql functions don't handle NaNs well (https://github.com/pydata/pandas/issues/2754, https://github.com/pydata/pandas/issues/4199).
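
One plausible reading of the "Unknown column 'nan'" message is that a float NaN ends up in the INSERT statement as the bare word nan, which the server then parses as an identifier rather than a value. The sketch below only illustrates that idea. `render_literal` and `build_insert` are hypothetical helpers, not part of pandas or any MySQL driver, and the table name `metrics` is made up.

```python
def render_literal(value):
    """Hypothetical stand-in for client-side rendering of a numeric value."""
    if value is None:
        return "NULL"
    # A float NaN formats as the bare text 'nan', with no quotes around it.
    return "%.15g" % value


def build_insert(row):
    """Build an INSERT statement by pasting rendered values into the SQL."""
    values = ", ".join(render_literal(v) for v in row)
    return "INSERT INTO metrics (FID, VALUE_010) VALUES (" + values + ")"


with_nan = build_insert((1, float("nan")))
with_none = build_insert((1, None))

print(with_nan)   # the value list contains the bare word nan
print(with_none)  # the value list contains NULL
```

Under this assumption, replacing NaN with None before writing is what turns the unparseable nan into a valid NULL.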

As a workaround at this moment (for pandas versions 0.14.1 and lower), you can manually convert the NaN values to None with:

df2 = df.astype(object).where(pd.notnull(df), None)

and then write the dataframe to sql. This, however, converts all columns to object dtype. Because of this, you have to create the database table based on the original dataframe. For example, if your first row does not contain NaNs:

df[:1].to_sql('table_name', con)
df2[1:].to_sql('table_name', con, if_exists='append')
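
To see what the conversion in this workaround does to the values and dtypes, here is a minimal sketch on a two-row frame. The sample data is invented, and the output is what I would expect from recent pandas versions rather than something verified against 0.14.

```python
import numpy as np
import pandas as pd

# Toy frame with one NaN; column names borrowed from the question.
df = pd.DataFrame({"FID": [1, 2], "VALUE_010": [0.5, np.nan]})

# Cast to object first so the columns can hold None, then replace
# every missing entry with None.
df2 = df.astype(object).where(pd.notnull(df), None)

missing = df2.loc[1, "VALUE_010"]

print(df.dtypes)      # original numeric dtypes
print(df2.dtypes)     # every column is now object
print(repr(missing))  # the former NaN
```

Because `df2` has lost the numeric dtypes, a table created from it would not get numeric column types, which is why the answer creates the table from a slice of the original `df` and appends the converted rows afterwards.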
