Adjust pandas read_sql_query NULL value treatment?
Question
When I do this:
from sqlalchemy import create_engine
import pandas as pd
engine = create_engine('sqlite://')
conn = engine.connect()
conn.execute("create table test (a float)")
for _ in range(5):
    conn.execute("insert into test values (NULL)")
df = pd.read_sql_query("select * from test", engine)
#df = pd.read_sql_table("test", engine)
df.a
the result is a column of None values as opposed to float("nan"). This is pretty annoying, especially if you read float columns with NULL values chunk-wise.
The read_sql_table version works fine, since I suppose it can use the column type information.
Is there an easy way I can adjust read_sql_query to also interpret NULL values as float("nan")?
Recommended answer
It seems an issue was raised about this, and something like what you want, the coerce_float argument, was added to pandas in version 0.7.2, as per wesm's comment in the linked page:
hi arthur, I added an option coerce_float (in the above commit) that converts Decimal -> float and fills None with NaN. Converting Decimal to float is still really slow. Will be part of 0.7.2 to be released soon
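To make that comment concrete, here is a minimal sketch using DataFrame.from_records, which also accepts coerce_float. The sample rows are made up, and the idea that read_sql_query passes its rows through the same kind of conversion is an assumption on my part, not something the linked page states.

```python
from decimal import Decimal

import pandas as pd

# Made-up rows shaped like a DB-API result set: Decimal values mixed with a NULL.
rows = [(Decimal("1.5"),), (None,), (Decimal("2.25"),)]

# Without coercion the Decimal and None objects are kept as they are.
raw = pd.DataFrame.from_records(rows, columns=["a"], coerce_float=False)

# With coercion the Decimals are converted to floats and the None is filled with NaN.
coerced = pd.DataFrame.from_records(rows, columns=["a"], coerce_float=True)

print(raw["a"].dtype)
print(coerced["a"].dtype)
print(coerced["a"].tolist())
```

Note that every row here carries at least one real number alongside the None; the question's table holds nothing but NULLs, which is a different situation.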
That said, the description in the pandas.read_sql_query documentation for 0.18.1 seems confusing:
coerce_float : boolean, default True
Attempt to convert values to non-string, non-numeric objects (like decimal.Decimal) to floating point, useful for SQL result sets
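Since coerce_float already defaults to True and the question still ended up with a column of None, one workaround that does not depend on it is to cast the column explicitly after reading. The sketch below is only that, a sketch: it uses the standard-library sqlite3 module instead of SQLAlchemy to stay self-contained, reuses the table and column names from the question, and the chunk size of 2 is arbitrary.

```python
import sqlite3

import pandas as pd

# In-memory SQLite table with the same shape as in the question.
conn = sqlite3.connect(":memory:")
conn.execute("create table test (a float)")
conn.executemany("insert into test values (?)", [(None,)] * 5)
conn.commit()

# Read chunk-wise and cast each chunk, so an all-NULL chunk cannot come
# through as an object column of None.
chunks = pd.read_sql_query("select * from test", conn, chunksize=2)
df = pd.concat(
    [chunk.astype({"a": "float64"}) for chunk in chunks],
    ignore_index=True,
)

print(df["a"].dtype)
print(df["a"].isna().all())
```

The cast is cheap for columns that already arrived as floats, so it can be applied to every chunk without checking the dtype first.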