Pandas read_sql_query returning None for all values in some columns

Problem description

I am using pandas read_sql_query to read data from a MySQL database table into a pandas dataframe. Some columns in this table contain only NULL values. For those columns the pandas dataframe contains None in every row. For all other columns the dataframe contains NaN where there was a NULL value. Can anyone explain why None is returned for the all-NULL columns? And how do I make sure I get NaN everywhere, ideally without doing manual conversions? I should add that two of the columns causing this problem are of type float and the third is of type double.

Edit

Here is an example. The columns pef and fer contain all NULLS in the database.

from sqlalchemy import create_engine
import pandas as pd

# Read the whole table through a SQLAlchemy engine (PyMySQL driver).
querystr = "SELECT * FROM dbname.mytable"
engine = create_engine('mysql+pymysql://username:password@localhost/dbname')
df = pd.read_sql_query(querystr, engine)
df.head()

    sys     dias    pef     fer
0   NaN     NaN     None    None
1   159.0   92.666  None    None
2   NaN     NaN     None    None
3   NaN     NaN     None    None
4   102.0   63.333  None    None

In the MySQL database these columns are defined as:

Columns: 
    sys float 
    dias float 
    pef float 
    fer float

I would expect the columns pef and fer to contain NaN in each row, not None.

Recommended answer

The problem is an open issue and is explained here: https://github.com/pandas-dev/pandas/issues/14314

read_sql_query just gets result sets back, without any column type information. If you use the read_sql_table function, it uses the column type information obtained through SQLAlchemy.

It seems that read_sql_query only checks the first 3 values returned in a column to determine the type of the column. So if the first 3 values are NULL it cannot determine the type of the column and so returns None.
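The behavior can be explored without a MySQL server. The following is a minimal sketch, not taken from the original answer: it uses an in-memory SQLite table as a hypothetical stand-in for dbname.mytable, and it assumes pandas 1.3 or later, where read_sql_query accepts a dtype mapping that supplies the missing column type information by hand.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory table standing in for dbname.mytable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (sys FLOAT, dias FLOAT, pef FLOAT, fer FLOAT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        (None, None, None, None),
        (159.0, 92.666, None, None),
        (102.0, 63.333, None, None),
    ],
)

# No type hints: the all-NULL columns give pandas nothing to infer a float dtype from.
raw = pd.read_sql_query("SELECT * FROM mytable", conn)

# Assumption: pandas >= 1.3, where the dtype argument is available.
typed = pd.read_sql_query(
    "SELECT * FROM mytable",
    conn,
    dtype={"pef": "float64", "fer": "float64"},
)

print(raw.dtypes)
print(typed.dtypes)
conn.close()
```

Comparing the two printed dtype listings shows whether the installed pandas version still leaves the all-NULL columns untyped. The dtype mapping has to be maintained by hand, so it only helps when the float columns are known in advance.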

So a partial workaround is to use read_sql_table. I changed my code to use read_sql_table and it returns NaN values as expected even for the all NULL columns. But in my real application I really need to use read_sql_query. So I am now replacing any None values with NaN as soon as the results are returned:

import numpy as np
df.replace([None], np.nan, inplace=True)
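Depending on the pandas version, columns fixed this way can still carry the object dtype even though they now hold NaN. When the affected columns are known to be FLOAT or DOUBLE in the schema, an explicit cast is another option. This is a sketch on a hypothetical frame that mimics the query result; the float_cols list is an assumption and would have to match the real table.

```python
import numpy as np
import pandas as pd

# Hypothetical frame mimicking the query result: pef and fer came back as None.
df = pd.DataFrame(
    {
        "sys": [np.nan, 159.0, 102.0],
        "pef": [None, None, None],
        "fer": [None, None, None],
    }
)

# Columns known to be FLOAT/DOUBLE in the database schema (assumed list).
float_cols = ["pef", "fer"]

# Casting object columns that hold None to float64 turns each None into NaN.
df = df.astype({col: "float64" for col in float_cols})

print(df.dtypes)
```

The cast both replaces None with NaN and gives the columns a numeric dtype, so later arithmetic and aggregation treat them like the other float columns.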
