How to ignore rows where blank values exist in Pandas (Python)


Question

What I'm trying to do is query a pandas DataFrame to get a filtered version of the original one.

self.waferInfo = pd.read_csv(fileName, index_col= None, na_values=['NA', ""] , usecols=[18,5,6,8,2])

print(self.waferInfo.head(5))

self.df2 = self.waferInfo[(self.waferInfo.FILE_FINISH_TS >= dateBegin) & (self.waferInfo.FILE_FINISH_TS <= dateEnd) ]

print(self.df2.head(5))

When the first print runs, the expected rows are printed, but the second print shows an empty DataFrame. I figured out that this happens because the original DataFrame has some blanks, for example:

18 5 6 8 2
A  B C   E
D  E T Y P
F  R B A L

I would like my DataFrame to return:

18 5 6 8 2
D  E T Y P
F  R B A L

Because column 8 has an empty cell, the filter returns a completely empty DataFrame. I know this because I deleted all the rows with empty cells in Excel, and the DataFrame worked fine after that. Is there any way to ignore rows that have a missing value?

Recommended answer

I do not think that your assumptions about the root cause of the problem are correct. See below.

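"""
18 5 6 8 2
A  B C   E
D  E T Y P
F  R B A L
"""
import pandas as pd

# Copy the sample table above to the clipboard before running this.
# read_clipboard splits on whitespace, so the row with the blank cell
# ends up one value short.
df = pd.read_clipboard()
print(df)
print("\n")
# dropna() removes every row that contains at least one missing value.
print(df.dropna())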

Output:

  18  5  6  8     2
0  A  B  C  E  None
1  D  E  T  Y     P
2  F  R  B  A     L


  18  5  6  8  2
1  D  E  T  Y  P
2  F  R  B  A  L

If df2.head(5) returns nothing, it is because df2 is empty, not because there are NaNs in your df.
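To illustrate the point, here is a minimal sketch with made-up data. Only the column name FILE_FINISH_TS comes from the question; the OTHER column, the dates, and the variable names are assumptions for illustration. It suggests that a missing value in another column does not by itself empty a date-range filter, and that dropna() can be applied afterwards if rows with blanks should be ignored.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column name FILE_FINISH_TS is from the question.
wafer_info = pd.DataFrame({
    "FILE_FINISH_TS": pd.to_datetime(["2016-01-01", "2016-01-02", "2016-01-03"]),
    "OTHER": ["A", np.nan, "C"],  # one blank cell in an unrelated column
})

# Assumed filter bounds, as Timestamps so the comparison is datetime vs datetime.
date_begin = pd.Timestamp("2016-01-01")
date_end = pd.Timestamp("2016-01-02")

mask = (wafer_info.FILE_FINISH_TS >= date_begin) & (wafer_info.FILE_FINISH_TS <= date_end)
filtered = wafer_info[mask]

# The row with the NaN in OTHER is still selected by the date filter.
print(filtered)

# Dropping rows with any missing value is a separate, explicit step.
print(filtered.dropna())
```

If only certain columns matter, dropna(subset=[...]) restricts the check to those columns.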

Perhaps

self.waferInfo[(self.waferInfo.FILE_FINISH_TS >= dateBegin) & \
(self.waferInfo.FILE_FINISH_TS <= dateEnd) ]

should be

self.waferInfo.loc[(self.waferInfo.FILE_FINISH_TS >= dateBegin) & \
(self.waferInfo.FILE_FINISH_TS <= dateEnd) ]

I can't say for sure because you haven't provided enough sample data.
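Since the question does not show the dtype of FILE_FINISH_TS or the types of dateBegin and dateEnd, one further thing that may be worth checking is whether the timestamps are actually parsed as datetimes before they are compared. The sketch below uses hypothetical data (the LOT column, the timestamp strings, and the bounds are all assumptions). It converts the column with pd.to_datetime and then applies the same range filter with both [] and .loc, so the two results can be compared directly.

```python
import pandas as pd

# Hypothetical data: timestamps read from CSV often arrive as plain strings.
df = pd.DataFrame({
    "FILE_FINISH_TS": ["2016-01-01 10:00:00", "2016-01-02 11:30:00", None],
    "LOT": ["a", "b", "c"],
})

# Inspect the dtype first; a non-datetime dtype here is a hint to convert.
print(df["FILE_FINISH_TS"].dtype)

# Convert to datetimes; values that cannot be parsed become NaT.
df["FILE_FINISH_TS"] = pd.to_datetime(df["FILE_FINISH_TS"], errors="coerce")

# Assumed bounds, expressed as Timestamps.
date_begin = pd.Timestamp("2016-01-01")
date_end = pd.Timestamp("2016-01-03")

# between() is inclusive on both ends by default; NaT rows evaluate to False.
mask = df["FILE_FINISH_TS"].between(date_begin, date_end)

with_brackets = df[mask]
with_loc = df.loc[mask]

print(with_brackets)
print(with_brackets.equals(with_loc))
```

In this sketch the row with the missing timestamp simply falls outside the mask, so it is ignored without any special handling.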
