Testing if a pandas DataFrame exists


Question

In my code, I have several variables which can either contain a pandas DataFrame or nothing at all. Let's say I want to test whether a certain DataFrame has been created yet. My first thought would be to test for it like this:

if df1:
    # do something

However, that code fails in this way:

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
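The error can be reproduced in isolation. The sketch below assumes a small, made-up frame (the column name `a` is purely illustrative) and shows that pandas raises rather than guessing what "truthy" should mean for a whole table:

```python
import pandas as pd

# Hypothetical example data; any DataFrame behaves the same way.
df1 = pd.DataFrame({"a": [1, 2, 3]})

try:
    if df1:  # pandas refuses to reduce a whole frame to a single bool
        pass
except ValueError as exc:
    message = str(exc)
    print(message)
```

The same exception is raised for an empty `pd.DataFrame()`, so plain truthiness cannot be used as a presence test in either case.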

Fair enough. Ideally, I would like to have a presence test that works for either a DataFrame or Python None.

Here is one approach that works:

if not isinstance(df1, type(None)):
    # do something

However, testing for type is really slow.

import timeit

t = timeit.Timer('if None: pass')
t.timeit()
# approximately 0.04
t = timeit.Timer('if isinstance(x, type(None)): pass', setup='x=None')
t.timeit()
# approximately 0.4

Ouch. Along with being slow, testing for NoneType isn't very flexible, either.

A different solution would be to initialize df1 as an empty DataFrame, so that the type would be the same in both the null and non-null cases. I could then just test using len(), or any(), or something like that. Making an empty DataFrame seems kind of silly and wasteful, though.

Another solution would be to have an indicator variable, df1_exists, which is set to False until df1 is created. Then, instead of testing df1, I would test df1_exists. But this doesn't seem all that elegant, either.

Is there a better, more Pythonic way of handling this issue? Am I missing something, or is this just an awkward side effect of all the awesome things about pandas?

Recommended answer

Option 1 (my preferred option)

Please select his answer if you like this approach.

It is very idiomatic Python to initialize a variable with None and then check for None before doing something with that variable.

df1 = None

if df1 is not None:
    print(df1.head())
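As a slightly fuller sketch of the same idea, the check can live inside a function that accepts an optional frame. The function name `describe` and its return strings are hypothetical, chosen only for illustration:

```python
from typing import Optional

import pandas as pd


def describe(df: Optional[pd.DataFrame]) -> str:
    """Summarise a frame, or report that it has not been created yet."""
    # Identity comparison with None never triggers pandas' truth-value error.
    if df is None:
        return "not created yet"
    return f"{len(df)} rows x {len(df.columns)} columns"


print(describe(None))
print(describe(pd.DataFrame({"a": [1, 2]})))
```

One consequence of this design is that "never created" (None) and "created but empty" (a frame with zero rows) remain distinguishable, which the empty-frame approach cannot offer.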


Option 2

However, setting up an empty dataframe isn't at all a bad idea.

import pandas as pd

df1 = pd.DataFrame()

if not df1.empty:
    print(df1.head())
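It may help to see what `.empty` actually reports before relying on it. In the sketch below the frames `no_rows` and `all_nan` are made-up examples: a frame counts as empty when it has no rows or no columns, while a frame holding only NaN values does not.

```python
import numpy as np
import pandas as pd

placeholder = pd.DataFrame()                      # no rows, no columns
no_rows = pd.DataFrame(columns=["a", "b"])        # columns defined, zero rows
all_nan = pd.DataFrame({"a": [np.nan, np.nan]})   # two rows of missing values

print(placeholder.empty)  # True
print(no_rows.empty)      # True
print(all_nan.empty)      # False: NaN cells still count as data
```

So with this option, a query that legitimately returned zero rows looks the same as a frame that was never filled in.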


Option 3

Just try it.

try:
    print(df1.head())
# catch when df1 is None
except AttributeError:
    pass
# catch when it hasn't even been defined
except NameError:
    pass
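The two exceptions correspond to two different situations, which the following sketch separates. The name `undefined_frame` is hypothetical and deliberately never assigned; no pandas import is needed because neither call reaches a real DataFrame.

```python
results = []

# Case 1: the name was never bound, so the lookup itself fails.
try:
    undefined_frame.head()  # hypothetical name, intentionally undefined
except NameError:
    results.append("NameError")

# Case 2: the name is bound to None, which has no .head attribute.
df1 = None
try:
    df1.head()
except AttributeError:
    results.append("AttributeError")

print(results)  # ['NameError', 'AttributeError']
```

Note that catching AttributeError this broadly would also hide an unrelated attribute typo inside the try block, so keeping the guarded code as small as possible is a reasonable precaution.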


Timing

When df1 is in its initialized state or doesn't exist at all

When df1 is a dataframe with something in it

import numpy as np
import pandas as pd
df1 = pd.DataFrame(np.arange(25).reshape(5, 5), list('ABCDE'), list('abcde'))
df1
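To measure the three options on a populated frame, something like the following sketch could be used. The helper names `option_1`, `option_2` and `option_3` and the repetition count are assumptions made for illustration, and option 3 only looks up the `head` attribute rather than printing, so that the presence check itself is what gets timed. No figures are quoted here because they depend on the machine and the pandas version.

```python
import timeit

import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(25).reshape(5, 5), list("ABCDE"), list("abcde"))


def option_1():
    # Identity check against None.
    return df1 is not None


def option_2():
    # Emptiness check on a frame that is always a DataFrame.
    return not df1.empty


def option_3():
    # Attempt the attribute access and treat failure as "absent".
    try:
        df1.head
        return True
    except (AttributeError, NameError):
        return False


# Total seconds for 100,000 calls of each helper.
timings = {
    fn.__name__: timeit.timeit(fn, number=100_000)
    for fn in (option_1, option_2, option_3)
}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s per 100,000 calls")
```

Setting `df1 = None` before building `timings` gives the corresponding numbers for the uninitialized case, where option 2 no longer applies because None has no `.empty` attribute.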
