Error when plotting DataFrame containing NaN with Pandas 0.12.0 and Matplotlib 1.3.1 on Python 3.3.2


Problem description

First of all, this question is not the same as this one.

The problem I'm having is that when I try to plot a DataFrame which contains a numpy NaN in one cell, I get an error:

C:\>\Python33x86\python.exe
Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:03:43) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>>
>>> dates = pd.date_range('20131201', periods=5, freq='H')
>>> data = [[1, 2], [4, 5], [9, np.nan], [16, 17], [25, 26]]
>>> df = pd.DataFrame(data, index=dates,
...                       columns=list('AB'))
>>>
>>> print(df.to_string())
                      A   B
2013-12-01 00:00:00   1   2
2013-12-01 01:00:00   4   5
2013-12-01 02:00:00   9 NaN
2013-12-01 03:00:00  16  17
2013-12-01 04:00:00  25  26
>>> df.plot()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1636, in plot_frame
    plot_obj.generate()
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 856, in generate
    self._make_plot()
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1240, in _make_plot
    self._make_ts_plot(data, **self.kwds)
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1321, in _make_ts_plot
    _plot(data[col], i, ax, label, style, **kwds)
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1295, in _plot
    style=style, **kwds)
  File "C:\Python33x86\lib\site-packages\pandas\tseries\plotting.py", line 77, in tsplot
    lines = plotf(ax, *args, **kwargs)
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 4139, in plot
    for line in self._get_lines(*args, **kwargs):
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 319, in _grab_next_args
    for seg in self._plot_args(remaining, kwargs):
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 297, in _plot_args
    x, y = self._xy_from_xy(x, y)
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 216, in _xy_from_xy
    by = self.axes.yaxis.update_units(y)
  File "C:\Python33x86\lib\site-packages\matplotlib\axis.py", line 1337, in update_units
    converter = munits.registry.get_converter(data)
  File "C:\Python33x86\lib\site-packages\matplotlib\units.py", line 137, in get_converter
    xravel = x.ravel()
  File "C:\Python33x86\lib\site-packages\numpy\ma\core.py", line 3969, in ravel
    r._mask = ndarray.ravel(self._mask).reshape(r.shape)
  File "C:\Python33x86\lib\site-packages\pandas\core\series.py", line 981, in reshape
    return ndarray.reshape(self, newshape, order)
TypeError: an integer is required

The above code works if I replace the np.NaN with a number, such as "2.3".

Plotting as two separate Series does not work either (it fails when I add the Series containing the NaN to the plot):

C:\>\Python33x86\python.exe
Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:03:43) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>>
>>> dates = pd.date_range('20131201', periods=5, freq='H')
>>> data = [[1, 2], [4, 5], [9, np.nan], [16, 17], [25, 26]]
>>> df = pd.DataFrame(data, index=dates,
...                       columns=list('AB'))
>>>
>>> print(df.to_string())
                      A   B
2013-12-01 00:00:00   1   2
2013-12-01 01:00:00   4   5
2013-12-01 02:00:00   9 NaN
2013-12-01 03:00:00  16  17
2013-12-01 04:00:00  25  26
>>> df['A'].plot(label='This is A', style='k')
<matplotlib.axes.AxesSubplot object at 0x02ACFF90>
>>> df['B'].plot(label='This is B', style='g')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1730, in plot_series
    plot_obj.generate()
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 856, in generate
    self._make_plot()
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1240, in _make_plot
    self._make_ts_plot(data, **self.kwds)
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1311, in _make_ts_plot
    _plot(data, 0, ax, label, self.style, **kwds)
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1295, in _plot
    style=style, **kwds)
  File "C:\Python33x86\lib\site-packages\pandas\tseries\plotting.py", line 77, in tsplot
    lines = plotf(ax, *args, **kwargs)
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 4139, in plot
    for line in self._get_lines(*args, **kwargs):
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 319, in _grab_next_args
    for seg in self._plot_args(remaining, kwargs):
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 297, in _plot_args
    x, y = self._xy_from_xy(x, y)
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 216, in _xy_from_xy
    by = self.axes.yaxis.update_units(y)
  File "C:\Python33x86\lib\site-packages\matplotlib\axis.py", line 1337, in update_units
    converter = munits.registry.get_converter(data)
  File "C:\Python33x86\lib\site-packages\matplotlib\units.py", line 137, in get_converter
    xravel = x.ravel()
  File "C:\Python33x86\lib\site-packages\numpy\ma\core.py", line 3969, in ravel
    r._mask = ndarray.ravel(self._mask).reshape(r.shape)
  File "C:\Python33x86\lib\site-packages\pandas\core\series.py", line 981, in reshape
    return ndarray.reshape(self, newshape, order)
TypeError: an integer is required

However, if I do this directly with Matplotlib's Pyplot plot(), instead of using Pandas' plot() function, it works:

C:\>\Python33x86\python.exe
Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:03:43) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> dates = pd.date_range('20131201', periods=5, freq='H')
>>> plt.plot(dates, [1, 4, 9, 16, 25], 'k', dates, [2, 5, np.NAN, 17, 26], 'g')
[<matplotlib.lines.Line2D object at 0x03E98650>, <matplotlib.lines.Line2D object at 0x040929B0>]
>>> plt.show()
>>>

So it seems that I have a workaround, but as I plot large DataFrames, I would prefer to use Pandas' plot() method, which is more convenient. I've tried to follow the stack trace, but after a while it gets complicated (I'm not familiar with Pandas, Numpy and Matplotlib source code). Am I doing something wrong, or is this a possible bug in Pandas' plot()?
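For larger DataFrames, the direct Matplotlib workaround can be wrapped in a small helper. The following is only a sketch: `plot_frame_directly` and its `styles` parameter are hypothetical names, not part of Pandas or Matplotlib, and it assumes that handing Matplotlib plain NumPy arrays (rather than a pandas Series) is what avoids the failure shown in the traceback.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_frame_directly(df, ax=None, styles=None):
    """Plot each column of df with Matplotlib's own plot(), bypassing DataFrame.plot().

    Hypothetical helper: `styles` maps a column name to a Matplotlib format string.
    """
    if ax is None:
        _, ax = plt.subplots()
    styles = styles or {}
    # Hand Matplotlib plain arrays so it never receives a pandas Series.
    if isinstance(df.index, pd.DatetimeIndex):
        x = df.index.to_pydatetime()
    else:
        x = np.asarray(df.index)
    for name in df.columns:
        y = np.asarray(df[name], dtype=float)  # NaN stays NaN and shows up as a gap
        ax.plot(x, y, styles.get(name, "-"), label=str(name))
    ax.legend(loc="best")
    return ax


# Same data as in the question, including the single NaN in column B.
dates = pd.date_range("2013-12-01", periods=5, freq=pd.Timedelta(hours=1))
data = [[1, 2], [4, 5], [9, np.nan], [16, 17], [25, 26]]
df = pd.DataFrame(data, index=dates, columns=list("AB"))
ax = plot_frame_directly(df, styles={"A": "k", "B": "g"})
```

Call `plt.show()` afterwards (with an interactive backend instead of `Agg`) to display the figure. This gives up the conveniences of Pandas' `plot()`, such as its time-series axis formatting, so it is a stopgap rather than a replacement.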

Thanks for your help!

I tried both on Windows x86 and on Linux AMD64 with the same results with these versions:

  • Python 3.3.2
  • Pandas 0.12.0
  • Matplotlib 1.3.1
  • Numpy 1.7.1

Recommended answer

This appears to be an integration bug between matplotlib 1.3.1 and pandas 0.12.

The workaround is to downgrade to matplotlib 1.3.0. Note, however, that this version of matplotlib contains a bug on systems that have fonts with non-ASCII font names, so you may need to choose which problem you would rather live with. The downgrade will also trigger a downgrade to numpy 1.7.1, so you should then upgrade to numpy 1.8.0 again. The error should be fixed in the upcoming Pandas 0.13. However, Pandas 0.13 may break some existing code (because pandas.Series is no longer a subclass of numpy.ndarray), so some hard choices may be required, at least in the short term.

I just checked, and the code works fine with matplotlib 1.3.0:

>>> import matplotlib
>>> matplotlib.__version__
'1.3.0'
>>> df.plot()
<matplotlib.axes.AxesSubplot object at 0x04E8B4F0>
>>> plt.show(_)
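To decide whether an environment needs the downgrade, the version strings can be compared against the one combination reported broken in this thread. This is a minimal sketch: `parse_version`, `is_reported_broken` and `REPORTED_BROKEN` are hypothetical names, and the table encodes only what was reported here (pandas 0.12 with matplotlib 1.3.1), not a complete compatibility matrix.

```python
import re

# Assumption: only the combination reported in this thread is listed.
# Keys are (pandas major.minor, matplotlib major.minor.patch).
REPORTED_BROKEN = {((0, 12), (1, 3, 1))}


def parse_version(text):
    """Return up to the first three numeric components of a version string."""
    return tuple(int(n) for n in re.findall(r"\d+", text)[:3])


def is_reported_broken(pandas_version, matplotlib_version):
    """True if this pandas/matplotlib pair matches the combination reported above."""
    pandas_key = parse_version(pandas_version)[:2]
    matplotlib_key = parse_version(matplotlib_version)
    return (pandas_key, matplotlib_key) in REPORTED_BROKEN
```

In a live session the arguments would be `pd.__version__` and `matplotlib.__version__`. A `False` result only means the pair was not reported here, not that it is known to work.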
