Statsmodels mosaic plot ValueError: cannot convert float NaN to integer

Problem description

I have a simple pandas DataFrame, for which I would like to create a mosaic plot. Here is my code:

import pandas as pd
from statsmodels.graphics.mosaicplot import mosaic 

mydata = pd.DataFrame({'id2': {64: 'Angelica', 
                               65: 'DXW_UID', 66: 'casuid01', 
                               67: 'casuid01', 68: 'EC93_uid', 
                               69: 'EC93_uid', 70: 'EC93_uid', 
                               60: 'DXW_UID',  61: 'AtmosFox', 
                               62: 'DXW_UID', 63: 'DXW_UID'}, 
                       'id1': {64: 'TGP', 
                               65: 'Retention01', 66: 'default',
                               67: 'default', 68: 'Musa_EC_9_3', 
                               69: 'Musa_EC_9_3', 70: 'Musa_EC_9_3', 
                               60: 'default', 61: 'default', 
                               62: 'default', 63: 'default'}})

mydata
            id1       id2
60      default   DXW_UID
61      default  AtmosFox
62      default   DXW_UID
63      default   DXW_UID
64          TGP  Angelica
65  Retention01   DXW_UID
66      default  casuid01
67      default  casuid01
68  Musa_EC_9_3  EC93_uid
69  Musa_EC_9_3  EC93_uid
70  Musa_EC_9_3  EC93_uid

[11 rows x 2 columns]

I can create a mosaic plot just fine when I exclude row 64.

mosaic(mydata[mydata.id1!='TGP'], ['id1','id2'])
(<matplotlib.figure.Figure object at 0x11E0D3B0>, OrderedDict([(('default', 'DXW_UID'), (0.0, 0.0, 0.594059405940594, 0.49504950495049505)), (('default', 'AtmosFox'), (0.0, 0.49834983498349833, 0.594059405940594, 0.16501650165016499)), (('default', 'casuid01'), (0.0, 0.66666666666666663, 0.594059405940594, 0.33003300330033009)), (('default', 'EC93_uid'), (0.0, 1.0, 0.594059405940594, 0.0)), (('Retention01', 'DXW_UID'), (0.599009900990099, 0.0, 0.09900990099009899, 0.99009900990099009)), (('Retention01', 'AtmosFox'), (0.599009900990099, 0.99339933993399343, 0.09900990099009899, 0.0)), (('Retention01', 'casuid01'), (0.599009900990099, 0.99669966996699666, 0.09900990099009899, 0.0)), (('Retention01', 'EC93_uid'), (0.599009900990099, 1.0, 0.09900990099009899, 0.0)), (('Musa_EC_9_3', 'DXW_UID'), (0.7029702970297029, 0.0, 0.29702970297029707, 0.0)), (('Musa_EC_9_3', 'AtmosFox'), (0.7029702970297029, 0.0033003300330033004, 0.29702970297029707, 0.0)), (('Musa_EC_9_3', 'casuid01'), (0.7029702970297029, 0.0066006600660066007, 0.29702970297029707, 0.0)), (('Musa_EC_9_3', 'EC93_uid'), (0.7029702970297029, 0.0099009900990099011, 0.29702970297029707, 0.99009900990099009))]))

The plot comes out fine (with the exception of some of the labels looking a little funny--but that's not the issue).

The errors occur when I include row 64. My questions are, why does this row cause this error, and how can I fix it? I can see that the error occurs when trying to draw the image, but it is not at all obvious where the NaN is coming from, especially since the plot before worked just fine.

mosaic(mydata, ['id1','id2'])
(<matplotlib.figure.Figure object at 0x11D13ED0>, OrderedDict([(('default', 'DXW_UID'), (0.0, 0.0, 0.5373936408419167, 0.49342105263157893)), (('default', 'AtmosFox'), (0.0, 0.49671052631578938, 0.5373936408419167, 0.16447368421052627)), (('default', 'casuid01'), (0.0, 0.66447368421052622, 0.5373936408419167, 0.32894736842105265)), (('default', 'Angelica'), (0.0, 0.99671052631578938, 0.5373936408419167, 0.0)), (('default', 'EC93_uid'), (0.0, 1.0, 0.5373936408419167, 0.0)), (('TGP', 'DXW_UID'), (0.5423197492163009, 0.0, 0.08956560680698614, 0.0)), (('TGP', 'AtmosFox'), (0.5423197492163009, 0.0032894736842105261, 0.08956560680698614, 0.0)), (('TGP', 'casuid01'), (0.5423197492163009, 0.0065789473684210523, 0.08956560680698614, 0.0)), (('TGP', 'Angelica'), (0.5423197492163009, 0.0098684210526315784, 0.08956560680698614, 0.98684210526315785)), (('TGP', 'EC93_uid'), (0.5423197492163009, 1.0, 0.08956560680698614, 0.0)), (('Retention01', 'DXW_UID'), (0.6368114643976712, 0.0, 0.08956560680698614, 0.98684210526315785)), (('Retention01', 'AtmosFox'), (0.6368114643976712, 0.99013157894736836, 0.08956560680698614, 0.0)), (('Retention01', 'casuid01'), (0.6368114643976712, 0.99342105263157876, 0.08956560680698614, 0.0)), (('Retention01', 'Angelica'), (0.6368114643976712, 0.99671052631578938, 0.08956560680698614, 0.0)), (('Retention01', 'EC93_uid'), (0.6368114643976712, 1.0, 0.08956560680698614, 0.0)), (('Musa_EC_9_3', 'DXW_UID'), (0.7313031795790416, 0.0, 0.2686968204209583, 0.0)), (('Musa_EC_9_3', 'AtmosFox'), (0.7313031795790416, 0.0032894736842105261, 0.2686968204209583, 0.0)), (('Musa_EC_9_3', 'casuid01'), (0.7313031795790416, 0.0065789473684210523, 0.2686968204209583, 0.0)), (('Musa_EC_9_3', 'Angelica'), (0.7313031795790416, 0.0098684210526315784, 0.2686968204209583, 0.0)), (('Musa_EC_9_3', 'EC93_uid'), (0.7313031795790416, 0.013157894736842105, 0.2686968204209583, 0.98684210526315785))]))

When I run the above, I get this Traceback:

  File "C:\Python27\lib\site-packages\matplotlib\backends\backend_qt4.py", line 374, in idle_draw
    self.draw()
  File "C:\Python27\lib\site-packages\matplotlib\backends\backend_qt4agg.py", line 154, in draw
    FigureCanvasAgg.draw(self)
  File "C:\Python27\lib\site-packages\matplotlib\backends\backend_agg.py", line 451, in draw
    self.figure.draw(self.renderer)
  File "C:\Python27\lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
    draw(artist, renderer, *args, **kwargs)
  File "C:\Python27\lib\site-packages\matplotlib\figure.py", line 1034, in draw
    func(*args)
  File "C:\Python27\lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
    draw(artist, renderer, *args, **kwargs)
  File "C:\Python27\lib\site-packages\matplotlib\axes.py", line 2086, in draw
    a.draw(renderer)
  File "C:\Python27\lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
    draw(artist, renderer, *args, **kwargs)
  File "C:\Python27\lib\site-packages\matplotlib\axis.py", line 1096, in draw
    tick.draw(renderer)
  File "C:\Python27\lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
    draw(artist, renderer, *args, **kwargs)
  File "C:\Python27\lib\site-packages\matplotlib\axis.py", line 241, in draw
    self.label1.draw(renderer)
  File "C:\Python27\lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
    draw(artist, renderer, *args, **kwargs)
  File "C:\Python27\lib\site-packages\matplotlib\text.py", line 598, in draw
    ismath=ismath, mtext=self)
  File "C:\Python27\lib\site-packages\matplotlib\backends\backend_agg.py", line 188, in draw_text
    font.get_image(), np.round(x - xd), np.round(y + yd) + 1, angle, gc)
ValueError: cannot convert float NaN to integer
Traceback (most recent call last):
  File "C:\Python27\lib\site-packages\matplotlib\backends\backend_qt4.py", line 299, in resizeEvent
    self.draw()
  File "C:\Python27\lib\site-packages\matplotlib\backends\backend_qt4agg.py", line 154, in draw
    FigureCanvasAgg.draw(self)
  File "C:\Python27\lib\site-packages\matplotlib\backends\backend_agg.py", line 451, in draw
    self.figure.draw(self.renderer)
  File "C:\Python27\lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
    draw(artist, renderer, *args, **kwargs)
  File "C:\Python27\lib\site-packages\matplotlib\figure.py", line 1034, in draw
    func(*args)
  File "C:\Python27\lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
    draw(artist, renderer, *args, **kwargs)
  File "C:\Python27\lib\site-packages\matplotlib\axes.py", line 2086, in draw
    a.draw(renderer)
  File "C:\Python27\lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
    draw(artist, renderer, *args, **kwargs)
  File "C:\Python27\lib\site-packages\matplotlib\axis.py", line 1096, in draw
    tick.draw(renderer)
  File "C:\Python27\lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
    draw(artist, renderer, *args, **kwargs)
  File "C:\Python27\lib\site-packages\matplotlib\axis.py", line 241, in draw
    self.label1.draw(renderer)
  File "C:\Python27\lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
    draw(artist, renderer, *args, **kwargs)
  File "C:\Python27\lib\site-packages\matplotlib\text.py", line 598, in draw
    ismath=ismath, mtext=self)
  File "C:\Python27\lib\site-packages\matplotlib\backends\backend_agg.py", line 188, in draw_text
    font.get_image(), np.round(x - xd), np.round(y + yd) + 1, angle, gc)
ValueError: cannot convert float NaN to integer

I ran the above code in the Spyder IDE, with default settings.

A similar issue was addressed here, and numerical underflow was the culprit. However, if that is the case here, it is not at all obvious why.

Solution

According to the docs the first parameter should be a contingency table. The fact that your way of doing things works at all seems to be an undocumented feature.

The behaviour you're seeing (including your "funny" looking labels) is because many of the entries in your contingency table are zero, and something in the labelling code of mosaic is having a hard time with that.

To see this, convert your DataFrame to a contingency table:

In [161]: pd.crosstab(mydata.id1, mydata.id2)
Out[161]: 
id2          Angelica  AtmosFox  DXW_UID  EC93_uid  casuid01
id1                                                         
Musa_EC_9_3         0         0        0         3         0
Retention01         0         0        1         0         0
TGP                 1         0        0         0         0
default             0         1        3         0         2
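
To put a number on how sparse that table is, here is a minimal sketch that rebuilds the question's data and counts the empty cells. It assumes only pandas; the variable names `zero_cells` and `total_cells` are illustrative and not part of the original post.

```python
import pandas as pd

# Same 11 rows as in the question, written as plain lists.
mydata = pd.DataFrame({
    "id1": ["default", "default", "default", "default", "TGP", "Retention01",
            "default", "default", "Musa_EC_9_3", "Musa_EC_9_3", "Musa_EC_9_3"],
    "id2": ["DXW_UID", "AtmosFox", "DXW_UID", "DXW_UID", "Angelica", "DXW_UID",
            "casuid01", "casuid01", "EC93_uid", "EC93_uid", "EC93_uid"],
}, index=range(60, 71))

ct = pd.crosstab(mydata.id1, mydata.id2)

# Count the cells with no observations at all.
zero_cells = int((ct == 0).sum().sum())
total_cells = ct.size

print(f"{zero_cells} of {total_cells} cells are empty")  # 14 of 20 cells are empty
```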

Now add a "little bit" to all those zeros. The mosaic then works fine.

In [165]: ct = pd.crosstab(mydata.id1, mydata.id2)
In [166]: ctplus = ct + 1
In [167]: mosaic(ctplus.unstack())

Which results in a rather beautiful plot.

The tiny downside is that it's wrong: adding 1 to every cell distorts the proportions of the tiles. But you can remedy that by doing

ctplus = ct + 1e-8

to just add a tiny bit to all those zeros. The plot still works (but looks ugly because the labels on all those zero tiles of the mosaic are all on top of each other).
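
One possible way to tidy up the overlapping labels is sketched below. It is not part of the original answer. It assumes the overlapping text is the per-tile labels, and it uses the `labelizer` argument of `mosaic` to leave the padded tiles blank. The names `EPS`, `counts`, `labelizer` and `plot_padded_mosaic` are made up for this illustration, and the 0.5 threshold is an arbitrary cut-off between real counts (at least 1) and padded zeros.

```python
import pandas as pd

# Same 11 rows as in the question, written as plain lists.
mydata = pd.DataFrame({
    "id1": ["default", "default", "default", "default", "TGP", "Retention01",
            "default", "default", "Musa_EC_9_3", "Musa_EC_9_3", "Musa_EC_9_3"],
    "id2": ["DXW_UID", "AtmosFox", "DXW_UID", "DXW_UID", "Angelica", "DXW_UID",
            "casuid01", "casuid01", "EC93_uid", "EC93_uid", "EC93_uid"],
}, index=range(60, 71))

EPS = 1e-8  # the tiny offset suggested above

ct = pd.crosstab(mydata.id1, mydata.id2)

# Plain dict keyed by (id1, id2) tuples, with every count padded by EPS.
counts = {(r, c): float(ct.loc[r, c]) + EPS
          for r in ct.index for c in ct.columns}


def labelizer(key):
    """Label only tiles backed by real observations; padded tiles stay blank."""
    return "\n".join(key) if counts[key] >= 0.5 else ""


def plot_padded_mosaic():
    """Draw the mosaic from the padded counts (needs statsmodels and matplotlib)."""
    # Imported here so the helpers above can be checked without the plotting stack.
    from statsmodels.graphics.mosaicplot import mosaic
    return mosaic(counts, labelizer=labelizer)
```

Calling `plot_padded_mosaic()` returns the same `(figure, rects)` pair as the `mosaic` calls in the question. Whether the result looks acceptable will depend on the statsmodels and matplotlib versions in use.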
