Plot pandas dataframe containing NaNs


Question

I have GPS data of ice speed from three different GPS receivers. The data are in a pandas dataframe indexed by Julian day (counted incrementally from the start of 2009).

This is a subset of the data (the main dataset is 3487235 rows...):

                    R2          R7         R8
1235.000000 116.321959  100.805197  96.519977
1235.000116 NaN         100.771133  96.234957
1235.000231 NaN         100.584559  97.249262
1235.000347 118.823610  100.169055  96.777833
1235.000463 NaN         99.753551   96.598350
1235.000579 NaN         99.338048   95.283989
1235.000694 113.995003  98.922544   95.154067

The dataframe format is:


Index: 6071320 entries, 127.67291667 to 1338.51805556
Data columns:
R2    3487235  non-null values
R7    3875864  non-null values
R8    1092430  non-null values
dtypes: float64(3)

R2 was sampled at a different rate from R7 and R8, hence the NaNs, which appear systematically at that spacing.

Calling df.plot() on the whole dataframe (or on indexed row locations of it) plots R7 and R8 fine, but doesn't plot R2. Similarly, df.R2.plot() on its own draws nothing. The only way I have found to plot R2 is df.R2.dropna().plot(), but this also removes the NaNs that signify periods of no data (rather than just a coarser sampling frequency than the other receivers).

Has anyone else come across this? Any ideas on the problem would be gratefully received :)

Recommended answer

The reason you're not seeing anything is that the default plot style is a line only. The line is interrupted at every NaN, so a segment is drawn only where two or more consecutive values are present, and that never happens for R2 in your data. You need to change the plotting style, and which style to use depends on what you want to see.
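The effect can be checked without drawing anything. The sketch below is only an illustration on made-up numbers shaped like the sample in the question; the helper name count_drawable_segments is hypothetical and not part of pandas. It counts adjacent rows where both values are present, which is what a plain line needs in order to draw a segment.

```python
import numpy as np
import pandas as pd


def count_drawable_segments(series):
    """Count pairs of adjacent rows in which both values are non-NaN."""
    present = series.notna()
    # A segment exists between row i-1 and row i only if both are present.
    return int((present & present.shift(1, fill_value=False)).sum())


# Made-up values: R2 has a reading on every third row, R7 on every row.
index = 1235.0 + np.arange(7) * 0.000116
df = pd.DataFrame(
    {
        "R2": [116.3, np.nan, np.nan, 118.8, np.nan, np.nan, 114.0],
        "R7": [100.8, 100.8, 100.6, 100.2, 99.8, 99.3, 98.9],
    },
    index=index,
)

segments = {name: count_drawable_segments(df[name]) for name in df.columns}
print(segments)
```

In this toy frame R2 has no adjacent pair of values, so a line-only plot has nothing to connect, while R7 has a segment between every pair of rows.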

For starters, try adding:

.plot(marker='o')

That should make all data points appear as circles. The plot easily gets cluttered, so adjusting markersize, edgecolor, etc. might be useful. I'm not fully used to how pandas uses matplotlib, so I often switch to matplotlib directly when plots get more complicated, e.g.:

# The Julian-day index is numeric, so pass it directly; .to_pydatetime() applies only to a DatetimeIndex.
plt.plot(df.R2.index, df.R2, 'o-')
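If a connected line is still wanted for R2, one possible approach (not from the answer above, and only a sketch) is to drop the NaNs and then put a single NaN back inside every gap that is longer than the normal sampling spacing. The line then bridges the systematic NaNs but still breaks over real no-data periods. The function name keep_long_gaps, the max_gap threshold and the sample values are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def keep_long_gaps(series, max_gap):
    """Drop NaNs, then re-insert one NaN in the middle of each long gap.

    max_gap is in index units (days here) and is an assumed threshold:
    gaps wider than this are treated as real no-data periods.
    """
    valid = series.dropna()
    idx = valid.index.to_numpy(dtype=float)
    steps = np.diff(idx)
    is_long = steps > max_gap
    if not is_long.any():
        return valid
    # Midpoint of every gap that is too wide to be ordinary sampling spacing.
    midpoints = idx[:-1][is_long] + steps[is_long] / 2.0
    gap_marks = pd.Series(np.nan, index=midpoints, name=valid.name)
    return pd.concat([valid, gap_marks]).sort_index()


# Made-up series: a reading every 3 index units, with an outage from 6 to 20.
r2 = pd.Series(
    [1.0, np.nan, np.nan, 2.0, np.nan, np.nan, 3.0, 4.0, np.nan, np.nan, 5.0],
    index=[0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 20.0, 21.0, 22.0, 23.0],
    name="R2",
)

bridged = keep_long_gaps(r2, max_gap=5.0)
print(bridged)
```

Plotting the result with bridged.plot() should then give a line that connects the regular R2 samples and stops at the outage. The threshold has to be picked from the known sampling interval of the receiver.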
