How to create a density plot in matplotlib?

Question

In R I can create the desired output by doing:

data = c(rep(1.5, 7), rep(2.5, 2), rep(3.5, 8),
         rep(4.5, 3), rep(5.5, 1), rep(6.5, 8))
plot(density(data, bw=0.5))

In python (with matplotlib) the closest I got was with a simple histogram:

import matplotlib.pyplot as plt
data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8
plt.hist(data, bins=6)
plt.show()

I also tried the normed=True parameter but couldn't get anything other than trying to fit a gaussian to the histogram.

My latest attempts were around scipy.stats and gaussian_kde, following examples on the web, but I've been unsuccessful so far.

Recommended answer

Sven has shown how to use the class gaussian_kde from Scipy, but you will notice that it doesn't look quite like what you generated with R. This is because gaussian_kde tries to infer the bandwidth automatically. You can play with the bandwidth in a way by changing the function covariance_factor of the gaussian_kde class. First, here is what you get without changing that function:
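The default behaviour described above can be reproduced with a short sketch like the following. It is an illustration rather than the answerer's original code, and it assumes a SciPy version in which gaussian_kde exposes the `factor` attribute. For one-dimensional data the default rule (Scott's rule) gives a factor of n**(-1/5), which is roughly 0.51 for these 29 points.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8

# No bandwidth argument: gaussian_kde picks the factor itself (Scott's rule)
default_density = gaussian_kde(data)
default_xs = np.linspace(0, 8, 200)
default_ys = default_density(default_xs)

# The automatically chosen factor; for 1-D data this is n ** (-1/5)
print(default_density.factor)

fig, ax = plt.subplots()
ax.plot(default_xs, default_ys)
# plt.show()  # uncomment to display the curve in an interactive session
```

The curve this draws is noticeably smoother than the one from R with bw=0.5, because the effective kernel width is larger.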

However, if I use the following code:

import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import gaussian_kde

data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8
density = gaussian_kde(data)
xs = np.linspace(0, 8, 200)
# Override the bandwidth factor (the default for this data is about 0.5)
density.covariance_factor = lambda: .25
# Recompute the kernel covariance so the new factor takes effect
density._compute_covariance()
plt.plot(xs, density(xs))
plt.show()

I get a curve that is pretty close to what you are getting from R. What have I done? gaussian_kde uses a changeable function, covariance_factor, to calculate its bandwidth. Before changing the function, the value returned by covariance_factor for this data was about .5. Lowering this lowered the bandwidth. I had to call _compute_covariance after changing that function so that all of the factors would be calculated correctly. It isn't an exact correspondence with the bw parameter from R, but hopefully it helps you get in the right direction.
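For a closer match to R's bw, one option is to convert the bandwidth rather than tune the factor by eye. In R's density(), bw is the standard deviation of the smoothing kernel, whereas gaussian_kde scales its factor by the sample standard deviation of the data. The sketch below assumes a SciPy version whose gaussian_kde accepts a scalar bw_method (this is not what the answer above uses), and the names r_bw and kde_factor are illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

data = np.array([1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8)

r_bw = 0.5  # the bw value passed to density() in the R snippet

# gaussian_kde uses kernel std = factor * sample std (ddof=1),
# so divide the target kernel std by the sample std to get the factor.
kde_factor = r_bw / data.std(ddof=1)
matched_density = gaussian_kde(data, bw_method=kde_factor)

# Sanity check: the resulting kernel standard deviation should equal r_bw
kernel_sd = float(np.sqrt(matched_density.covariance[0, 0]))
print(kernel_sd)

xs = np.linspace(0, 8, 200)
fig, ax = plt.subplots()
ax.plot(xs, matched_density(xs))
# plt.show()  # uncomment to display the curve in an interactive session
```

This avoids overriding covariance_factor and calling the private _compute_covariance method. The curve should be close to R's, though not necessarily identical, since the two implementations evaluate the estimate differently.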
