How to locate the median in a (seaborn) KDE plot?

Question

I am trying to do a Kernel Density Estimation (KDE) plot with seaborn and locate the median. The code looks something like this:

import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt

sns.set_palette("hls", 1)
data = np.random.randn(30)
sns.kdeplot(data, fill=True)  # fill replaces the deprecated shade argument

# x_median, y_median = magic_function()
# plt.vlines(x_median, 0, y_median)

plt.show()

As you can see I need a magic_function() to fetch the median x and y values from the kdeplot. Then I would like to plot them with e.g. vlines. However, I can't figure out how to do that. The result should look something like this (obviously the black median bar is wrong here):

I guess my question is not strictly related to seaborn and also applies to other kinds of matplotlib plots. Any ideas are greatly appreciated.

Recommended answer

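You need to:

  1. Extract the data of the KDE line
  2. Integrate it to compute the cumulative distribution function (CDF)
  3. Find the value at which the CDF equals 1/2; that value is the median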

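import numpy as np
from scipy import integrate
import seaborn as sns
import matplotlib.pyplot as plt

sns.set_palette("hls", 1)
data = np.random.randn(30)

# Draw the KDE as a plain line so its data can be read back from the Axes;
# recent seaborn versions draw a filled KDE as a polygon, not a line
ax = sns.kdeplot(data)
line = ax.get_lines()[0]
x, y = line.get_data()

# Shade the area under the curve by hand
ax.fill_between(x, y, color=line.get_color(), alpha=0.25)

# Mind the argument order: y comes first, then x.
# initial=0 prepends a 0 so the result has the same length as x.
# (cumulative_trapezoid was named cumtrapz in older SciPy releases.)
cdf = integrate.cumulative_trapezoid(y, x, initial=0)

# Index of the grid point where the CDF is closest to 1/2
nearest_05 = np.abs(cdf - 0.5).argmin()

x_median = x[nearest_05]
y_median = y[nearest_05]

plt.vlines(x_median, 0, y_median)
plt.show()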
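
The code above picks the grid point whose CDF value is nearest to 1/2, so the result is only as fine as the grid the curve was drawn on. As a possible refinement, the crossing can be interpolated instead. The sketch below is not part of the original answer: `median_from_curve` is a hypothetical helper name, it uses only NumPy, and it assumes the curve is non-negative and sampled at increasing x values. It also rescales the CDF so that it ends at 1, which is an assumption about how to handle the small amount of area lost because the plotted curve is truncated.

```python
import numpy as np


def median_from_curve(x, y):
    """Return (x_median, y_median) for a density curve sampled at points x.

    Hypothetical helper for illustration; not part of seaborn or matplotlib.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Cumulative trapezoidal integral, starting at 0 so it has the same length as x
    steps = 0.5 * (y[1:] + y[:-1]) * np.diff(x)
    cdf = np.concatenate(([0.0], np.cumsum(steps)))

    # Assumption: rescale so the CDF ends at exactly 1, since a truncated
    # curve can integrate to slightly less than 1
    cdf /= cdf[-1]

    # Linearly interpolate the point where the CDF crosses 1/2
    x_median = np.interp(0.5, cdf, x)
    # Height of the curve at that point, also by linear interpolation
    y_median = np.interp(x_median, x, y)
    return x_median, y_median


# Check on a curve with a known median: a normal density centred at 1.0
grid = np.linspace(-5.0, 7.0, 200)
density = np.exp(-0.5 * (grid - 1.0) ** 2) / np.sqrt(2.0 * np.pi)
x_median, y_median = median_from_curve(grid, density)
print(x_median, y_median)
```

With a KDE plot, the `x` and `y` arrays read back from the plotted line would be passed in place of `grid` and `density`, and the returned pair can go straight into `plt.vlines(x_median, 0, y_median)`. The same helper should work for any density-like curve drawn with matplotlib, since it only needs the line data.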
