Plotting binned correlation of two variables using common axis

Question

I have three lists that I have loaded into a pandas dataframe.

import pandas as pd

# location, variable1 and variable2 are the three equal-length lists
df = pd.DataFrame({'x': location})
df = df.assign(y1=variable1)
df = df.assign(y2=variable2)

I would like to plot the correlation of y1 with y2, with x as the common x-axis. That is, I would like to bin the y1 and y2 values according to x location, find the correlation of y1 with y2 within each bin, and then plot those correlations as a line across the whole x domain. My final plot will have correlation on the y-axis and location on the x-axis.

I have previously done something similar using the scipy binned_statistic function to plot conditional means, but I don't think I can easily extend that to correlations. I would also like to get a bit better at using pandas anyway, so I'm trying to avoid that route if at all possible.

I'm sure this has been asked before, but everything that I have come across seems to deal with plotting multiple distributions.

Recommended answer

I've more or less arrived at a solution. Implementing something similar to what was used here, I have:

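nbins = 20
# Quantile bins: each bin holds roughly the same number of observations
df['bins'] = pd.qcut(df['x'], q=nbins)
# Per-bin correlation matrix; keep the y1 rows and the y2 column, i.e. corr(y1, y2)
plotdatadf = df.groupby('bins')[['y1', 'y2']].corr().iloc[0::2, -1]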

This gives me a pandas Series holding the correlation coefficient of y1 and y2 for each bin. Because pd.qcut cuts at quantiles of x, the bins contain roughly equal numbers of observations rather than spanning equal widths of x.
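To check that the per-bin values behave as expected, here is a minimal sketch on synthetic data. The data, the seed, and the names rng, corr_by_bin and x_center are assumptions made for illustration and are not part of the original post. It computes one correlation per bin with Series.corr inside a groupby, which should give the same numbers as slicing the per-bin correlation matrix, and also takes the mean x of each bin as a possible plotting position.

```python
import numpy as np
import pandas as pd

# Synthetic data (assumption): the sign of the y1-y2 relationship flips at x = 0.
rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(-1, 1, n)
y1 = rng.normal(size=n)
y2 = np.sign(x) * y1 + 0.5 * rng.normal(size=n)
df = pd.DataFrame({'x': x, 'y1': y1, 'y2': y2})

nbins = 20
df['bins'] = pd.qcut(df['x'], q=nbins)

# One Pearson correlation per bin, computed directly from the two columns.
corr_by_bin = (
    df.groupby('bins', observed=True)[['y1', 'y2']]
    .apply(lambda g: g['y1'].corr(g['y2']))
)

# Mean x within each bin, usable as the x position of that bin's correlation.
x_center = df.groupby('bins', observed=True)['x'].mean()
```

With this data the correlation should be strongly negative in the leftmost bins and strongly positive in the rightmost ones, so a line of corr_by_bin against x_center would cross zero near x = 0.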

I can now go back to my previous dataframe and add another column, of the original length, holding these correlation values: every row receives the correlation of the bin it falls in (if a row is in bin[1], its corr is corr[1], and so on). This column can then be plotted as y against my already existing x as a line plot.
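One way to do that copying step is sketched below. The helper names add_bin_correlation and plot_bin_correlation, the bin_id column, and the demo data are hypothetical and not from the original post. The sketch also swaps the interval bins for integer bin codes (pd.qcut with labels=False), an assumption made here so that the per-bin result can be looked up with a plain Series.map.

```python
import numpy as np
import pandas as pd

def add_bin_correlation(df, nbins=20):
    """Return a copy of df with a 'bin_id' column and a per-row 'corr' column."""
    out = df.copy()
    # labels=False yields integer bin codes 0..nbins-1 instead of intervals.
    out['bin_id'] = pd.qcut(out['x'], q=nbins, labels=False)
    corr_by_bin = (
        out.groupby('bin_id')[['y1', 'y2']]
        .apply(lambda g: g['y1'].corr(g['y2']))
    )
    # Every row receives the correlation of the bin it falls in.
    out['corr'] = out['bin_id'].map(corr_by_bin)
    return out

def plot_bin_correlation(df_with_corr):
    """Line plot of the per-row correlation against x (requires matplotlib)."""
    import matplotlib.pyplot as plt  # imported here so the helper above needs only pandas
    ordered = df_with_corr.sort_values('x')
    ax = ordered.plot(x='x', y='corr', legend=False)
    ax.set_xlabel('x (location)')
    ax.set_ylabel('corr(y1, y2) within bin')
    return ax

# Demo on made-up data (assumption).
rng = np.random.default_rng(1)
demo = pd.DataFrame({'x': rng.uniform(0, 10, 500), 'y1': rng.normal(size=500)})
demo['y2'] = demo['y1'] + rng.normal(size=500)
result = add_bin_correlation(demo, nbins=10)
```

Because the correlation is constant within a bin, the resulting line is flat across each bin and jumps at the bin edges; plotting one point per bin instead would give a smoother-looking line.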
