Height values for each point in a plot

Views: 97 Published: 2020/10/18 2:58:21 Tags: r data-visualization scatter-plot kernel-density


Problem description

I have protein-protein interaction data in a data frame named s1m. Each DB and AD pair makes an interaction, and I can plot it as well:

> head(s1m)
     DB_num AD_num
[1,]      2   8153
[2,]      7   3553
[3,]      8   4812
[4,]     13   7838
[5,]     24   3315
[6,]     24   6012

Plot of the data looks like:

I then used code I found on this site to plot filled contour lines:

## compute 2D kernel density, see MASS book, pp. 130-131
require(MASS)
z <- kde2d(s1m[,1], s1m[,2], n=50)
plot(s1m, xlab="X label", ylab="Y label", pch=19, cex=.4)
filled.contour(z, drawlabels=FALSE, add=TRUE)

It gave me the resulting image (minus the scribbles):

MY QUESTION: I need to annotate each line of data in the original s1m data frame with a number corresponding to its height on the contour map (hence my scribbles on the image above). I think the list z has the values I am looking for, but I am not sure.

In the end, I would like my data to look something like this, so I can study the protein interactions in groups:

         DB_num AD_num   height
    [1,]      2   8153        1
    [2,]      7   3553        1
    [3,]      8   4812        3
    [4,]     13   7838        6
    [5,]     24   3315        2
    [6,]     24   6012        etc.

Solution

This is one option if you want the actual height rather than the bin each point is assigned to:

## dummy data
DF <- data.frame(DB_num = rnorm(10000), AD_num = rnorm(10000))

require("MASS")

kde <- kde2d(DF[,1], DF[,2], n = 50)
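
The object returned by kde2d can be inspected directly before reshaping it. The following is a minimal sketch, assuming its own dummy data (x, y and k are illustrative names, not part of the original answer); it uses a different grid size per axis so that the orientation of z is easy to see.

```r
library(MASS)  # provides kde2d

set.seed(42)
x <- rnorm(500)
y <- rnorm(500, sd = 3)

# Different grid sizes per axis make the orientation of z visible
k <- kde2d(x, y, n = c(30, 40))

names(k)     # the components of the result: x, y and z
length(k$x)  # number of grid values along x
length(k$y)  # number of grid values along y
dim(k$z)     # rows follow the x grid, columns follow the y grid
```

With equal grid sizes (n = 50 in the answer) the two dimensions cannot be told apart by shape alone, which is why the unequal sizes are used here.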

Note that kde2d returns a component z, which is a matrix with (in this case) 50 rows and 50 columns, where rows correspond to the x grid values and columns to the y grid values. Because a matrix is just a vector filled column by column, x varies fastest when z is unwound. We can exploit this: repeat the whole x vector once per column, repeat each y value once per row, then unwind kde$z

dd <- dim(kde$z)
## x cycles fastest (column-major storage); each y value is held for one full column
res <- data.frame(DB_num = rep(kde$x, times = dd[2]),
                  AD_num = rep(kde$y, each = dd[1]),
                  height = as.numeric(kde$z))

This produces

> head(res)
        DB_num      AD_num                                  height
1 -3.582508378 -3.79074271 0.0000000000000000000000000006907447484
2 -3.429230262 -3.79074271 0.0000000000000000000000002951259863229
3 -3.275952146 -3.79074271 0.0000000000000000000000558203373144190
4 -3.122674029 -3.79074271 0.0000000000000000000055565720524140235
5 -2.969395913 -3.79074271 0.0000000000000000014967010810961022503
6 -2.816117797 -3.79074271 0.0000000000000008159370528768207499471

To get the bins, you'd need to follow what filled.contour does, which is to form breaks via

nlevels <- 20 ## default
brks <- pretty(range(res$height), nlevels)

> brks
 [1] 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.11 0.12 0.13 0.14
[16] 0.15 0.16

Then use cut to assign each height to a bin on the basis of brks, something like

res <- transform(res, bin = as.numeric(cut(height, brks)))

Which gives

> head(res)
        DB_num      AD_num                                  height bin
1 -3.582508378 -3.79074271 0.0000000000000000000000000006907447484   1
2 -3.429230262 -3.79074271 0.0000000000000000000000002951259863229   1
3 -3.275952146 -3.79074271 0.0000000000000000000000558203373144190   1
4 -3.122674029 -3.79074271 0.0000000000000000000055565720524140235   1
5 -2.969395913 -3.79074271 0.0000000000000000014967010810961022503   1
6 -2.816117797 -3.79074271 0.0000000000000008159370528768207499471   1

You'll probably want to check the details of ?cut to determine the behaviour at bin boundaries, but that should get you close enough.
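
The heights above are defined on the 50 x 50 evaluation grid, not at the original observations. To annotate each row of the original data, as the question asks, one possible approach is to look up the nearest grid node for every point. The following is a minimal sketch under stated assumptions: DF is dummy data standing in for s1m, nearest is a hypothetical helper written for this example, and a nearest-node lookup only approximates the interpolated surface that the contour plot draws.

```r
library(MASS)  # provides kde2d

set.seed(1)
# Dummy data standing in for s1m (an assumption, not the asker's data)
DF <- data.frame(DB_num = rnorm(1000), AD_num = rnorm(1000))

kde <- kde2d(DF$DB_num, DF$AD_num, n = 50)

# Hypothetical helper: index of the grid value closest to each element of v
nearest <- function(v, grid) {
  vapply(v, function(p) which.min(abs(grid - p)), integer(1))
}

ix <- nearest(DF$DB_num, kde$x)  # row index into kde$z (x direction)
iy <- nearest(DF$AD_num, kde$y)  # column index into kde$z (y direction)

# Indexing with a two-column matrix picks kde$z[ix[i], iy[i]] for each row i
DF$height <- kde$z[cbind(ix, iy)]

# Same style of breaks as in the answer; include.lowest = TRUE keeps a height
# equal to the smallest break from becoming NA
brks <- pretty(range(kde$z), 20)
DF$bin <- as.numeric(cut(DF$height, brks, include.lowest = TRUE))

head(DF)
```

A finer grid (a larger n in kde2d) reduces the error of the nearest-node lookup at the cost of a bigger z matrix; interpolating between grid nodes would be a closer match to the drawn contours.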
