Height values for each point in a plot
I have protein-protein interaction data in a data frame called s1m. Each DB and AD pair makes up an interaction, and I can plot the data as well:
> head(s1m)
DB_num AD_num
[1,] 2 8153
[2,] 7 3553
[3,] 8 4812
[4,] 13 7838
[5,] 24 3315
[6,] 24 6012
Plot of the data looks like:
I then used code I found on this site to plot filled contour lines:
## compute 2D kernel density, see MASS book, pp. 130-131
require(MASS)
z <- kde2d(s1m[,1], s1m[,2], n=50)
plot(s1m, xlab="X label", ylab="Y label", pch=19, cex=.4)
filled.contour(z, drawlabels=FALSE, add=TRUE)
It gave me the resulting image (minus the scribbles):
MY QUESTION: I need to annotate each line of data in the original s1m data frame with a number corresponding to its height on the contour map (hence my scribbles on the image above). I think the list z has the values I am looking for, but I am not sure.
In the end I would like my data to look something like this, so that I can study the protein interactions in groups:
DB_num AD_num height
[1,] 2 8153 1
[2,] 7 3553 1
[3,] 8 4812 3
[4,] 13 7838 6
[5,] 24 3315 2
[6,] 24 6012 etc.
This is one option if you want the actual height, not the bin each point is assigned to:
## dummy data
DF <- data.frame(DB_num = rnorm(10000), AD_num = rnorm(10000))
require("MASS")
kde <- kde2d(DF[,1], DF[,2], n = 50)
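For orientation, here is a small sketch of how the object returned by kde2d can be inspected. It uses the same kind of dummy data; the seed, the smaller sample size, and the indices i and j are assumptions chosen only for illustration.

```r
library(MASS)

set.seed(1)  # assumption: fixed seed only so the sketch is reproducible
DF <- data.frame(DB_num = rnorm(1000), AD_num = rnorm(1000))
kde <- kde2d(DF[, 1], DF[, 2], n = 50)

names(kde)   # components of the returned list: "x" "y" "z"
dim(kde$z)   # 50 x 50 grid of density estimates

# z[i, j] is the density estimate at (x[i], y[j]). R stores matrices
# column by column, so element (i, j) sits at position i + (j - 1) * nrow.
i <- 7; j <- 12  # arbitrary illustrative grid indices
flat <- as.numeric(kde$z)
pos <- i + (j - 1) * nrow(kde$z)
stopifnot(identical(flat[pos], kde$z[i, j]))
```

The last check shows that, in the flattened vector, the x index varies fastest and the y index changes only once per full column.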
Note that kde2d returns a list whose component z is a matrix with (in this case) 50 rows and 50 columns, where rows correspond to the x grid values and columns to the y grid values. As a matrix is just a vector with a dim attribute, and the data are filled by columns, we can exploit this: x varies fastest, so repeat the whole x vector once per column, repeat each y value once per row (n = 50 here), then unwind kde$z
dd <- dim(kde$z)
res <- data.frame(DB_num = rep(kde$x, times = dd[2]),
AD_num = rep(kde$y, each = dd[1]),
height = as.numeric(kde$z))
This produces (the first 50 rows all share the first y grid value, because x varies fastest)
> head(res)
DB_num AD_num height
1 -3.582508378 -3.79074271 0.0000000000000000000000000006907447484
2 -3.429230262 -3.79074271 0.0000000000000000000000002951259863229
3 -3.275952146 -3.79074271 0.0000000000000000000000558203373144190
4 -3.122674029 -3.79074271 0.0000000000000000000055565720524140235
5 -2.969395913 -3.79074271 0.0000000000000000014967010810961022503
6 -2.816117797 -3.79074271 0.0000000000000008159370528768207499471
To get the bins, you'd need to follow what filled.contour does, which is to form breaks via
nlevels <- 20 ## default
brks <- pretty(range(res$height), nlevels)
> brks
[1] 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.11 0.12 0.13 0.14
[16] 0.15 0.16
Then use cut to assign each height to a bin on the basis of brks, something like
res <- transform(res, bin = as.numeric(cut(height, brks)))
which gives
> head(res)
DB_num AD_num height bin
1 -3.582508378 -3.79074271 0.0000000000000000000000000006907447484 1
2 -3.429230262 -3.79074271 0.0000000000000000000000002951259863229 1
3 -3.275952146 -3.79074271 0.0000000000000000000000558203373144190 1
4 -3.122674029 -3.79074271 0.0000000000000000000055565720524140235 1
5 -2.969395913 -3.79074271 0.0000000000000000014967010810961022503 1
6 -2.816117797 -3.79074271 0.0000000000000008159370528768207499471 1
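You'll probably want to check the details of ?cut to determine the behaviour on the boundary of a bin (by default the intervals are closed on the right, and a value equal to the lowest break becomes NA unless include.lowest = TRUE), but that should get you close enough.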
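The steps above give heights and bins for the grid points, while the question asks for a value per original row. One possible way to bridge that gap is sketched below: each observation is assigned the height of its nearest grid cell. This nearest-cell lookup is an assumption of this sketch rather than part of the answer above (bilinear interpolation between cells would be a finer alternative), and s1m here is simulated stand-in data, not the asker's real data; the helper nearest() is hypothetical.

```r
library(MASS)

set.seed(42)  # assumption: seed fixed only for reproducibility
# Hypothetical stand-in for the question's s1m, with the same column names
s1m <- cbind(DB_num = rnorm(500), AD_num = rnorm(500))

kde <- kde2d(s1m[, 1], s1m[, 2], n = 50)

# Hypothetical helper: index of the nearest grid value for each coordinate.
# kde2d builds an equally spaced grid, so rounding the fractional
# position is enough; the result is clamped to the valid index range.
nearest <- function(v, grid) {
  idx <- round((v - grid[1]) / (grid[2] - grid[1])) + 1
  pmin(pmax(idx, 1), length(grid))
}
ix <- nearest(s1m[, 1], kde$x)
iy <- nearest(s1m[, 2], kde$y)

# Indexing a matrix with a two-column matrix picks z[ix[k], iy[k]] per row
height <- kde$z[cbind(ix, iy)]

# Same breaks as before; include.lowest = TRUE keeps a height equal to
# the lowest break from turning into NA
brks <- pretty(range(kde$z), 20)
bin <- as.numeric(cut(height, brks, include.lowest = TRUE))

annotated <- data.frame(s1m, height = height, bin = bin)
head(annotated)
```

The result has one row per original observation, with the height and bin columns added, which is the layout the question asks for. How closely the bins match the plotted bands depends on the grid resolution n.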