Misplaced points in ggplot
Question
I'm reading in a file like so:
genes<-read.table("goi.txt",header=TRUE, row.names=1)
control<-log2(1+(genes[,1]))
experiment<-log2(1+(genes[,2]))
And plotting them as a simple scatter plot in ggplot:
ggplot(genes, aes(control, experiment)) +
xlim(0, 20) +
ylim(0, 20) +
geom_text(aes(control, experiment, label=row.names(genes)),size=3)
However, the points are incorrectly placed on my plot (see attached image).
Here is my data:
control expt
gfi1 0.189634 3.16574
Ripply3 13.752000 34.40630
atonal 2.527670 4.97132
sox2 16.584300 42.73240
tbx15 0.878446 3.13560
hes8 0.830370 8.17272
Tlx1 1.349330 7.33417
pou4f1 3.763400 9.44845
pou3f2 0.444326 2.92796
neurog1 13.943800 24.83100
sox3 17.275700 26.49240
isl2 3.841100 10.08640
As you can see, 'Ripply3' is clearly in the wrong position on the graph!
Am I doing something really stupid?
Recommended answer
The aes() function used by ggplot looks first inside the data frame you provide via data = genes. This is why you can (and should) specify variables only by bare column names like control; ggplot will automatically know where to find the data.
But R's scoping system is such that if nothing by that name is found in the current environment, R looks in the parent environment, and so on up the chain to the global environment, stopping as soon as it finds something by that name.
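The lookup order described above can be mimicked in base R with eval(), which accepts a data frame and a fallback environment. This is only a sketch of the principle, not ggplot's actual internals; the toy data frame df and its values are made up for illustration.

```r
# A toy data frame with a column named `control` but no `experiment` column.
df <- data.frame(control = c(1, 2, 3))

# Free-standing vectors with the same names, defined outside the data frame.
control    <- c(10, 20, 30)
experiment <- c(100, 200, 300)

# eval() looks in the data frame's columns first,
# then falls back to the enclosing environment.
x <- eval(quote(control), df, environment())     # found in df
y <- eval(quote(experiment), df, environment())  # not in df, found outside

print(x)  # 1 2 3
print(y)  # 100 200 300
```

The column in df masks the outside vector of the same name, while the name missing from df silently resolves to the outside vector.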
So aes(control, experiment) looks for variables named control and experiment inside the data frame genes. It finds the original, untransformed control variable, but of course there is no experiment variable in genes (the second column is named expt). So it continues up the chain of environments until it hits the global environment, where you have defined the free-standing variable experiment, and uses that. The plot therefore pairs raw control values on the x-axis with log-transformed values on the y-axis.
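To make the mismatch concrete, here is a small sketch using only the Ripply3 row from the question's data. The names x_plotted, y_plotted, and x_intended are hypothetical and introduced just for this illustration.

```r
# Values for Ripply3 copied from the question's data.
genes <- data.frame(control = 13.752, expt = 34.4063, row.names = "Ripply3")

# The free-standing vectors the question created outside the data frame.
control    <- log2(1 + genes[, 1])
experiment <- log2(1 + genes[, 2])

# What aes(control, experiment) ends up using:
x_plotted <- genes$control  # raw column inside genes masks the log2 vector
y_plotted <- experiment     # no such column in genes, so the outside vector is used

# What was intended for x: the value on the log2(1 + x) scale.
x_intended <- log2(1 + genes$control)

print(c(x_plotted = x_plotted, y_plotted = y_plotted, x_intended = x_intended))
```

Ripply3 is drawn at x = 13.752 instead of roughly 3.88, while its y value of roughly 5.15 is on the log scale as intended, which would explain why the label appears far to the right of where it belongs.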
You meant to do something more like this:
genes$controlLog <- log2(1+(genes[,1]))
genes$exptLog <- log2(1+(genes[,2]))
Followed by:
ggplot(genes, aes(controlLog, exptLog)) +
xlim(0, 20) +
ylim(0, 20) +
geom_text(aes(controlLog, exptLog, label=row.names(genes)),size=3)
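Because a name missing from the data frame falls back to outside variables without any warning, one possible guard is to check the column names before plotting. This is a minimal sketch, not part of the original answer; it uses three rows copied from the question's data, and the names wanted and missing_cols are hypothetical.

```r
# Three rows copied from the question's data.
genes <- data.frame(
  control = c(0.189634, 13.752, 16.5843),
  expt    = c(3.16574, 34.4063, 42.7324),
  row.names = c("gfi1", "Ripply3", "sox2")
)

# Store the transformed values as columns, as in the answer.
genes$controlLog <- log2(1 + genes$control)
genes$exptLog    <- log2(1 + genes$expt)

# Names we intend to use inside aes().
wanted <- c("controlLog", "exptLog")

# Any name listed here would be looked up outside the data frame instead.
missing_cols <- setdiff(wanted, names(genes))
stopifnot(length(missing_cols) == 0)
```

Running the same check with the names from the question, c("control", "experiment"), would report "experiment" as missing, which is exactly the name that leaked out to the global environment.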