Misplaced points in ggplot


Question

I'm reading in a file like so:

genes<-read.table("goi.txt",header=TRUE, row.names=1)
control<-log2(1+(genes[,1]))
experiment<-log2(1+(genes[,2]))

And plotting them as a simple scatter plot in ggplot:

ggplot(genes, aes(control, experiment)) +
    xlim(0, 20) + 
    ylim(0, 20) +
    geom_text(aes(control, experiment, label=row.names(genes)),size=3)

However, the points are incorrectly placed on my plot (see attached image).

Here is my data:

          control     expt
gfi1     0.189634  3.16574
Ripply3 13.752000 34.40630
atonal   2.527670  4.97132
sox2    16.584300 42.73240
tbx15    0.878446  3.13560
hes8     0.830370  8.17272
Tlx1     1.349330  7.33417
pou4f1   3.763400  9.44845
pou3f2   0.444326  2.92796
neurog1 13.943800 24.83100
sox3    17.275700 26.49240
isl2     3.841100 10.08640

As you can see, 'Ripply3' is clearly in the wrong position on the graph!

Am I doing something really stupid?

Recommended answer

The aes() function used by ggplot looks first inside the data frame you provide via data = genes. This is why you can (and should) specify variables only by bare column names like control; ggplot will automatically know where to find the data.

But R's scoping system is such that if nothing by that name is found in the current environment, R will look in the parent environment, and so on up the chain to the global environment, until it finds something by that name.

So aes(control, experiment) looks for variables named control and experiment inside the data frame genes. It finds the original, untransformed control column there, but of course there is no experiment column in genes (the second column is named expt). So it continues up the chain of environments until it hits the global environment, where you defined the free-standing variable experiment, and uses that. The plot therefore pairs raw control values on the x-axis with log-transformed values on the y-axis, which is why 'Ripply3' lands in the wrong place.
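The lookup order described above can be reproduced without ggplot. The following is a minimal sketch in base R, assuming a two-row stand-in for the data in the question; it uses eval() with a data frame as the first place to search and the surrounding environment as the fallback, which only approximates what aes() does internally.

```r
# Two rows copied from the question's data, enough to show the effect
genes <- data.frame(control = c(0.189634, 13.752),
                    expt    = c(3.16574, 34.4063),
                    row.names = c("gfi1", "Ripply3"))

# Free-standing vectors, defined outside the data frame as in the question
control    <- log2(1 + genes[, 1])
experiment <- log2(1 + genes[, 2])

# Look each name up in the data frame first, then in the enclosing environment
x <- eval(quote(control),    genes, environment())
y <- eval(quote(experiment), genes, environment())

identical(x, genes$control)  # TRUE: the raw column inside genes wins
identical(y, experiment)     # TRUE: no such column, so the outside vector is used
```

The x values come from the untransformed column and the y values from the log-transformed vector, which is the mismatch seen in the plot.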

You meant to do something more like this:

genes$controlLog <- log2(1+(genes[,1]))
genes$exptLog <- log2(1+(genes[,2]))

Followed by:

ggplot(genes, aes(controlLog, exptLog)) +
     xlim(0, 20) + 
     ylim(0, 20) +
     geom_text(aes(controlLog, exptLog, label=row.names(genes)),size=3)
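As an alternative sketch (not part of the original answer), the transformation can be written directly inside aes(), so that every name refers to a column of the data frame and nothing depends on variables defined elsewhere. The gene column and the two-row data frame below are assumptions made for illustration.

```r
library(ggplot2)

# Two rows copied from the question's data
genes <- data.frame(control = c(0.189634, 13.752),
                    expt    = c(3.16574, 34.4063),
                    row.names = c("gfi1", "Ripply3"))

# Hypothetical helper column: keep the labels inside the data frame too
genes$gene <- row.names(genes)

# Both axes are computed from columns of genes, so both are log-transformed
p <- ggplot(genes, aes(log2(1 + control), log2(1 + expt), label = gene)) +
  xlim(0, 20) +
  ylim(0, 20) +
  geom_text(size = 3)
```

Printing p draws the plot. Keeping the labels in a column means the label aesthetic is resolved from the data frame as well.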
