Dash in column name yields "object not found" error


Question

I have a function that generates scatter plots from data, with an argument that selects which column is used to color the points. Here is a simplified version:

library(ggplot2)

plot_gene <- function (df, gene) {
   ggplot(df, aes(x, y)) + 
     geom_point(aes_string(col = gene)) +
     scale_color_gradient()
}

where df is a data.frame with columns x and y, followed by a set of columns named after genes. This works fine for most gene names; however, some names contain dashes, and these fail:

print(plot_gene(df, "Gapdh")) # great!
print(plot_gene(df, "H2-Aa")) # Error: object "H2" not found

It appears the gene string is being parsed as an expression ("H2-Aa" becomes H2 - Aa). How can I get around this? Is there a way to indicate that a string should not be parsed and evaluated by aes_string?

If you need some input to play with, this fails in the same way as my data:

df <- data.frame(c(1,2), c(2,1), c(1,2), c(2,1))
colnames(df) <- c("x", "y", "Gapdh", "H2-Aa")

For my real data, I am using read.table(..., header=TRUE) and get column names with dashes because the raw data files have them.

Recommended answer

Normally, R tries very hard to ensure that the column names in your data.frame are valid variable names. Non-standard column names (those that are not valid variable names) cause problems with functions that rely on non-standard evaluation. When forced to use such names, you usually have to wrap them in backticks. In the normal case

ggplot(df, aes(x, y)) + 
  geom_point(aes(col = H2-Aa)) +
  scale_color_gradient()
# Error in FUN(X[[i]], ...) : object 'H2' not found

will return an error, but

ggplot(df, aes(x, y)) + 
  geom_point(aes(col = `H2-Aa`)) +
  scale_color_gradient()

will work.
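The difference between the two forms can be checked in base R without ggplot2. The following is a minimal sketch, not part of the original answer; the variable names (df_checked, df_raw, plain, quoted) are made up for illustration. It shows how make.names() sanitizes the name when check.names = TRUE, and how the bare and backticked strings parse differently.

```r
# Base R only; no ggplot2 needed for this illustration.

# make.names() is the sanitizer applied when check.names = TRUE
make.names("H2-Aa")                  # "H2.Aa"

# data.frame() sanitizes names by default; check.names = FALSE keeps the dash
df_checked <- data.frame(`H2-Aa` = 1:2)
df_raw     <- data.frame(`H2-Aa` = 1:2, check.names = FALSE)
names(df_checked)                    # "H2.Aa"
names(df_raw)                        # "H2-Aa"

# Unquoted, the text parses as a call to `-` with arguments H2 and Aa
plain <- parse(text = "H2-Aa")[[1]]
class(plain)                         # "call"
all.vars(plain)                      # "H2" "Aa"

# Wrapped in backticks, it parses as a single symbol
quoted <- parse(text = "`H2-Aa`")[[1]]
class(quoted)                        # "name"
identical(quoted, as.name("H2-Aa"))  # TRUE
```

This is why the error mentions H2 rather than H2-Aa: the subtraction is evaluated, and H2 is the first operand that cannot be found.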

If you really want to, you can paste backticks around the name:

geom_point(aes_string(col = paste0("`", gene, "`")))

or you could treat it as a symbol from the get-go and use aes_q instead:

geom_point(aes_q(col = as.name(gene)))

Recent releases of ggplot2 (3.0.0 and later) support unquoting with !! inside aes(), so you can avoid aes_string and aes_q altogether:

geom_point(aes(col = !!rlang::sym(gene)))
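Putting the last suggestion back into the function from the question might look like the sketch below. It assumes ggplot2 3.0.0 or later and that the rlang package is installed; the stopifnot() check on the column name is an addition for illustration and is not part of the original answer.

```r
library(ggplot2)

# Sketch of plot_gene() without aes_string();
# assumes ggplot2 >= 3.0.0 and the rlang package are available.
plot_gene <- function(df, gene) {
  # Added for illustration: fail early if the column is missing
  stopifnot(is.character(gene), length(gene) == 1, gene %in% names(df))

  ggplot(df, aes(x, y)) +
    # rlang::sym() turns the string into a symbol; !! unquotes it inside aes()
    geom_point(aes(colour = !!rlang::sym(gene))) +
    scale_colour_gradient()
}

# check.names = FALSE keeps the dash in "H2-Aa"
df <- data.frame(
  x = c(1, 2), y = c(2, 1),
  Gapdh = c(1, 2), `H2-Aa` = c(2, 1),
  check.names = FALSE
)

p <- plot_gene(df, "H2-Aa")
```

Calling print(p) draws the plot. Because the string is converted directly to a symbol and never parsed as an expression, names containing dashes are handled the same way as ordinary names such as "Gapdh". Writing the mapping as aes(colour = .data[[gene]]) should be an equivalent alternative in the same ggplot2 versions.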
