Error with a function to retrieve data from a database


Question

I am trying to get a FASTA file from the NCBI website, using the following function:

getncbiseq <- function(accession){
  dbs <- c()
  for (i in 1:numdbs){
    db <- dbs[i]
    choosebank(db)
    resquery <- try(query(".tmpquery", paste("AC=", accession)),silent = TRUE)
    if (!(inherits(resquery, "try-error"))){
      queryname <- "query2"
      thequery <- paste("AC=",accession,sep="")
      query(`queryname`,`thequery`)
      # see if a sequence was retrieved:
      seq <- getSequence(query2$req[[1]])
      closebank()
      return(seq)
    }
    closebank()
  }
  print(paste("ERROR: accession",accession,"was not found"))
}    

When I try to retrieve the sequence:

mydata <- getncbiseq("NC_001477")

Error in getSequence(query2$req[[1]]) : object 'query2' not found

Is there also a better way to shorten this loop?

If I use

query('queryname','the query')
#or 
query("queryname","thequery")

I get another error:

Error in query("queryname", "thequery") : invalid request:"unknown list at (^): \"(^)thequery\""

Recommended answer

I think you intended to assign the result of your call to query() to a variable called query2, but you never did. Try this:

if (!(inherits(resquery, "try-error"))) {
  queryname <- "query2"
  thequery <- paste("AC=", accession, sep="")
  query2 <- query(queryname, thequery)
  # see if a sequence was retrieved:
  seq <- getSequence(query2$req[[1]])
  closebank()
  return(seq)
}
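
As a side note on the second error in the question, here is a minimal sketch in base R (no seqinr needed) of the difference between backtick-quoted names and quoted strings. A backtick-quoted name is still a variable reference, while a quoted string is a literal, which would explain why the server received the text thequery instead of the AC= request.

```r
# Build the request string the same way the question does
thequery <- paste("AC=", "NC_001477", sep = "")

# Backticks quote a *name*, so this still evaluates the variable
from_backticks <- `thequery`

# Plain quotes create a literal string that happens to spell the variable's name
from_quotes <- "thequery"

print(from_backticks)  # the request string built above
print(from_quotes)     # just the literal text
```

So query(queryname, thequery), without any quoting, passes the intended values.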

As you mentioned, the rest of your code also has some quirks that could probably be improved.

Update:

Here is a refactor of your code that uses sapply on the dbs vector instead of an explicit for loop (which R users often frown upon):

processdbs <- function(x, y) {
    choosebank(x)
    resquery <- try(query(".tmpquery", paste("AC=", y)), silent = TRUE)
    if (!(inherits(resquery, "try-error"))) {
      queryname <- "query2"
      thequery  <- paste("AC=", y, sep="")
      query2 <- query(queryname, thequery)

      # see if a sequence was retrieved:
      seq <- getSequence(query2$req[[1]])
      closebank()
      return(seq)
    }
    closebank()
}

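getncbiseq <- function(accession) {
   dbs <- c("genbank","refseq","refseqViruses","bacterial")
   # one entry per database: the sequence if it was found there, NULL otherwise
   result <- sapply(dbs, processdbs, y=accession)
   # processdbs() has already closed each bank, so this call is probably redundant
   closebank()

   # as written, this message is printed even when a sequence was found
   print(paste("ERROR: accession",accession,"was not found"))
}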

You may have to do a little additional work to inspect the result vector and determine whether a sequence was retrieved from any of the databases.
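
One way that inspection might look is sketched below. To keep it runnable without a network connection, fake_processdbs is a hypothetical stand-in for processdbs() that "finds" the accession in only one bank, and first_hit is a helper name introduced here for illustration; neither is part of seqinr. The sketch also swaps sapply for lapply so the result is always a list, which is an assumption about what is convenient rather than a requirement.

```r
# Hypothetical stand-in for processdbs(): returns a sequence for one bank,
# and NULL for every other bank (mimicking a failed lookup).
fake_processdbs <- function(db, accession) {
  if (db == "refseqViruses" && accession == "NC_001477") {
    return(c("a", "g", "t", "t"))
  }
  NULL
}

# Return the first non-NULL element of a list, or NULL if there is none.
first_hit <- function(results) {
  hits <- Filter(Negate(is.null), results)
  if (length(hits) == 0) {
    return(NULL)
  }
  hits[[1]]
}

dbs <- c("genbank", "refseq", "refseqViruses", "bacterial")

# lapply() always returns a list, so misses stay as NULL entries
found <- first_hit(lapply(dbs, fake_processdbs, accession = "NC_001477"))
not_found <- first_hit(lapply(dbs, fake_processdbs, accession = "XX_000000"))

# Report an error only when no bank returned a sequence
if (is.null(not_found)) {
  message("ERROR: accession XX_000000 was not found")
}
```

In the real function, the same check on result would decide whether to return the sequence or print the error message.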
