How to prevent truncation of error messages in R


Question

I am querying a database in R using RJDBC. The queries are built from data read in from a file. They can get very long and may include nonexistent columns (resulting in an error).

Below is a simplified example: it takes the file as input and runs the two queries generated from it.


table     column
drinks    cost
drinks    sugar
drinks    volume
food      cost

SELECT column, cost, sugar FROM drinks;
SELECT cost FROM food;

Because these queries can get very long, any errors from the database are often truncated before the useful information. One of my current errors reads:

ERROR [2018-05-16 16:53:07] Error processing table data_baseline_biosamples for DAR-2018-00008 original error message: Error in .verify.JDBC.result(r, "Unable to retrieve JDBC result set for ", : Unable to retrieve JDBC result set for SELECT ed.studyid, {very long list of columns} ,ct.nmr_xl_vldl_pl,ct.nmr_xl_

Because the database error includes the entire query before the key information, the truncation removes exactly the information needed to solve the problem.

In this case the error message probably ends with something like this:

(line 1, Table 'data_biosamples' owned by 'littlefeltfangs' does not contain column 'sample_source'.)

How do I record the full error message sent by the database, or otherwise extract the final part of that message?

I am capturing the error in a tryCatch and passing it to a log file using futile.logger. The total error length when truncated is 8219 characters, with 8190 of those appearing to come from the database.

Answer

It is not RJDBC that is cutting off the error message.

See ?stop:

Errors will be truncated to getOption("warning.length") characters, default 1000.

So you can set the option:

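# 26 letters x 50 = 1300 characters, over the default limit of 1000
stop(paste(rep(letters, 50L), collapse = ''))
# raise the limit, then signal the same error again
options(warning.length = 2000L)
stop(paste(rep(letters, 50L), collapse = ''))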

You'll notice the truncation in the first message, but not in the second.
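If you would rather not change the option for the whole session, one possible approach is to raise it only while a particular piece of code runs. The sketch below is an illustration, not part of the original answer: `with_warning_length()` is a hypothetical helper name, and it relies only on `options()` returning the previous values and on `on.exit()`.

```r
# Hypothetical helper (not from the original answer): set `warning.length`
# for the duration of one expression, then restore the previous value.
with_warning_length <- function(n, expr) {
  old <- options(warning.length = n)  # options() returns the old value(s)
  on.exit(options(old), add = TRUE)   # restore even if expr throws an error
  force(expr)                         # evaluate expr while the option is set
}

# Usage: the option is 2000 inside the call and back to its old value after.
with_warning_length(2000L, print(getOption("warning.length")))
print(getOption("warning.length"))
```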

For my own helper functions catching errors from RJDBC, I use something like:

result = tryCatch(<some DB operation>, error = identity)

Then run regular expressions on result$message to test for various common errors and produce a friendlier error message.
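To make that pattern concrete, here is a minimal sketch. The function name `friendly_db_error()` and the single "missing column" pattern are assumptions chosen to match the error text quoted in the question; your database's wording will likely differ, and a real helper would test for several patterns.

```r
# Hypothetical helper: run a DB operation, and if it fails with a known
# kind of error, re-signal it with a shorter, friendlier message.
friendly_db_error <- function(expr) {
  result <- tryCatch(expr, error = identity)
  if (!inherits(result, "error")) return(result)  # no error: pass value through

  msg <- conditionMessage(result)
  # Assumed pattern, modelled on the message in the question
  if (grepl("does not contain column", msg, fixed = TRUE)) {
    col <- sub(".*does not contain column '([^']+)'.*", "\\1", msg)
    stop("Unknown column: ", col, call. = FALSE)
  }
  stop(result)  # unrecognised error: re-signal it unchanged
}

# Simulated failure (a stand-in for a real RJDBC call):
simulated <- paste0(
  "Unable to retrieve JDBC result set for SELECT cost FROM food ",
  "(line 1, Table 'food' owned by 'someone' does not contain column 'cost'.)"
)
tryCatch(friendly_db_error(stop(simulated)),
         error = function(e) message(conditionMessage(e)))
```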

Not mentioned in ?stop is that warning.length only accepts a fairly narrow range of values. To explore this I ran the following code:

# Try each integer 1..16000 as `warning.length`; record whether options() accepts it
can = logical(16000L)
for (ii in seq_along(can)) {
  res = tryCatch(options(warning.length = ii),
                 error = identity)
  can[ii] = !inherits(res, 'error')
}

png('~/Desktop/warning_valid.png')
plot(can, las = 1L, ylab = 'Valid option value?',
     main = 'Valid option values for `warning.length`',
     type = 's', lwd = 3L, log = 'x')
first = which.max(can)
switches = c(first, first + which.min(can[first:length(can)] - 1L))
abline(v = switches, lty = 2L, col = 'red', lwd = 2L)
axis(side = 1L, at = switches, las = 2L, cex = .5)
dev.off()

Beats me where these numbers (100 & 8172) come from; they seem fairly arbitrary (8192 is the nearest power of 2). The values are hard-coded in the R source. I've asked about this on r-devel and will update this post accordingly.
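If you only want to spot-check a single candidate value rather than scan the whole range, something like the following may be enough. `accepts_warning_length()` is a hypothetical name; the sketch restores the previous setting afterwards so the check leaves the session unchanged.

```r
# Hypothetical helper: does options() accept `n` as a value for
# `warning.length`? The previous setting is restored on exit.
accepts_warning_length <- function(n) {
  old <- getOption("warning.length")
  on.exit(options(warning.length = old), add = TRUE)
  res <- tryCatch(options(warning.length = n), error = identity)
  !inherits(res, "error")
}

accepts_warning_length(50L)    # below the lower bound of 100
accepts_warning_length(1000L)  # the documented default
```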

FWIW, in my own error-parsing helper function (built for querying PrestoDB), I have this line:

core_msg = gsub('.*(Query failed.*)\\)\\s*$', '\\1', result$message)

This is tailored to the error messages that come out of PrestoDB, so you'll have to customize it yourself, but the idea is to clip out the part of the error message that just regurgitates the query itself.
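For the message format shown in the question, an equivalent might look like the sketch below. It assumes the useful part is a final parenthesised clause at the very end of the message, which is a guess based on the one example given; `extract_tail()` is a hypothetical name. It can only help if that tail is still present in the string it is given.

```r
# Hypothetical helper: keep only the trailing "( ... )" clause of a message,
# falling back to the full message when no such clause is found.
extract_tail <- function(msg) {
  hit <- regmatches(msg, regexpr("\\([^()]*\\)[[:space:]]*$", msg))
  if (length(hit) == 0L) msg else hit
}

# Simulated message modelled on the question's example:
simulated <- paste0(
  "Unable to retrieve JDBC result set for SELECT ed.studyid, ct.x FROM t ",
  "(line 1, Table 'data_biosamples' owned by 'littlefeltfangs' ",
  "does not contain column 'sample_source'.)"
)
extract_tail(simulated)
```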

Alternatively, of course, you can split result$message into two pieces of fewer than 8172 characters each and print them out separately.
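A minimal sketch of that splitting idea follows. The chunk width of 8000 is an arbitrary choice that stays under the limit discussed above, and `split_message()` is a hypothetical name; each chunk could then be handed to your logger in turn.

```r
# Hypothetical helper: cut a long message into consecutive chunks of at
# most `width` characters.
split_message <- function(msg, width = 8000L) {
  starts <- seq(1L, max(nchar(msg), 1L), by = width)
  substring(msg, starts, starts + width - 1L)
}

# Usage with a made-up 20000-character message:
long_msg <- strrep("a", 20000L)
chunks <- split_message(long_msg)
for (i in seq_along(chunks)) {
  message(sprintf("[part %d of %d, %d characters]",
                  i, length(chunks), nchar(chunks[i])))
}
```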
