Implementation of simple polling of results file

Problem description

For one of my dissertation's data collection modules, I have implemented a simple polling mechanism. It is needed because I make each data collection request (one of many) as an SQL query submitted via a Web form, which is simulated by RCurl code. The server processes each request and generates a text file with the results at a specific URL (RESULTS_URL in the code below). Regardless of the request, the URL and file name are the same (I cannot change that). Since processing time differs between requests, and some requests may take a significant amount of time, my R code needs to "know" when the results are ready (that is, when the file has been regenerated) so that it can retrieve them. The following is my solution to this problem.

POLL_TIME <- 5 # polling timeout in seconds

In function srdaRequestData(), before making data request:

# check and save 'last modified' date and time of the results file
# before submitting data request, to compare with the same after one
# for simple polling of results file in srdaGetData() function
beforeDate <- url.exists(RESULTS_URL, .header=TRUE)["Last-Modified"]
beforeDate <<- strptime(beforeDate, "%a, %d %b %Y %X", tz="GMT")

<making data request is here>

In function srdaGetData(), which is called after srdaRequestData():

# simple polling of the results file
repeat {
  if (DEBUG) message("Waiting for results ...", appendLF = FALSE)
  afterDate <- url.exists(RESULTS_URL, .header=TRUE)["Last-Modified"]
  afterDate <-  strptime(afterDate, "%a, %d %b %Y %X", tz="GMT")
  delta <- difftime(afterDate, beforeDate, units = "secs")
  if (as.numeric(delta) != 0) { # file modified, results are ready
    if (DEBUG) message(" Ready!")
    break
  }
  else { # no results yet, wait the timeout and check again
    if (DEBUG) message(".", appendLF = FALSE)
    Sys.sleep(POLL_TIME)
  }
}

<retrieving request's results is here>

The module's main flow/sequence of events is linear, as follows:

Read/update configuration file
Authenticate with the system
Loop through data requests, specified in configuration file (via lapply()),
  where for each request perform the following:
  {
    ...
    Make request: srdaRequestData()
    ...
    Retrieve results: srdaGetData()
    ...
  }

The issue with the code above is that it doesn't seem to work as expected. Upon making a data request, the code should print "Waiting for results ..." and then, while periodically checking whether the results file has been modified (regenerated), print progress dots until the results are ready, at which point it prints a confirmation. However, the actual behavior is that the code waits a long time (I intentionally made one request long-running) without printing anything, and then apparently retrieves the results and prints both "Waiting for results ..." and " Ready!" at the same time.

It seems to me that it's some kind of synchronization issue, but I can't figure out what exactly. Or, maybe it's something else and I'm somehow missing it. Your advice and help will be much appreciated!

Recommended answer

In a comment on the question, I believe MrFlick identified the issue: the polling logic appears to be functional, but the progress messages are out of sync with what is actually happening on the system.

By default, R console output is buffered. This is by design: it speeds things up and avoids the distracting flicker that can come with frequent messages. We tend to forget this fact, particularly after using R in a very interactive fashion, running various ad-hoc statements at the console (the console buffer is automatically flushed just before the > prompt returns).

It is, however, possible to get message() output, and console output more generally, in "real time", either by explicitly flushing the console after each critical output statement with the flush.console() function, or by disabling buffering at the level of the R GUI (right-click in the console and see the "Buffered output Ctrl W" item; it is also available in the Misc menu).
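Applied to a polling loop like the one in the question, explicit flushing might look like the sketch below. It is only an illustration under stated assumptions: poll_until_changed(), get_stamp, and max_polls are hypothetical names that do not appear in the question, and get_stamp stands in for the url.exists()/strptime() lookup of the "Last-Modified" header so that the example runs without a server.

```r
# Hypothetical helper (not from the question): `get_stamp` is any zero-argument
# function returning the current "last modified" value of the results file.
poll_until_changed <- function(get_stamp, before, poll_time = 5, max_polls = 100) {
  cat("Waiting for results ...")
  flush.console()                      # show the message right away
  for (i in seq_len(max_polls)) {
    if (!identical(get_stamp(), before)) {
      cat(" Ready!\n")
      flush.console()
      return(invisible(i))             # number of checks that were needed
    }
    cat(".")                           # progress dot
    flush.console()                    # flush after every dot
    Sys.sleep(poll_time)
  }
  cat(" gave up\n")                    # assumed safeguard against waiting forever
  invisible(NA_integer_)
}

# Usage with a fake stamp that changes on the third check (illustration only)
calls <- 0
fake_stamp <- function() {
  calls <<- calls + 1
  if (calls >= 3) "new" else "old"
}
polls <- poll_until_changed(fake_stamp, before = "old", poll_time = 0)
```

Printing "Waiting for results ..." once, before the loop, keeps the line from being repeated on every check; the max_polls cap is an added assumption, not something the question asked for.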

Here's a toy example of the explicit use of flush.console(). Note the use of cat() rather than message(), as the former doesn't automatically append a newline to the output. The latter is still useful, however, because its messages can be silenced with suppressMessages() and the like. Also, as noted in the code comment, you can cat() the "\b" (backspace) character to make the numbers overwrite one another.

CountDown <- function() {
  for (i in 9:1){
    cat(i)
    # alternatively to cat(i) use:  message(i)
    flush.console()    # <<<<<<<  immediate output to console.
    Sys.sleep(1)
    cat(" ")   # also try cat("\b") instead ;-)
  }
  cat("... Blast-off\n")
}

The output is shown below. What the printout cannot show is the timing: with flush.console() in place, one number appears every second over roughly nine seconds (the loop sleeps nine times), followed by the final "Blast-off". If you remove the flush.console() call, all the output appears at once when the function returns (unless buffering is already disabled at the GUI level).


CountDown()
9 8 7 6 5 4 3 2 1 ... Blast-off
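
The earlier remark that message() output can be silenced while cat() output cannot may be easier to see in a small sketch. The function name show_progress() is hypothetical and used only for illustration.

```r
# Hypothetical demo function: one line via cat(), one via message()
show_progress <- function() {
  cat("step 1 (cat)\n")          # written to standard output
  message("step 2 (message)")    # signalled as a message condition
  invisible(NULL)
}

show_progress()                    # prints both lines
suppressMessages(show_progress())  # prints only the cat() line

# capture.output() collects standard output, so only the cat() line is kept
visible <- capture.output(suppressMessages(show_progress()))
```

This is one reason to keep message() for optional progress reporting: a caller who does not want the chatter can wrap the call in suppressMessages() without touching the function itself.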
