Writing to an appengine blob asynchronously and finalizing it when all tasks complete




I have a difficult problem.

I am iterating through a set of URLs parameterized by date and fetching them. For example, here is an example of one:

somewebservice.com?start=01-01-2012&end=01-10-2012

Sometimes, the content returned from the URL gets truncated (missing random results with a 'truncated error' message attached) because I've defined too large a range, so I have to split the query into two URLs

somewebservice.com?start=01-01-2012&end=01-05-2012

somewebservice.com?start=01-06-2012&end=01-10-2012

I do this recursively until the results aren't truncated anymore, and then I write to a blob, which allows concurrent writes.
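The split-until-it-fits logic above can be sketched in plain Java. This is a hypothetical illustration, not the actual task-queue code: `Fetcher`, `fetchRange`, and the midpoint split are stand-in names, and the truncation check is abstracted behind an interface.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.List;

// Hypothetical sketch: recursively halve a date range until each
// sub-range is small enough that the web service stops truncating.
public class RangeSplitter {

    // Stand-in for the real "was the response truncated?" check.
    interface Fetcher {
        boolean isTruncated(LocalDate start, LocalDate end);
    }

    // Recursively split [start, end] until fetches succeed, collecting
    // the leaf ranges that were actually fetched.
    static void fetchRange(Fetcher fetcher, LocalDate start, LocalDate end,
                           List<LocalDate[]> fetched) {
        if (!fetcher.isTruncated(start, end)) {
            fetched.add(new LocalDate[] { start, end });
            return;
        }
        // Too large: split at the midpoint into two non-overlapping halves.
        long days = ChronoUnit.DAYS.between(start, end);
        LocalDate mid = start.plusDays(days / 2);
        fetchRange(fetcher, start, mid, fetched);
        fetchRange(fetcher, mid.plusDays(1), end, fetched);
    }
}
```

In the real app each recursive call would be enqueued as its own task rather than executed inline, which is exactly what makes "are we done yet?" hard to answer.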

Each of these URL fetch calls/blob writes is handled in a separate task queue task.

The problem is, I can't for the life of me devise a scheme to know when all the tasks have completed. I've tried using sharded counters, but the recursion makes it difficult. Someone suggested I use the Pipeline API, so I watched the Slatkin talk 3 times. It doesn't appear to work with recursion (but I admit I still don't fully understand the lib).

Is there any way to know when a set of task queue tasks (and the children they spawn recursively) are completed, so I can finalize my blob and do whatever with it?

Thanks, John

Solution

All right, so here's what I did. I had to modify Mitch's solution just a bit, but he definitely got me in the right direction with the advice to return the future value instead of an immediate one.

I had to create an intermediate DummyJob that takes the output of the recursion:

   public static class DummyJob extends Job1<Void, List<Void>> {
      @Override
      public Value<Void> run(List<Void> dummies) {
         return null;
      }
   }

Then I submit the output of the DummyJob to the blob finalizer via waitFor:

List<FutureValue<Void>> dummies = new ArrayList<FutureValue<Void>>();
for (Interval in : ins) {
   dummies.add(futureCall(new DataFetcher(), immediate(file), immediate(in.getStart()),
         immediate(in.getEnd())));
}

FutureValue<Void> fv = futureCall(new DummyJob(), futureList(dummies));

return futureCall(new DataWriter(), immediate(file), waitFor(fv));
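The key idea Mitch suggested (return the *future* of the recursive children rather than an immediate value, so the parent only completes once the whole recursion tree has) can be illustrated outside the Pipeline API with plain `CompletableFuture`. This is an analogy under assumed names, not App Engine code:

```java
import java.util.concurrent.CompletableFuture;

// Analogy (not Pipeline API code): a recursive fetch that returns the
// combined future of its children, so the caller's completion only
// fires after the entire recursion tree has finished.
public class RecursiveFetch {

    // Returns a future that completes when this range and all
    // recursively spawned sub-ranges have been "fetched".
    static CompletableFuture<Void> fetch(int start, int end, StringBuilder log) {
        if (end - start <= 2) {                  // small enough: no truncation
            log.append("[").append(start).append(",").append(end).append("]");
            return CompletableFuture.completedFuture(null);
        }
        int mid = (start + end) / 2;             // too big: split and recurse
        return CompletableFuture.allOf(
                fetch(start, mid, log),
                fetch(mid + 1, end, log));
    }
}
```

In Pipeline terms, `allOf` plays the role of `futureList(...)` feeding the DummyJob, and the returned future is what `waitFor` blocks the DataWriter on.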

Thank you Mitch and Nick!!
