How to read the files concurrently from different folders but in a specific order within each folder?


Problem description


I want to use the Apache Camel file component.

I have a requirement where there are multiple files in different folders. I want to read the files under those directories in order (based on timestamp).

Note: All these directories will be the subdirectories of the root.

Eg:

root
    /dir1 - file1, file2, file3
    /dir2 - file4, file5, file6

What I need here is,

One thread should read all files in dir1 in timestamp order, and another thread should read the files in dir2.

What I am doing now is,

from("file:/root/?recursive=true&sortBy=file:modified").threads(10).to("another component");

But this does not work the way I want. It assigns different threads to different files, so the processing order is not preserved.

Please let me know how to achieve my requirement.

Solution

You need to use a dynamic router to route to different consumers based on the directory the file is in. For example, suppose you want three processors running in parallel:

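// recursive=true (as in the question) is needed so that files in the subdirectories are picked up
from("file:/root?recursive=true&sortBy=file:modified")
    .to("myDynamicRouter");   // placeholder for the dynamic router step

// one SEDA queue per parallel processor
from("seda:myQueue0")
    .to("myProcessor");       // placeholder for your processing endpoint

from("seda:myQueue1")
    .to("myProcessor");

from("seda:myQueue2")
    .to("myProcessor");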

MyDynamicRouter is a router that returns the name of a SEDA queue based on the directory of the file (see http://camel.apache.org/dynamic-router.html).

For example:

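// Returns the SEDA queue for a file; all files from one directory map to the same queue.
public String process(File file) {
    // Math.floorMod avoids a negative index when hashCode() is negative;
    // the queue name must match the "seda:myQueueN" endpoints exactly (case-sensitive).
    String queueName = "seda:myQueue" + Math.floorMod(file.getParent().hashCode(), 3);
    return queueName;
}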
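The queue-selection idea can be tried outside Camel. The following is only a minimal sketch: `QueueSelector` and `queueFor` are hypothetical names, and the queue count of 3 is assumed from the three `seda:myQueueN` routes. It also guards against a detail that is easy to miss: in Java, `%` can return a negative value for a negative `hashCode()`, which would produce a queue name that no route consumes from.

```java
import java.io.File;

/** Hypothetical helper (not part of Camel): maps a file's parent directory to a queue URI. */
final class QueueSelector {
    // Assumed to match the three routes seda:myQueue0 .. seda:myQueue2
    private static final int QUEUE_COUNT = 3;

    static String queueFor(File file) {
        String parent = file.getParent();            // null when the path has no parent
        int hash = (parent == null) ? 0 : parent.hashCode();
        // floorMod keeps the index in [0, QUEUE_COUNT) even for negative hash codes
        int index = Math.floorMod(hash, QUEUE_COUNT);
        return "seda:myQueue" + index;
    }

    public static void main(String[] args) {
        // Files from the same directory always print the same queue name
        System.out.println(queueFor(new File("/root/dir1/file1")));
        System.out.println(queueFor(new File("/root/dir1/file2")));
        System.out.println(queueFor(new File("/root/dir2/file4")));
    }
}
```

Note that two different directories can hash to the same queue, so with three queues and many directories some directories will share a consumer.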

So all the files will be read from the different directories in date order. When they are put on the various SEDA queues they remain in date order, because each queue is first-in, first-out and by default has a single consumer. As all files from the same directory are put on the same queue, they will be processed in date order.
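To see why this preserves per-directory order, here is a plain-JDK analogy rather than Camel code: each single-threaded executor stands in for one SEDA queue with one consumer. The class name `OrderedLanes`, the lane count, and the assumption that paths use `/` as separator are all illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical analogy: one single-threaded lane per queue, chosen by parent directory. */
final class OrderedLanes {
    /** Takes paths already sorted by timestamp and returns them in the order they were processed. */
    static List<String> run(List<String> sortedPaths, int laneCount) {
        List<ExecutorService> lanes = new ArrayList<>();
        for (int i = 0; i < laneCount; i++) {
            lanes.add(Executors.newSingleThreadExecutor()); // one worker, FIFO task queue
        }
        List<String> processed = Collections.synchronizedList(new ArrayList<>());
        for (String path : sortedPaths) {
            String dir = path.substring(0, path.lastIndexOf('/'));
            int lane = Math.floorMod(dir.hashCode(), laneCount);
            // Same directory -> same lane, so its files run one after another in submit order
            lanes.get(lane).execute(() -> processed.add(path));
        }
        for (ExecutorService lane : lanes) {
            lane.shutdown();
            try {
                if (!lane.awaitTermination(5, TimeUnit.SECONDS)) {
                    throw new IllegalStateException("lane did not finish in time");
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "/root/dir1/file1", "/root/dir2/file4", "/root/dir1/file2",
                "/root/dir2/file5", "/root/dir1/file3", "/root/dir2/file6");
        // Overall order may vary between runs; order within each directory does not
        System.out.println(run(input, 3));
    }
}
```

The overall output order depends on thread scheduling, but filtering the result by directory always yields that directory's files in their original order.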

One thing to watch is that files from different directories may be interleaved in the same processor. Avoiding that would take a lot of work and some blocking, so it is simply something you will have to allow for when coding your processor.
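One way to allow for interleaving is to keep any running state keyed by directory instead of assuming that consecutive files belong together. The sketch below is an illustration only: `PerDirectoryCounter` and `accept` are hypothetical names, paths are assumed to use `/`, and the method is synchronized on the assumption that one processor instance is shared by several consumers.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical processor state: tracks progress separately for each directory. */
final class PerDirectoryCounter {
    private final Map<String, Integer> seenPerDirectory = new HashMap<>();

    /** Returns the 1-based position of this file within its own directory's sequence. */
    synchronized int accept(String path) {
        String dir = path.substring(0, path.lastIndexOf('/'));
        // merge() inserts 1 for a new directory, otherwise adds 1 to the existing count
        return seenPerDirectory.merge(dir, 1, Integer::sum);
    }

    public static void main(String[] args) {
        PerDirectoryCounter counter = new PerDirectoryCounter();
        // Interleaved arrivals from two directories
        System.out.println(counter.accept("/root/dir1/file1")); // 1
        System.out.println(counter.accept("/root/dir2/file4")); // 1
        System.out.println(counter.accept("/root/dir1/file2")); // 2
    }
}
```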
