How to ETL multiple files using Scriptella?


Question

I have multiple log files, 1.csv, 2.csv and 3.csv, generated by a log report. I want to read these files and parse them concurrently using Scriptella.

Recommended answer

Scriptella does not provide parallel job execution out of the box. Instead, use a job scheduler provided by the operating system or by your programming environment (for example, run multiple ETL files by submitting jobs to an ExecutorService).

Here is a working example of an ETL file that imports a single file, whose name is passed in as the input property:

ETL file (parallel.csv.etl.xml):

<!DOCTYPE etl SYSTEM "http://scriptella.javaforge.com/dtd/etl.dtd">
<etl>
    <connection id="in" driver="csv" url="$input"/>
    <connection id="out" driver="text"/>
    <query connection-id="in">
        <script connection-id="out">
            Importing: $1, $2
        </script>
    </query>
</etl>

Java code to run the files in parallel:

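import java.io.File;
import java.net.MalformedURLException;
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import scriptella.execution.EtlExecutor;
import scriptella.execution.EtlExecutorException;
import scriptella.execution.ExecutionStatistics;

//Imports 3 csv files in parallel using a fixed thread pool
public class ParallelCsvTest {
    public static void main(String[] args) throws EtlExecutorException, MalformedURLException, InterruptedException {
        final ExecutorService service = Executors.newFixedThreadPool(3);
        for (int i=1;i<=3;i++) {
            //Pass a name as a parameter to ETL file, e.g. input<i>.csv
            final Map<String,?> map = Collections.singletonMap("input", "input"+i+".csv");
            EtlExecutor executor = EtlExecutor.newExecutor(new File("parallel.csv.etl.xml").toURI().toURL(), map);
            //The cast selects submit(Callable), since EtlExecutor is also a Runnable
            service.submit((Callable<ExecutionStatistics>)executor);
        }
        //Stop accepting new jobs, then wait up to 10 seconds for the submitted ones to finish
        service.shutdown();
        service.awaitTermination(10, TimeUnit.SECONDS);
    }
}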
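
The example above discards the Future returned by submit, so a failed import would go unnoticed. A minimal sketch of one way to wait for every job and surface failures is shown here. ParallelJobs and runAll are hypothetical names introduced for illustration, not part of Scriptella, and the helper uses only java.util.concurrent:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper (not a Scriptella class): runs jobs in a fixed thread pool
// and returns their results in the same order as the input list.
final class ParallelJobs {
    private ParallelJobs() {}

    static <T> List<T> runAll(List<? extends Callable<T>> jobs, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService service = Executors.newFixedThreadPool(threads);
        try {
            // invokeAll blocks until every job has completed, normally or with an exception
            List<Future<T>> futures = service.invokeAll(jobs);
            List<T> results = new ArrayList<>(futures.size());
            for (Future<T> future : futures) {
                // get() rethrows a job's failure wrapped in an ExecutionException
                results.add(future.get());
            }
            return results;
        } finally {
            // Always release the pool threads, even when a job failed
            service.shutdown();
        }
    }
}
```

Since the example above casts EtlExecutor to Callable<ExecutionStatistics>, the three executors could presumably be collected into a List<Callable<ExecutionStatistics>> and passed to ParallelJobs.runAll(executors, 3). Unlike awaitTermination with a 10 second limit, this waits for as long as the jobs take.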

To run this example, create three CSV files, input1.csv, input2.csv and input3.csv, and put them in the current working directory. Example of a CSV file:

Level, Message
INFO,Process 1 started
INFO,Process 1 stopped   
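
If creating the input files by hand is inconvenient, they could be generated with a few lines of Java. This is only a sketch: SampleCsvFiles and create are hypothetical names, the header is written without the space after the comma, and varying the process number per file is an assumption made so the three files differ.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: writes input1.csv .. input<count>.csv into the given directory.
final class SampleCsvFiles {
    private SampleCsvFiles() {}

    static void create(Path dir, int count) throws IOException {
        for (int i = 1; i <= count; i++) {
            // Same shape as the sample above: a header row followed by two log rows
            List<String> lines = Arrays.asList(
                    "Level,Message",
                    "INFO,Process " + i + " started",
                    "INFO,Process " + i + " stopped");
            // Creates the file, or overwrites it if it already exists
            Files.write(dir.resolve("input" + i + ".csv"), lines, StandardCharsets.UTF_8);
        }
    }
}
```

Calling SampleCsvFiles.create(java.nio.file.Paths.get("."), 3) would place the three files in the current working directory, where the ETL example expects them.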
