ETL Tools that function well with ArangoDB - What are they?


Question

There are many ETL tools out there, but not many of them are free, and the free ones don't appear to have any knowledge of or support for ArangoDB. If anyone has migrated their data to ArangoDB and automated the process, I would love to hear how you accomplished it. Below I have listed several of the ETL tools we could choose from. I took this list from the 2016 Spark Europe presentation by Bas Geerdink.

* IBM InfoSphere DataStage
* Oracle Warehouse Builder
* Pervasive Data Integrator
* PowerCenter Informatica
* SAS Data Management
* Talend Open Studio
* SAP Data Services
* Microsoft SSIS
* Syncsort DMX
* CloverETL
* Jaspersoft
* Pentaho
* NiFi

Recommended answer

I was able to use Apache NiFi to accomplish this. Below is an extremely basic overview of what I did to get data out of a source database and into ArangoDB.

With NiFi you can extract data from many of the standard databases. Java (JDBC) drivers already exist for databases such as MySQL, SQLite, and Oracle.

I was able to pull data out of a source database using two processors:

* QueryDatabaseTable
* ExecuteSQL

The output of these processors is in Avro format, which I then converted to JSON using the ConvertAvroToJSON processor. This turns the output into a JSON list.
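To make the shape of that output concrete, here is a minimal Python sketch of the same extract-and-convert step done outside NiFi. It is only an illustration: the in-memory SQLite database, the `cities` table and its columns, and the helper name `rows_to_json_list` are all assumptions, not part of my flow.

```python
import json
import sqlite3


def rows_to_json_list(conn: sqlite3.Connection, query: str) -> str:
    """Run a query and serialize the result set as a JSON list of objects."""
    conn.row_factory = sqlite3.Row  # lets each row be read by column name
    cursor = conn.execute(query)
    return json.dumps([dict(row) for row in cursor])


# Hypothetical source table standing in for the real source database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cities (name TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO cities VALUES (?, ?)",
    [("Amsterdam", "NL"), ("Cologne", "DE")],
)

payload = rows_to_json_list(conn, "SELECT name, country FROM cities ORDER BY name")
print(payload)
```

Each row becomes one JSON object, and the whole result set is a single JSON array, which is the form the later import step expects.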

While there really isn't anything within NiFi built specifically for ArangoDB, there is one feature that comes built in with ArangoDB itself, and that is its HTTP API.

I was able to bulk insert data into an ArangoDB collection named Cities using NiFi's InvokeHTTP processor with the POST method.

The value I used as the Remote URL:

http://localhost:8529/_api/import?collection=cities&type=list&details=true
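For anyone who wants to try the same request without NiFi, the following is a rough Python sketch of what InvokeHTTP sends, using only the standard library. The function names and the sample documents are hypothetical, and it assumes a local server that accepts unauthenticated requests; if authentication is enabled you would need to add an Authorization header.

```python
import json
import urllib.parse
import urllib.request

ARANGO_BASE = "http://localhost:8529"  # same host and port as the URL above


def build_import_request(collection: str, documents: list) -> urllib.request.Request:
    """Build a POST to /_api/import that sends the documents as one JSON list."""
    query = urllib.parse.urlencode(
        {"collection": collection, "type": "list", "details": "true"}
    )
    return urllib.request.Request(
        url=f"{ARANGO_BASE}/_api/import?{query}",
        data=json.dumps(documents).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def bulk_import(collection: str, documents: list) -> dict:
    """Send the request and return the server's JSON summary of the import."""
    # Assumption: no authentication is required by the target server.
    with urllib.request.urlopen(build_import_request(collection, documents)) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Hypothetical documents; building the request does not contact the server.
request = build_import_request("cities", [{"name": "Amsterdam"}, {"name": "Cologne"}])
print(request.get_method(), request.full_url)
```

Calling `bulk_import("cities", docs)` against a running instance should return a summary of how many documents were created and how many failed. Note that the collection name in the URL has to match the collection's actual name exactly, including case.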

Below is a screenshot of NiFi. I definitely could have used something like this to kick-start my own research, so I hope it helps someone else. Ignore some of the extra processors: I have them in there for testing purposes, and I was experimenting with JOLT to see whether I could use it to "Transform" my JSON (the "T" in ETL).
