Pentaho DI - JSON Nested File Output

Question

I need to fetch records from multiple tables. The primary table has a one-to-many relationship with the other tables.

My data source is an Oracle database containing two tables: one called Student and the other called Subjects.

For example, I have a Student table where "Student_Id" is the primary key, along with other columns such as firstname, lastName, etc. Each student is registered for multiple subjects, so student_id is a foreign key in the Subjects table. The Subjects table holds the subject name, status, teacher name, etc.; that is, one student can have many subjects. The Student table also holds each student's phone numbers: home phone, cell phone, and a parent's contact number. These three numbers should appear as a single object under the student node, as shown below.

So the requirement is to output every student from the Student table, with that student's subjects as an array and that student's phone numbers as an object. The output should be in JSON format.

The target structure is given below. Please let me know how to achieve this with the Pentaho Data Integration tool. I am very new to this technology.

{
  "data": [
    {
      "Student_ID": "1",
      "FirstName": "fname1",
      "LastName": "lname1",
      "subjects": [
        {
          "Name": "Physics",
          "Status": "Active",
          "Teacher": "Teacher1"
        },
        {
          "Name": "History",
          "Status": "InActive",
          "Teacher": "Teacher2"
        }
      ],
      "Phone": {
        "Home": "123456",
        "Cell": "3456790"
      }
    },
    {
      "Student_ID": "2",
      "FirstName": "fname2",
      "LastName": "lname2",
      "subjects": [
        {
          "Name": "Geography",
          "Status": "Active",
          "Teacher": "Teacher1"
        },
        {
          "Name": "English",
          "Status": "InActive",
          "Teacher": "Teacher2"
        }
      ],
      "Phone": {
        "Home": "123456",
        "Cell": "3456790"
      }
    }
  ]
}

Recommended answer

In Pentaho DI, the JSON Output step doesn't support nested datasets. To produce a nested JSON structure, you need to use a Javascript step to build the nested structure and then pass the result on to the output.

The general flow is: Input -> Group By -> Modified Javascript (JSON.stringify) -> Text file output (saved with a .js extension and no header)

[Screenshot: sample transformation flow]

Note: This is not an exact solution, but it should give a clearer idea of the steps and the flow.
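To make the Group By stage of the flow more concrete, here is a minimal plain-JavaScript sketch (run outside Pentaho, for example in Node.js) of what that stage is assumed to do: collapse the flat parent/child rows into one row per parent, with the child values concatenated using a separator. The sample rows, the field names, and the `groupConcat` helper are all hypothetical and only illustrate the shape of the data that reaches the Javascript step.

```javascript
// Hypothetical flat rows, as a joined student/subject query might return them.
const rows = [
  { student_id: "1", first_name: "fname1", subject_name: "Physics" },
  { student_id: "1", first_name: "fname1", subject_name: "History" },
  { student_id: "2", first_name: "fname2", subject_name: "Geography" },
];

// Collapse rows that share keyField into one row per key,
// joining each field listed in childFields with the separator sep.
function groupConcat(rows, keyField, childFields, sep) {
  const groups = new Map();
  for (const row of rows) {
    const key = row[keyField];
    if (!groups.has(key)) {
      // First row for this key: copy it as the start of the group.
      groups.set(key, { ...row });
    } else {
      // Later rows: append their child values to the existing group.
      const group = groups.get(key);
      for (const field of childFields) {
        group[field] = group[field] + sep + row[field];
      }
    }
  }
  return Array.from(groups.values());
}

const grouped = groupConcat(rows, "student_id", ["subject_name"], ";");
console.log(grouped);
```

Each grouped row then carries strings such as "Physics;History", which is why the Javascript step further down the flow has to split them again.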

The Modified JS step below builds the nested structure. Note that it is based on sakila, a default sample database for MySQL. I am using the actor -> film (movie) relation, which is a similar dataset to your student -> subject relation.

// Initialization -> change this according to your data set
var json = {};
json.movie = {};
var Child_Accounts = [];

// Non-nested fields -> in your scenario these would be the student ID and names
json.movie.actor_id   = actor_id;
json.movie.first_name = first_name;
json.movie.last_name  = last_name;

// Split the child values (assumed to be concatenated with ';' by the Group By step)
var split_film_id = film_id.split(';');
var split_title   = title.split(';');
var split_descr   = description.split(';');

// Loop through the split data and build the child structure
for (var i = 0; i < split_film_id.length; i++) {
    var childCol = {};
    childCol.film_id     = split_film_id[i];
    childCol.title       = split_title[i];
    childCol.description = split_descr[i];
    Child_Accounts.push(childCol);
}
json.movie.films = Child_Accounts;

// Serialize the data object to a JSON string
var JsonOutput = JSON.stringify(json);
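The same idea can be adapted to the student/subject structure from the question. The following is only a sketch under stated assumptions: every field name on `row` (`subject_name`, `subject_status`, `teacher_name`, `home_phone`, `cell_phone`, and so on) and the `buildStudent` helper are hypothetical, and the ';' separator is assumed to match whatever the Group By step uses. It is written in modern JavaScript so it can be tried in Node.js; inside the Pentaho Javascript step it may be safer to stick to `var` and plain `for` loops, as in the answer's code.

```javascript
// Build one nested student object from a single grouped row.
// All field names on `row` are hypothetical; rename them to match your stream.
function buildStudent(row, sep) {
  const names = row.subject_name.split(sep);
  const statuses = row.subject_status.split(sep);
  const teachers = row.teacher_name.split(sep);

  // Pair the i-th name, status and teacher into one subject object.
  const subjects = names.map((name, i) => ({
    Name: name,
    Status: statuses[i],
    Teacher: teachers[i],
  }));

  return {
    Student_ID: row.student_id,
    FirstName: row.first_name,
    LastName: row.last_name,
    subjects: subjects,
    // The phone columns live on the parent row, so no splitting is needed.
    Phone: { Home: row.home_phone, Cell: row.cell_phone },
  };
}

// Hypothetical grouped row, shaped like the first student in the question.
const groupedRow = {
  student_id: "1",
  first_name: "fname1",
  last_name: "lname1",
  subject_name: "Physics;History",
  subject_status: "Active;InActive",
  teacher_name: "Teacher1;Teacher2",
  home_phone: "123456",
  cell_phone: "3456790",
};

const student = buildStudent(groupedRow, ";");

// Wrap the students in a top-level "data" array, as in the requested layout.
const output = JSON.stringify({ data: [student] }, null, 2);
console.log(output);
```

One design caveat: splitting by position only works if every concatenated column has the same number of entries in the same order, and if no value contains the separator itself. If that cannot be guaranteed for your data, choosing a separator that never occurs in the values is a reasonable precaution.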

The rest of the steps are straightforward.

[Screenshot: sample output]

Hope this helps :)

The gist has been uploaded here.
