JSON parser for Talend

Question

I need some help devising a strategy to parse JSON docs within a Talend job (Java job, not Perl). I am using Talend version 5.0.2, developing on a Mac and planning to run on a Linux box.

Unfortunately, I cannot use the tFileInputJSON component because of the format of my files -- each file contains several hundred JSON docs, with a complete JSON doc taking up one line in the file. I think the right solution is to read the file line by line then pass it into a JSON parser and from there send the results to the rest of the job.

As I see it my options are:

a) send the line input to some sort of Java JSON parser. If that's the strategy I need to take, I'd like some advice on how to deal with the output and getting

b) find a Talend component that parses JSON docs, but within a flow as opposed to on a single file in valid JSON format.

I've searched around for this component but can't seem to find it. From my search, it seems even the tFileInputJSON component is relatively new.

I know for certain that this is something Java can do easily. My problem is getting the whole process in sync within the Talend framework.

Anyone have some advice on where I should turn next?

Thanks in advance.

Recommended answer

Have you tried creating a custom routine? You can do so under Code (in the repository window on the left): right-click Routines and create your custom routine. This lets you write a Java function which can then be called from somewhere in your job (tMap, tJava, whatever). You could read your input file and call a function on each line/element that does whatever you want.

Like any Java function, the routine can then write to file, print to screen or return some list object that you can further work on in another tJava, tJavaFlex, tJavaRow or whatever Talend components in your job.
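As a rough sketch of what such a routine might look like: the class name `JsonLineRoutine`, the `LineParser` interface and the `parseLines` method below are all made up for illustration and are not part of Talend. The actual JSON parsing is left as a pluggable step, since which Java JSON library you use is up to you; the sketch only shows the "read line by line, parse each line as one doc, return a list" part, and avoids newer Java syntax on the assumption that an older Talend release may target an older Java version.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical routine; every name here is illustrative, not a Talend API.
class JsonLineRoutine {

    // Stand-in for whichever JSON library you pick:
    // turns one line (one complete JSON doc) into one parsed value.
    interface LineParser<T> {
        T parse(String jsonLine);
    }

    // Reads the source line by line and parses each non-blank line as one doc.
    static <T> List<T> parseLines(Reader source, LineParser<T> parser) throws IOException {
        List<T> results = new ArrayList<T>();
        BufferedReader reader = new BufferedReader(source);
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.length() == 0) {
                    continue; // skip blank lines between documents
                }
                results.add(parser.parse(trimmed));
            }
        } finally {
            reader.close(); // also closes the underlying source
        }
        return results;
    }
}
```

You would pass in a `FileReader` for your input file plus a `LineParser` that wraps your JSON library of choice, then loop over the returned list in a later component. If the files are large, it may be better to have the routine parse a single line and call it once per row instead of building the whole list in memory.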

It may feel a little hacky, but you can do a lot just using custom routines.

If you want to go all the way and create your own component, this may be a good way to start:

http://www.talendforge.org/forum/viewtopic.php?id=17650

Of course, creating components is much more time-consuming, but may be useful if you think you'll be reusing this code in multiple projects/cases.
