Using the storm-hdfs connector to write data into HDFS


Question

This is about the source code of the storm-hdfs connector, which can be used to write data into HDFS.
The GitHub URL is: https://github.com/ptgoetz/storm-hdfs
It includes a particular topology, HdfsFileTopology, that writes '|'-delimited data into HDFS.
Link: https://github.com/ptgoetz/storm-hdfs/blob/master/src/test/java/org/apache/storm/hdfs/bolt/HdfsFileTopology.java

I have a question about this part of the code:

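// Load the YAML file given as the second argument into a map.
Yaml yaml = new Yaml();
InputStream in = new FileInputStream(args[1]);
Map<String, Object> yamlConf = (Map<String, Object>) yaml.load(in);
in.close();
// Store that map in the topology config under the key "hdfs.config".
config.put("hdfs.config", yamlConf);

// Point the bolt at the same config key and at the file system URL (first argument).
HdfsBolt bolt = new HdfsBolt()
        .withConfigKey("hdfs.config")
        .withFsUrl(args[0])
        .withFileNameFormat(fileNameFormat)
        .withRecordFormat(format)
        .withRotationPolicy(rotationPolicy)
        .withSyncPolicy(syncPolicy)
        // After each file rotation, move the finished file to /dest2/.
        .addRotationAction(new MoveFileAction().toDestination("/dest2/"));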

What does this part of the code do, especially the YAML part?

Recommended answer

I think the code is quite clear. For HdfsBolt to be able to write into HDFS, it needs information about HDFS itself, and that is what you supply when you create that YAML file.

To run the topology, you provide the path of that YAML file as a command-line argument.


Usage: HdfsFileTopology [topology name] [yaml config file]

The author of the library gives a good description here: Storm-HDFS Usage.

If you read the source code, you will find that the contents of the YAML file are used to configure HDFS. They are probably something like the HDFS Defaults, but I can't be sure.
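
To make the role of the config key more concrete, here is a minimal sketch of the pattern as I understand it: the topology stores the YAML map under "hdfs.config", and the bolt later looks that key up and copies each entry into its HDFS settings. It uses plain Java maps as stand-ins for the Storm config and the Hadoop configuration. The class name, the method names, and the dfs.replication entry are assumptions made for illustration, not the library's actual code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HdfsConfigKeySketch {

    // Stand-in for the map that yaml.load(in) would return.
    // The setting below is an illustrative assumption, not a required key.
    static Map<String, Object> loadYamlStandIn() {
        Map<String, Object> yamlConf = new LinkedHashMap<>();
        yamlConf.put("dfs.replication", 2);
        return yamlConf;
    }

    // Stand-in for the bolt side: look up the nested map by its config key
    // and copy every entry, as a string, into the HDFS-side settings.
    @SuppressWarnings("unchecked")
    static Map<String, String> applyHdfsOverrides(Map<String, Object> topologyConf,
                                                  String configKey) {
        Map<String, String> hdfsSettings = new LinkedHashMap<>();
        Object nested = topologyConf.get(configKey);
        if (nested instanceof Map) {
            for (Map.Entry<String, Object> e : ((Map<String, Object>) nested).entrySet()) {
                hdfsSettings.put(e.getKey(), String.valueOf(e.getValue()));
            }
        }
        return hdfsSettings;
    }

    public static void main(String[] args) {
        Map<String, Object> topologyConf = new LinkedHashMap<>();
        // Mirrors config.put("hdfs.config", yamlConf) in the question.
        topologyConf.put("hdfs.config", loadYamlStandIn());
        // Mirrors withConfigKey("hdfs.config"): the same key is used for the lookup.
        Map<String, String> settings = applyHdfsOverrides(topologyConf, "hdfs.config");
        System.out.println(settings);
    }
}
```

If the key passed to the lookup does not match the key used when storing the map, the sketch simply returns no overrides, which is why the two strings in the question's code are identical.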

It is probably best to ask the author of the library.
