Using the storm hdfs connector to write data into HDFS


Problem Description

I am looking at the source code of the "storm-hdfs connector", which can be used to write data into HDFS. The GitHub URL is: https://github.com/ptgoetz/storm-hdfs. It contains a particular topology, "HdfsFileTopology", used to write '|'-delimited data into HDFS. Link: https://github.com/ptgoetz/storm-hdfs/blob/master/src/test/java/org/apache/storm/hdfs/bolt/HdfsFileTopology.java

I have a question about this part of the code:

Yaml yaml = new Yaml();
InputStream in = new FileInputStream(args[1]);
Map<String, Object> yamlConf = (Map<String, Object>) yaml.load(in);
in.close();
config.put("hdfs.config", yamlConf);

HdfsBolt bolt = new HdfsBolt()
        .withConfigKey("hdfs.config")
        .withFsUrl(args[0])
        .withFileNameFormat(fileNameFormat)
        .withRecordFormat(format)
        .withRotationPolicy(rotationPolicy)
        .withSyncPolicy(syncPolicy)
        .addRotationAction(new MoveFileAction().toDestination("/dest2/"));

What does this part of the code do, especially the YAML part?

Recommended Answer

I think the code is quite clear. In order for HdfsBolt to be able to write into HDFS, it needs information about HDFS itself, and that is what you provide when you create that YAML file.
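To make that concrete, here is a rough sketch of the pattern (not the library's actual prepare() code): whatever map sits in the topology config under the key passed to withConfigKey() ("hdfs.config" here) ends up copied into the Hadoop Configuration the bolt uses to reach HDFS.

import java.util.Map;
import org.apache.hadoop.conf.Configuration;

// Sketch only: illustrates how a bolt could turn the "hdfs.config" map
// from the Storm topology config into Hadoop client settings.
public class HdfsConfigSketch {
    static Configuration toHadoopConf(Map<String, Object> stormConf, String configKey) {
        Configuration hdfsConfig = new Configuration();
        Object section = stormConf.get(configKey);
        if (section instanceof Map) {
            for (Map.Entry<?, ?> e : ((Map<?, ?>) section).entrySet()) {
                // each YAML key/value becomes an HDFS client property
                hdfsConfig.set(String.valueOf(e.getKey()), String.valueOf(e.getValue()));
            }
        }
        return hdfsConfig;
    }
}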

And to run that topology, you provide the path of that YAML file as a command line argument.

Usage: HdfsFileTopology [topology name] [yaml config file]

The author of the library made a good description here: Storm-HDFS Usage.

If you read the source code, you will find that the contents of the YAML file are used to configure HDFS. Probably it is something like the HDFS Defaults, but I can't be sure.
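If it helps, here is a hypothetical example of what such a YAML file might contain and how the snippet above parses it. The property names are illustrative guesses (ordinary HDFS/storm-hdfs client settings), not keys the topology requires verbatim.

import java.util.Map;
import org.yaml.snakeyaml.Yaml;

// Sketch only: hypothetical YAML contents for the file passed as args[1].
public class YamlConfigSketch {
    public static void main(String[] args) {
        String yamlText =
                "hdfs.keytab.file: \"/path/to/storm.keytab\"\n" +
                "hdfs.kerberos.principal: \"storm@EXAMPLE.COM\"\n" +
                "dfs.replication: 1\n";

        // Same parsing as in HdfsFileTopology: the whole file becomes one Map,
        // which the topology then stores under config key "hdfs.config".
        @SuppressWarnings("unchecked")
        Map<String, Object> yamlConf = (Map<String, Object>) new Yaml().load(yamlText);
        yamlConf.forEach((k, v) -> System.out.println(k + " = " + v));
    }
}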

It is probably better to ask the author of the library.
