Simple hello world example for Flink


Question

I am looking for the simplest possible example of a hello-world experience with Apache Flink.

Assume I have just installed Flink on a clean box; what is the bare minimum I would need to do to 'make it do something'? I realize this is quite vague, so here are some examples.

Three Python examples from the terminal:

python -c "print('hello world')"
python hello_world.py
python -c "print(1+1)"

Of course a streaming application is a bit more complicated, but here is something similar that I did earlier for Spark Streaming:

https://spark.apache.org/docs/latest/streaming-programming-guide.html#a-quick-example

As you can see, these examples have some nice properties:

  1. They are small
  2. They have minimal dependencies on other tools/resources
  3. The logic is easy to adjust (e.g., a different number or a different separator)

So my question:

What I have found so far are examples of about 50 lines of code that you need to compile.

If this cannot be avoided due to point 3, then something that satisfies points 1 and 2 and uses (only) jars that are shipped by default, or are easily available from a reputable source, would also be fine.
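(For reference, the Flink binary distribution itself ships ready-to-run example jars of exactly this kind; assuming a local cluster has been started with the bundled scripts, one of them can be submitted directly. The exact path of the example jar may vary between Flink versions.)

./bin/start-cluster.sh
./bin/flink run examples/streaming/WordCount.jar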

Answer

In most big data and related frameworks, the Word Count program serves as the Hello World example. Counting words from an in-memory collection:

import java.util.Arrays;

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

// split(...) takes a regex, so the dot must be escaped to split on ". "
DataSet<String> text = env.fromCollection(
    Arrays.asList("This is line one. This is my line number 2. Third line is here".split("\\. ")));

DataSet<Tuple2<String, Integer>> wordCounts = text
    .flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
      @Override
      public void flatMap(String line, Collector<Tuple2<String, Integer>> out) throws Exception {
        // emit (word, 1) for every whitespace-separated token
        for (String word : line.split(" ")) {
          out.collect(new Tuple2<>(word, 1));
        }
      }
    })
    .groupBy(0)  // group by the word (tuple field 0)
    .sum(1);     // sum the counts (tuple field 1)

// print() collects the result and triggers execution of the job
wordCounts.print();

Reading from a file:

// Same imports as in the previous example.
final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
env.setParallelism(1);

// The path of the file, as a URI
// (e.g., "file:///some/local/file" or "hdfs://host:port/file/path").
DataSet<String> text = env.readTextFile("/path/to/file");

DataSet<Tuple2<String, Integer>> wordCounts = text
    .flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
      @Override
      public void flatMap(String line, Collector<Tuple2<String, Integer>> out) throws Exception {
        for (String word : line.split(" ")) {
          out.collect(new Tuple2<>(word, 1));
        }
      }
    })
    .groupBy(0)
    .sum(1);

wordCounts.print();

Do not handle the exception thrown by wordCounts.print() with a try/catch; instead add `throws Exception` to the method signature.
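For example, a minimal main method could look like the sketch below (the class name is illustrative and not part of the original answer):

public class WordCountJob {
  public static void main(String[] args) throws Exception {
    // Declaring "throws Exception" here lets wordCounts.print() propagate
    // checked exceptions without wrapping the pipeline in try/catch.
    // ... build the DataSet pipeline shown above and call wordCounts.print() ...
  }
}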

Add the following dependency to the pom.xml:

<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-java</artifactId>
    <version>1.8.0</version>
</dependency>
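Once the dependency is in place, one way to try the job is to package it with Maven and submit it to a local cluster. This is only a sketch, not part of the original answer; the artifact name and main class below are placeholders that depend on your own project setup:

mvn clean package
./bin/start-cluster.sh
./bin/flink run -c your.package.WordCountJob target/your-artifact.jar

Alternatively, the program can usually be run straight from the IDE as a local job, which typically also requires the flink-clients dependency on the classpath.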

Read about flatMap, groupBy, sum, and other Flink operations here: https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/

Flink streaming documentation and examples: https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/datastream_api.html

