Flink的简单Hello World示例 [英] Simple hello world example for Flink
问题描述
我正在寻找使用Apache flink进行打招呼的最简单示例.
I am looking for the simplest possible example of an hello-world experience with Apache flink.
假设我刚刚将flink安装在一个干净的盒子上,那么我要做的最低工作是什么?我意识到这很模糊,下面是一些示例.
Assume I have just installed flink on a clean box, what is the bare minimum I would need to do to 'make it do something'. I realize this is quite vague, here are some examples.
来自终端的三个python示例:
Three python examples from the terminal:
python -c "print('hello world')"
python hello_world.py
python python -c "print(1+1)"
当然,流应用程序要复杂一些,但是这与我之前在火花流处理中所做的类似:
Of course a streaming application is a bit more complicated, but here is something similar that I did for spark streaming earlier:
https://spark.apache .org/docs/latest/streaming-programming-guide.html#a-quick-example
如您所见,这些示例具有一些不错的属性:
As you see these examples have some nice properties:
- 它们很小
- 对其他工具/资源的依赖程度最低
- 可以轻松调整逻辑(例如,不同的数字或不同的分隔符)
所以我的问题:
到目前为止,我发现的示例包含需要编译的50行代码.
What I found so far are examples with 50 lines of code that you need to compile.
如果由于第3点而无法避免这种情况,那么可以满足第1点和第2点的要求,并且使用(仅)默认发货的罐子,或者可以从信誉良好的来源容易获得的罐子.
If this cannot be avoided due to point 3, then something that satisfies points 1 and 2 and uses (only) jars that are shipped by default, or easily available from a reputable source, would also be fine.
推荐答案
在大多数大数据和相关框架中,我们以Word Count程序作为Hello World示例.链接:
final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
DataSet<String> text = env.fromCollection(Arrays.asList("This is line one. This is my line number 2. Third line is here".split(". ")));
DataSet<Tuple2<String, Integer>> wordCounts = text
.flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
@Override
public void flatMap(String line, Collector<Tuple2<String, Integer>> out) throws Exception {
for (String word : line.split(" ")) {
out.collect(new Tuple2<>(word, 1));
}
}
})
.groupBy(0)
.sum(1);
wordCounts.print();
从文件读取
final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
env.setParallelism(1);
//The path of the file, as a URI
//(e.g., "file:///some/local/file" or "hdfs://host:port/file/path").
DataSet<String> text = env.readTextFile("/path/to/file");
DataSet<Tuple2<String, Integer>> wordCounts = text
.flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
@Override
public void flatMap(String line, Collector<Tuple2<String, Integer>> out) throws Exception {
for (String word : line.split(" ")) {
out.collect(new Tuple2<String, Integer>(word, 1));
}
}
})
.groupBy(0)
.sum(1);
wordCounts.print();
不要使用try catch处理在wordCounts.print()上抛出的异常,而是将throw添加到方法签名.
Do not handle exception thrown on wordCounts.print() using try catch but instead add throw to method signature.
将以下依赖项添加到pom.xml.
Add the following dependency to the pom.xml.
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-java</artifactId>
<version>1.8.0</version>
</dependency>
在此处了解有关flatMap,groupBy,sum和其他flink操作的信息: https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/
Read about flatMap, groupBy, sum and other flink operations here : https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/
Flink流媒体文档和示例: https ://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/datastream_api.html
Flink Streaming documentation and examples: https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/datastream_api.html
这篇关于Flink的简单Hello World示例的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!