Map Operator
DataStream → DataStream, takes one element and produces one element. A map function that multiplies values in the input stream by 2:
DataStream<Integer> dataStream = //...
dataStream.map(new MapFunction<Integer, Integer>() {
@Override
public Integer map(Integer value) throws Exception {
return 2 * value;
}
});
FlatMap Operator
DataStream → DataStream, takes one element, produces 0, 1, or more elements. A FlatMap function that splits sentences:
dataStream.flatMap(new FlatMapFunction<String, String>() {
@Override
public void flatMap(String value, Collector<String> out) throws Exception {
for(String word: value.split(" ")){
out.collect(word);
}
}
});
Filter Operator
DataStream → DataStream, returns boolean, filters out elements that satisfy the condition. Filter out elements with value 0:
dataStream.filter(new FilterFunction<Integer>() {
@Override
public boolean filter(Integer value) throws Exception {
return value != 0;
}
});
KeyBy Operator
DataStream → KeyedStream, logically divides the stream into disjoint partitions, all records with the same key are assigned to the same partition:
dataStream.keyBy(value -> value.getSomeKey())
dataStream.keyBy(value -> value.f0)
Reduce Operator
KeyedStream → DataStream, “rolling” reduce is an accumulation process that processes elements in the stream one by one and progressively updates the result. For example, input data stream [1, 2, 3, 4, 5], output of partial sums would be [1, 3, 6, 10, 15]:
keyedStream.reduce(new ReduceFunction<Integer>() {
@Override
public Integer reduce(Integer value1, Integer value2) throws Exception {
return value1 + value2;
}
});
Fold Operator
KeyedStream → DataStream, for keyed data streams with initial value, performs rolling fold by combining current element with previous fold value and emitting new value:
DataStream<String> result =
keyedStream.fold("start", new FoldFunction<Integer, String>() {
@Override
public String fold(String current, Integer value) {
return current + "-" + value;
}
});
Aggregations Operator
KeyedStream → DataStream, performs rolling aggregation on KeyedStream. The difference between min and minBy is that min returns the minimum value while minBy returns the element with the minimum value in that field:
keyedStream.sum(0);
keyedStream.sum("key");
keyedStream.min(0);
keyedStream.min("key");
keyedStream.max(0);
keyedStream.max("key");
keyedStream.minBy(0);
keyedStream.minBy("key");
keyedStream.maxBy(0);
keyedStream.maxBy("key");
Window Operator
KeyedStream → WindowedStream, defines windows on partitioned KeyedStream, windows group data in each key based on some characteristics:
dataStream.keyBy(value -> value.f0).window(TumblingEventTimeWindows.of(Time.seconds(5)));
WindowAll Operator
DataStream → AllWindowedStream, defines windows on regular DataStream, groups all stream events:
dataStream.windowAll(TumblingEventTimeWindows.of(Time.seconds(5)));
Window Apply Operator
WindowedStream → DataStream, applies a general function to the entire window
Window Reduce Operator
WindowedStream → DataStream, applies a functional Reduce function to the window and returns a simplified value
Window Fold Operator
WindowedStream → DataStream, applies a functional fold function to the window and returns the folded value
Aggregations on windows
WindowedStream → DataStream, aggregates contents in the window
Union Operator
DataStream → DataStream, merges two or more data streams, creating a new data stream containing all elements from all data streams
Window Join Operator
DataStream, DataStream → DataStream, joins two data streams by given key and a common window
Interval Join Operator
KeyedStream, KeyedStream → DataStream, connects elements in two keyed streams with the same key within a given time interval
Window CoGroup Operator
DataStream, DataStream → DataStream, coGroups two data streams on given key and common window
Connect Operator
DataStream, DataStream → ConnectedStreams, connects two data streams, preserving their types
CoMap, CoFlatMap Operator
ConnectedStreams → DataStream, similar to Map and FlatMap, but for connected streams