Parallel & Cluster Execution
Some tasks, particularly those involving large datasets or complex computations, run on a distributed execution engine.
Cluster-Based Processing Highlights:
- Powered by Apache Beam with a Flink runner (optional module)
- Executes tasks in parallel across multiple nodes
- Ideal for:
  - Large-scale transformations
  - Streaming data pipelines
  - Aggregations and joins over big data
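To make the aggregation use case above concrete, here is a minimal sketch of a per-key sum written with the Apache Beam Python SDK. It assumes the Python SDK is installed and that records arrive as `key,value` text lines. The names `parse_record` and `run_aggregation` are hypothetical and are not part of this product's API.

```python
from typing import Iterable, List, Optional, Tuple


def parse_record(line: str) -> Tuple[str, float]:
    """Split a 'key,value' line into a (key, numeric value) pair."""
    key, value = line.split(",", 1)
    return key.strip(), float(value)


def run_aggregation(lines: Iterable[str],
                    pipeline_args: Optional[List[str]] = None) -> None:
    """Sum values per key and print the results (illustrative only)."""
    # Imported lazily so the helper above works without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    # The runner (local or Flink) is selected by the arguments passed in.
    options = PipelineOptions(pipeline_args or [])
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "Read" >> beam.Create(list(lines))
            | "Parse" >> beam.Map(parse_record)
            | "SumPerKey" >> beam.CombinePerKey(sum)
            | "Print" >> beam.Map(print)
        )
```

Calling `run_aggregation(["a,1", "b,2", "a,3"])` with no extra arguments should run the pipeline locally. The same pipeline code can target a cluster by passing runner arguments instead of changing the transforms.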
You can enable or disable cluster execution for each task, depending on its complexity and performance requirements.
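One way to picture the per-task switch is as a flag that selects which Beam runner arguments a task receives. The sketch below is an assumption about how such a toggle could look, not this product's actual configuration format. `TaskConfig`, `use_cluster`, and `beam_args_for` are hypothetical names, and the Flink address is a placeholder. The `--runner` and `--flink_master` flags themselves are standard Beam pipeline options.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TaskConfig:
    """Hypothetical per-task settings; field names are illustrative."""
    name: str
    use_cluster: bool = False               # cluster execution is opt-in
    flink_master: str = "localhost:8081"    # placeholder Flink address


def beam_args_for(task: TaskConfig) -> List[str]:
    """Return Beam pipeline arguments matching the task's execution mode."""
    if not task.use_cluster:
        # Small tasks run in-process on Beam's local DirectRunner.
        return ["--runner=DirectRunner"]
    # Heavy tasks are submitted to the Flink cluster.
    return [
        "--runner=FlinkRunner",
        f"--flink_master={task.flink_master}",
    ]
```

The returned list can be handed to Beam's `PipelineOptions`, so a task's pipeline code stays the same whether it runs locally or on the cluster.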