Multi-Tenancy
Overview
Our data platform supports multi-tenancy to allow multiple teams, projects, or clients to share the same infrastructure while maintaining strict data and resource isolation.
This ensures cost efficiency, simplified operations, and centralized management without compromising security.
This feature is not yet implemented.
Key Capabilities
- Project-Based Isolation – Each tenant operates within its own project scope, ensuring data, jobs, and configurations remain isolated from other tenants.
- User and Role Management – Integrated with Keycloak for authentication and authorization. Role-based access control (RBAC) ensures users can only interact with projects and resources they have been granted access to.
- Resource Quotas – Enforce CPU, memory, and storage limits per tenant to avoid resource starvation and ensure fair usage.
- Namespace Separation in Kubernetes – Each tenant’s workloads can be deployed in separate Kubernetes namespaces, ensuring logical separation and easier resource monitoring.
- Job and Task Isolation – Apache Beam pipelines (running on Flink/Spark clusters) are scoped per project or tenant to prevent data leakage and allow independent scaling.
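As a rough illustration of how namespace separation and resource quotas could fit together, the sketch below builds a Kubernetes `Namespace` and a `ResourceQuota` manifest for one tenant and prints them as JSON. The `tenant-<name>` naming scheme, the `tenant` label, the quota name, and the example limits are assumptions made for illustration; the platform may use different conventions once the feature is implemented.

```python
import json


def tenant_namespace(tenant: str) -> dict:
    """Build a Namespace manifest for one tenant.

    The "tenant-<name>" naming scheme and the "tenant" label are
    hypothetical conventions, not taken from the platform.
    """
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": f"tenant-{tenant}",
            "labels": {"tenant": tenant},
        },
    }


def tenant_quota(tenant: str, cpu: str, memory: str, storage: str) -> dict:
    """Build a ResourceQuota manifest capping CPU, memory, and storage
    requests inside the tenant's namespace."""
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        # "tenant-quota" is a hypothetical object name.
        "metadata": {"name": "tenant-quota", "namespace": f"tenant-{tenant}"},
        "spec": {
            "hard": {
                "requests.cpu": cpu,
                "requests.memory": memory,
                "limits.cpu": cpu,
                "limits.memory": memory,
                "requests.storage": storage,
            }
        },
    }


if __name__ == "__main__":
    # Example values only; real limits depend on the tenant's plan.
    bundle = {
        "apiVersion": "v1",
        "kind": "List",
        "items": [
            tenant_namespace("acme"),
            tenant_quota("acme", cpu="8", memory="32Gi", storage="500Gi"),
        ],
    }
    print(json.dumps(bundle, indent=2))
```

Generating both objects from the same tenant identifier keeps the quota tied to the namespace it is meant to constrain, so a tenant cannot be created without its limits.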
Benefits of Multi-Tenancy
- Cost Efficiency – Share underlying infrastructure without duplicating entire environments.
- Operational Simplicity – Centralized monitoring, logging, and management while keeping tenant data isolated.
- Scalable Workload Execution – Beam tasks can scale independently for each tenant based on workload demands.
- Security & Compliance – Isolated execution environments help meet data protection requirements.
Example Architecture for Multi-Tenancy
- Kubernetes Namespace per Tenant – Each tenant gets its own namespace, so resource limits apply per tenant and failures stay contained within that tenant's namespace.
- Isolated Beam Pipelines – Each tenant’s jobs are executed using separate Beam runner configurations (e.g., Flink or Spark), preventing interference and ensuring workload isolation.
- Per-Tenant Storage & Metadata – Data and metadata are segregated using dedicated storage solutions such as separate buckets, databases, or schemas, ensuring data privacy and compliance.
Note: While compute and storage are shared at the infrastructure layer, all execution contexts, configurations, and data remain fully isolated between tenants.
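The architecture above can be sketched as a small per-tenant context that derives the namespace, storage location, and Beam runner arguments from a single tenant identifier. This is a minimal sketch under stated assumptions: the `TenantContext` class, the `s3://platform-tenant-<name>` bucket layout, the `tenant-<name>` namespace scheme, and the helper names are hypothetical. Only the `--runner` and `--job_name` flags are standard Beam pipeline options.

```python
import re
from dataclasses import dataclass

# Kubernetes namespace names are DNS labels: lowercase alphanumerics
# and "-", starting and ending with an alphanumeric, max 63 chars.
_DNS_LABEL = re.compile(r"^[a-z0-9]([a-z0-9-]*[a-z0-9])?$")


@dataclass(frozen=True)
class TenantContext:
    """Everything a job needs to stay inside one tenant's boundaries."""
    tenant: str
    namespace: str
    bucket: str
    runner: str


def resolve_tenant(tenant: str, runner: str = "FlinkRunner") -> TenantContext:
    """Derive per-tenant resources from the tenant identifier.

    The naming schemes below are hypothetical conventions.
    """
    namespace = f"tenant-{tenant}"
    if not _DNS_LABEL.match(tenant) or len(namespace) > 63:
        raise ValueError(f"invalid tenant identifier: {tenant!r}")
    return TenantContext(
        tenant=tenant,
        namespace=namespace,
        bucket=f"s3://platform-tenant-{tenant}",
        runner=runner,
    )


def beam_args(ctx: TenantContext, job: str) -> list:
    """Build Beam pipeline arguments scoped to the tenant."""
    return [
        f"--runner={ctx.runner}",
        f"--job_name={ctx.tenant}-{job}",
    ]


def check_path(ctx: TenantContext, path: str) -> str:
    """Reject any storage path outside the tenant's own bucket."""
    if not path.startswith(ctx.bucket + "/"):
        raise PermissionError(f"{path} is outside tenant {ctx.tenant}")
    return path


if __name__ == "__main__":
    ctx = resolve_tenant("acme")
    print(ctx.namespace)
    print(beam_args(ctx, "ingest"))
    print(check_path(ctx, f"{ctx.bucket}/ingest/output"))
```

Checking the path against the bucket name plus a trailing slash prevents a tenant named `acme` from matching a bucket that merely starts with the same prefix, such as one belonging to `acme-eu`.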