Hub and spoke topology
Central data platform, multiple consuming teams. Scales access without scaling chaos.
One Data Engineering workspace acts as the central hub, containing all pipelines, lakehouses, and transformation logic. Multiple Data Analytics workspaces connect to it as spokes, each serving a different consuming team. Each spoke accesses the hub's Gold layer via shortcuts and builds its own semantic models and reports.
This topology scales the Split model to support multiple consumers without duplicating the engineering layer. The platform team owns the hub; consuming teams own their spokes. Each spoke can have its own access controls, capacity allocation, and deployment cycle. The trade-off is coordination: more workspaces mean more shortcuts to manage, more cross-workspace dependencies to track, and more communication required between the hub and spoke teams.
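The relationship between the hub and its spokes can be sketched as a small data model. This is an illustrative sketch only: the class names, the layer names, and the example tables are hypothetical and are not part of any Fabric API. It encodes the rule described above, that spokes reach into the hub only through shortcuts to the Gold layer.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Shortcut:
    """A pointer from a spoke to a table in the hub (hypothetical model)."""
    name: str
    hub_layer: str   # e.g. "gold"
    hub_table: str


@dataclass
class Spoke:
    """A Data Analytics workspace owned by one consuming team."""
    team: str
    shortcuts: list[Shortcut] = field(default_factory=list)


@dataclass
class Hub:
    """The Data Engineering workspace; tables are grouped by layer."""
    tables: dict[str, set[str]]


def shortcut_violations(hub: Hub, spoke: Spoke) -> list[str]:
    """Return human-readable problems with a spoke's shortcuts."""
    problems = []
    for sc in spoke.shortcuts:
        if sc.hub_layer != "gold":
            # Spokes are only meant to consume the Gold layer.
            problems.append(f"{spoke.team}: {sc.name} targets {sc.hub_layer}, not gold")
        elif sc.hub_table not in hub.tables.get("gold", set()):
            problems.append(f"{spoke.team}: {sc.name} targets missing table {sc.hub_table}")
    return problems


hub = Hub(tables={"silver": {"orders_clean"}, "gold": {"sales_summary", "customer_dim"}})
finance = Spoke("finance", [Shortcut("sales", "gold", "sales_summary")])
marketing = Spoke("marketing", [Shortcut("orders", "silver", "orders_clean")])

print(shortcut_violations(hub, finance))    # no problems expected
print(shortcut_violations(hub, marketing))  # flags the shortcut into silver
```

A check like this could run as part of a periodic audit, assuming the shortcut inventory can be exported from the spoke workspaces.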
Pros
- Single source of truth: All curated data lives in one place. Consuming teams access the same Gold layer, ensuring consistency across reports and reducing the risk of conflicting numbers.
- Team autonomy: Each spoke team controls their own workspace. They can build, iterate, and deploy without waiting for the platform team or coordinating with other consumers.
- Granular access control: Each spoke has its own permissions. Finance sees Finance reports; Marketing sees Marketing reports. No one accidentally stumbles into another team's work-in-progress.
- Independent capacity and billing: Each spoke can sit on its own capacity. You can allocate more CUs to high-priority teams or charge back costs to individual departments.
- Scales horizontally: Adding a new consuming team means creating a new spoke workspace and a shortcut. The hub doesn't change. You can grow the number of consumers without increasing hub complexity.
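Onboarding a new spoke, as described under "Scales horizontally", can be scripted. The sketch below only builds the requests and does not send them. It rests on several assumptions: that the spoke workspace contains a lakehouse to hold the shortcuts, that the endpoint and body follow the shape of the Fabric REST API for creating OneLake shortcuts as I understand it (check the current API reference before relying on it), and that all IDs and table names are placeholders. The function names are hypothetical.

```python
import json

# Assumed base URL for the Fabric REST API; verify against the official reference.
FABRIC_API = "https://api.fabric.microsoft.com/v1"


def build_shortcut_request(spoke_ws, spoke_lakehouse, hub_ws, hub_lakehouse, table):
    """Build (url, body) for one shortcut from a spoke lakehouse to a hub Gold table."""
    url = f"{FABRIC_API}/workspaces/{spoke_ws}/items/{spoke_lakehouse}/shortcuts"
    body = {
        "path": "Tables",   # where the shortcut appears in the spoke lakehouse
        "name": table,
        "target": {
            "oneLake": {
                "workspaceId": hub_ws,
                "itemId": hub_lakehouse,
                "path": f"Tables/{table}",   # assumed location of the Gold table
            }
        },
    }
    return url, body


def onboarding_requests(spoke_ws, spoke_lakehouse, hub_ws, hub_lakehouse, gold_tables):
    """One request per Gold table the new spoke is allowed to consume."""
    return [
        build_shortcut_request(spoke_ws, spoke_lakehouse, hub_ws, hub_lakehouse, t)
        for t in gold_tables
    ]


requests_to_send = onboarding_requests(
    "<spoke-workspace-id>", "<spoke-lakehouse-id>",
    "<hub-workspace-id>", "<hub-lakehouse-id>",
    ["sales_summary", "customer_dim"],
)
for url, body in requests_to_send:
    print(url)
    print(json.dumps(body, indent=2))
```

Actually sending these requests would need an authenticated HTTP client and a token with rights on both workspaces, which is left out here. Note that nothing in the hub is modified: the requests target the spoke only.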
Cons
- Shortcut proliferation: Each spoke needs shortcuts to the hub. As the number of spokes grows, so does the number of shortcuts to create, maintain, and audit.
- Hub becomes a bottleneck: All spokes depend on the hub. If the platform team is slow to deliver new tables or fix data quality issues, every spoke is blocked.
- Cross-spoke visibility is limited: Each spoke is isolated. If two teams need to share assets or collaborate on a report, you need to figure out where it lives and who owns it.
- Coordination overhead: The platform team must communicate changes to all spoke teams. A schema change in the hub can break multiple downstream workspaces simultaneously.
- Governance complexity: More workspaces mean more places to audit, more capacity assignments to monitor, and more deployment pipelines to manage. The operational surface area grows with each spoke.
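Two of the costs above, shortcut proliferation and the reach of a schema change, can be made concrete with a simple inventory. The following is a minimal sketch that assumes the platform team keeps a list of (spoke, hub table) pairs. The spoke and table names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical shortcut inventory: (spoke workspace, hub Gold table).
inventory = [
    ("finance", "sales_summary"),
    ("finance", "customer_dim"),
    ("marketing", "customer_dim"),
    ("marketing", "campaign_results"),
    ("operations", "sales_summary"),
    ("operations", "customer_dim"),
]


def blast_radius(inventory):
    """Map each hub table to the sorted list of spokes that consume it."""
    consumers = defaultdict(set)
    for spoke, table in inventory:
        consumers[table].add(spoke)
    return {table: sorted(spokes) for table, spokes in consumers.items()}


radius = blast_radius(inventory)
print(len(inventory))          # 6 shortcuts to create, maintain, and audit
print(radius["customer_dim"])  # ['finance', 'marketing', 'operations']
```

With three spokes and three tables the inventory already holds six shortcuts, and a change to `customer_dim` reaches every spoke at once. The same lookup tells the platform team whom to notify before a change lands.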
Example Scenarios
- Central BI platform serving multiple business units: A platform team owns data engineering. Finance, Marketing, Operations, and Sales each have their own analytics workspace. Each builds reports tailored to their needs, all sourced from the same trusted data.
- Regulated data with multiple consumer groups: Healthcare or financial services where raw data access is tightly controlled. The hub restricts access to the engineering layer; each spoke gets only the aggregated, compliant views they're authorised to see.
- Shared services model: IT provides data as a service to the rest of the organisation. Business units don't need to understand pipelines or lakehouses; they just consume curated datasets through their own workspace.
- Multi-geography or multi-region deployment: One central hub, with spokes for EMEA, APAC, and Americas. Each region can have localised reports and capacity allocation while sharing the same underlying data.
- Embedded analytics for different products: A SaaS company with multiple products. One hub contains all operational data; each product team has a spoke for their embedded analytics, keeping development isolated.
When This Topology Breaks Down
- Different domains need fundamentally different source systems or engineering logic (one hub can't serve all)
- Spoke teams want to own their own data engineering, not just analytics
- The platform team can't keep up with demand from all the spokes
- Cross-domain data products require tight integration that shortcuts can't cleanly support
When you hit these limits, Domain Mesh Topology is the natural next step.
Considerations
- Define the contract between hub and spokes: Be explicit about what the hub provides: which tables, which refresh cadence, which quality guarantees. Spoke teams should know what they can rely on and what's subject to change.
- Establish a change notification process: Schema changes in the hub affect all spokes. Build a communication channel (Teams, email, changelog) so spoke teams have advance warning before breaking changes land.
- Standardise spoke workspace setup: If every spoke is structured differently, governance becomes painful. Create a template or checklist for new spoke workspaces: naming conventions, folder structure, default roles, monitoring setup.
- Monitor the hub closely: The hub is a single point of failure. Set up alerting for pipeline failures, data quality issues, and capacity contention. Problems here cascade to every spoke.
- Plan for spoke lifecycle: Teams change, projects end. Have a process for decommissioning spokes when they're no longer needed, including cleaning up shortcuts and reclaiming capacity.
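The first two considerations, an explicit contract and advance warning of breaking changes, lend themselves to a lightweight check. The sketch below is one possible shape, not a prescribed format. The `TableContract` structure, its fields, and the rule that only removed or retyped columns count as breaking are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class TableContract:
    """What the hub promises for one Gold table (hypothetical structure)."""
    name: str
    columns: dict                    # column name -> type name
    refresh_cadence: str             # e.g. "daily by 06:00 UTC"
    max_null_fraction: float = 0.0   # example of a quality guarantee


def breaking_changes(old, new):
    """List changes likely to break spokes: removed or retyped columns.

    Added columns are treated as non-breaking (an assumption of this sketch).
    """
    changes = []
    for col, col_type in old.columns.items():
        if col not in new.columns:
            changes.append(f"{old.name}.{col} removed")
        elif new.columns[col] != col_type:
            changes.append(f"{old.name}.{col} type {col_type} -> {new.columns[col]}")
    return changes


current = TableContract(
    "sales_summary",
    {"order_id": "string", "amount": "decimal", "region": "string"},
    "daily by 06:00 UTC",
)
proposed = TableContract(
    "sales_summary",
    {"order_id": "string", "amount": "double", "country": "string"},
    "daily by 06:00 UTC",
)

for change in breaking_changes(current, proposed):
    print(change)
# sales_summary.amount type decimal -> double
# sales_summary.region removed
```

Running a comparison like this before each hub release gives the platform team a concrete list to post in whichever channel the spoke teams watch.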