Foundation
No standards, no platform team, no process
You're probably here if...
- Every team provisions infrastructure their own way. There's no shared standard and nobody enforcing one.
- Deployments depend on one person who knows the magic steps. When they're on holiday, things don't ship.
- There's no platform team, just an ops person who's permanently firefighting.
- Developers wait days for environments because every request goes through a ticket queue.
- There are three different ways to run a container and five different pipeline setups across six teams.
- When something breaks in production, nobody's sure who owns it or what the process is.
What Level 2 looks like
Here's what changes when you get there.
Developers follow a single, documented process for source control and deployments. They spend their time writing code, not waiting for access, environments, or someone to run a script for them.
A small platform team exists with a clear mandate: reduce toil and build shared foundations. They have a backlog, not just an inbox.
Incidents are handled by a process, not by whoever happens to be around. Source control, CI/CD, and cloud provisioning are standardised, reducing risk and cutting onboarding time in half.
The journey
Stop improvising. Build the five foundations every platform team needs.
These are the capabilities to build. Each one moves you forward. None require a complete platform rewrite, so start where the pain is highest.
Source Control Standards
A single, enforced approach to how code is stored, branched, reviewed, and merged, applied consistently across every team.
When every team has its own branching model and merge rules, automation breaks, audits fail, and onboarding is a guessing game. The rest of your platform is built on top of this. If it's inconsistent, everything above it will be too.
Pick a branching strategy (trunk-based development is the right default for most teams) and document it. Enforce branch protection rules. Define PR review requirements and commit message standards. Run one enablement session. Measure compliance.
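To make "commit message standards" and "measure compliance" concrete, here is a minimal sketch of a checker you could run in a hook or CI job. The convention it enforces (a `<type>(<scope>): <subject>` first line, a 72-character limit, and the list of allowed types) is an assumption for illustration, not something this roadmap prescribes. Swap in whatever standard you document.

```python
import re

# Hypothetical convention: "<type>(<optional scope>): <subject>", first line <= 72 chars.
ALLOWED_TYPES = ("feat", "fix", "docs", "refactor", "test", "chore")
SUBJECT_PATTERN = re.compile(
    r"^(?P<type>[a-z]+)(\((?P<scope>[a-z0-9-]+)\))?: (?P<subject>\S.*)$"
)
MAX_SUBJECT_LENGTH = 72


def check_commit_message(message: str) -> list[str]:
    """Return the problems found in a commit message (empty list means compliant)."""
    lines = message.splitlines()
    first_line = lines[0] if lines else ""
    problems = []
    if len(first_line) > MAX_SUBJECT_LENGTH:
        problems.append(f"first line exceeds {MAX_SUBJECT_LENGTH} characters")
    match = SUBJECT_PATTERN.match(first_line)
    if match is None:
        problems.append("first line does not match '<type>(<scope>): <subject>'")
    elif match.group("type") not in ALLOWED_TYPES:
        problems.append(f"unknown type '{match.group('type')}'")
    return problems


def compliance_rate(messages: list[str]) -> float:
    """Share of messages with no problems, as a simple compliance metric."""
    if not messages:
        return 1.0
    compliant = sum(1 for m in messages if not check_commit_message(m))
    return compliant / len(messages)
```

Feeding `compliance_rate` the recent commit subjects of each repository gives a per-team number you can track after the enablement session.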
CI/CD Baseline
One shared pipeline template that every team uses for building, testing, and deploying, not a different homegrown setup per team.
Ten different pipelines mean ten different quality gates, ten different security scanning configurations, and ten times the work when something needs to change platform-wide. You can't enforce consistency or improve things at scale without a shared baseline.
Build a reusable pipeline template as a shared library or reusable workflow. Include build, test, security scan, and deploy stages. Onboard all teams to it. Track adoption. Make it easier to use than not to.
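Tracking adoption can start as a small script rather than a dashboard. The sketch below assumes you can export each team's pipeline as an ordered list of stage names. The stage identifiers and the team names in the usage note are hypothetical. It checks that the four baseline stages appear in order, allowing teams to add extra stages in between.

```python
# Hypothetical stage identifiers for the shared baseline, in the required order.
REQUIRED_STAGES = ("build", "test", "security_scan", "deploy")


def follows_baseline(stages: list[str]) -> bool:
    """True if every required stage is present, in baseline order (extras allowed)."""
    remaining = iter(stages)
    # `required in remaining` advances the iterator, so order is enforced.
    return all(required in remaining for required in REQUIRED_STAGES)


def adoption_rate(team_pipelines: dict[str, list[str]]) -> float:
    """Share of teams whose pipeline follows the shared baseline."""
    if not team_pipelines:
        return 0.0
    adopted = sum(1 for stages in team_pipelines.values() if follows_baseline(stages))
    return adopted / len(team_pipelines)
```

For example, `adoption_rate({"payments": ["build", "test", "security_scan", "deploy"], "search": ["build", "deploy"]})` counts only the first team as onboarded.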
Shared Infrastructure Patterns
Documented, reusable ways to provision common infrastructure (containers, databases, networking) so teams start from a known good baseline instead of from scratch.
Every team provisioning infrastructure their own way guarantees configuration drift, security gaps, and knowledge locked in people's heads. When one of those people leaves, you find out the hard way.
Identify the 3-5 most common infrastructure needs across teams (a containerised app, a managed database, and a VNet setup are usually top of the list). Write IaC templates for each. Publish them with docs. Make them the required starting point for new projects.
Observability Basics
Every production service emits structured logs and key metrics to a central place, with alerts that fire before a customer notices something is wrong.
Teams running blind spend hours diagnosing incidents that a dashboard would have caught in minutes. Without a minimum observability standard, every outage is a forensics exercise and on-call is a punishment.
Define a minimum standard: structured log format, three key metrics per service, one health alert. Create a shared dashboard template. Require all production services to meet it. Review compliance in your next incident retrospective.
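A structured log format is easiest to adopt when teams can copy it. Below is a minimal sketch using only Python's standard `logging` module to emit one JSON object per log line. The field names (`timestamp`, `level`, `service`, `message`) are assumptions for illustration. Your minimum standard may define different ones.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object (field names are illustrative)."""

    def __init__(self, service: str):
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "service": self.service,
            "message": record.getMessage(),
        }
        return json.dumps(payload)


def build_logger(service: str) -> logging.Logger:
    """Return a logger that writes structured lines to stderr for the given service."""
    logger = logging.getLogger(service)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter(service))
        logger.addHandler(handler)
    return logger
```

A service would call `build_logger("checkout").info("order %s placed", order_id)` and let the platform's log shipper forward the output to the central store.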
On-Call & Incident Basics
A defined process for who owns what, how incidents are declared, how they're worked, and how the team learns from them. Written down, not improvised each time.
Without a shared incident process, every outage is chaos. The same mistakes happen repeatedly. The same people get paged. Nobody learns. A lightweight process, even a one-page runbook, breaks the cycle.
Start with service ownership: every production service has a named owner. Add a simple severity classification (P1/P2/P3 is enough). Write a one-page incident runbook. Run your first blameless post-mortem after the next incident. Use an on-call tool so rotation is fair.
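The first two steps can be written down as data and a few rules. The sketch below assumes a hypothetical severity rule (P1 when a customer-facing service is down, P2 when only one of those is true, P3 otherwise) and a plain service-to-owner mapping. Both are illustrations, so adjust the criteria to whatever your runbook defines.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Severity(Enum):
    P1 = "P1"
    P2 = "P2"
    P3 = "P3"


@dataclass(frozen=True)
class Incident:
    customer_facing: bool  # do customers see the impact?
    service_down: bool     # is the service unavailable rather than degraded?


def classify(incident: Incident) -> Severity:
    """Apply the hypothetical P1/P2/P3 rule described above."""
    if incident.customer_facing and incident.service_down:
        return Severity.P1
    if incident.customer_facing or incident.service_down:
        return Severity.P2
    return Severity.P3


def services_without_owner(owners: dict[str, Optional[str]]) -> list[str]:
    """List production services that still lack a named owner, sorted by name."""
    return sorted(name for name, owner in owners.items() if not owner)
```

Running `services_without_owner` against your service catalogue gives you the to-do list for the ownership step before the next incident forces the question.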
Europe's platform engineering consultancy
Want to move faster? That's what we're here for.
This roadmap is built from Zure's experience running platform engineering engagements across Europe. We know where teams get stuck because we've helped them get unstuck. If you want expert delivery alongside the roadmap, not instead of it, talk to us.