An increasing number of businesses are creating platform teams to build a solid foundation on which their developers can work. For this to be successful it’s necessary to treat the platform as a product. Here I explore what that means.
What is a platform?
A platform can broadly be described as “APIs, environments, services and practices that allow developers to focus on delivering value.” Really it’s a collection of any tools, templates and processes that allows a developer to get on with developing and not have to worry about Ops.
Why have a platform?
The traditional way of running a technology team would be to have separate teams for each discipline.
- Product managers are in their ivory towers making product decisions completely separate from any of your other teams.
- Developers are writing code to the product specs they receive, without really caring why.
- QA receive untested code from the developers and check it meets specs.
- Ops respond to ticket requests for environments and infrastructure changes, like opening firewall ports.
At first glance it seems like this is a nice separation of responsibilities, but in reality it’s a disaster. None of these teams is really working in a silo; every action they take is likely to impact one of the other teams. A delay in Ops is going to directly translate to a delay in every other team. You have “coupled backlogs”.
The most successful attempt to solve for coupled backlogs was Agile. This brought Product, Dev and QA into the same teams, forcing them to work together on tasks. They now had uncoupled backlogs.
From a feature delivery perspective, one of the main problems with Agile is that Ops remains a separate team and the backlog coupling still exists. In fact, I have seen organisations where, because Ops has also been made Agile and works in sprints, the delay to developers has actually increased, as requests might wait a week before entering Ops’ next sprint.
DevOps tries to solve the problem of coupled backlogs by moving all operations work into the dev team. In theory that sounds great, but the execution is normally sub-optimal. Instead of moving operations people into the dev team, typically just the work is moved. This means you have developers now doing ops work, which is a completely separate speciality. Some developers are good at it; others become a liability, unknowingly introducing security holes in the infrastructure because they only really care about getting their apps working.
The other problem with DevOps is how the value being provided by developers breaks down. Instead of spending half a day writing some code and moving on to the next feature, they often find themselves spending an additional two days trying to get the infrastructure working.
At first glance platform appears to be a regression towards Agile. Ops is once again a speciality all on its own. The key difference is that if you have got the platform right then the dev team never need to directly interact with the ops team. Specialised work is once again being done by specialists but with completely uncoupled backlogs.
In practice this will manifest as developers writing minimal config for the infrastructure, e.g. a YAML snippet defining required port openings, and that being automatically provisioned within pre-defined constraints.
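As a rough sketch of what “automatically provisioned within pre-defined constraints” could look like, the Python below checks a developer’s port request against platform rules before anything is provisioned. The field names (`service`, `ports`) and the constraint values (`ALLOWED_PORT_RANGE`, `BLOCKED_PORTS`) are hypothetical; a real platform would define its own schema and policy.

```python
from dataclasses import dataclass

# Hypothetical platform constraints; real values would be set by the platform team.
ALLOWED_PORT_RANGE = range(1024, 65536)
BLOCKED_PORTS = {3389, 5900}


@dataclass(frozen=True)
class PortRequest:
    """The parsed form of a developer's minimal config (illustrative schema)."""
    service: str
    ports: tuple[int, ...]


def validate(request: PortRequest) -> list[str]:
    """Return constraint violations; an empty list means the request can be provisioned."""
    errors = []
    for port in request.ports:
        if port not in ALLOWED_PORT_RANGE:
            errors.append(f"port {port} is outside the allowed range")
        elif port in BLOCKED_PORTS:
            errors.append(f"port {port} is blocked by platform policy")
    return errors


# Equivalent to a YAML snippet such as:
#   service: checkout
#   ports: [8080, 9090]
request = PortRequest(service="checkout", ports=(8080, 9090))
violations = validate(request)
print("provision" if not violations else violations)
```

The point of the design is that a valid request never needs a ticket or a conversation with Ops, while an invalid one fails fast with a reason the developer can act on.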
The mission statement
To help guide the direction of a platform it’s useful to have a mission statement that describes what you are trying to achieve.
A stable and predictable platform that is the developer’s preferred choice for deploying applications by removing toil and reducing decision making.
There are four key elements to this mission statement that really define what a platform should be.
- Stable and predictable - Developers shouldn’t be distracted by a platform having poor availability and they shouldn’t be consistently having to adjust their integrations with the platform.
- Preferred choice - Platforms work best when developers aren’t forced to use them. The platform product manager should have to keep improving the platform so that development teams choose to use it. There may also be some use cases that, as a product team, you decide you don’t want to support, and it’s OK for developers to look elsewhere for a solution.
- Reducing decision making - Developers are frequently asked to make decisions that sit outside of their expertise, such as “What event bus should I use?” or “How many partitions should my Kafka topic have?”. Anything that can be done to move these questions to specialists is an organisational win.
Ensuring individual features add value
While the mission statement gives a good overall direction there is a core set of criteria that all features should be assessed against.
- Is the feature self service for developers?
- Does it contribute to platform stability?
- Are difficult problems being simplified?
- Will the feature remove and standardise repetitive tasks?
- Does the feature avoid adding any constraints on developers?
- Is the number of decisions a developer needs to make reduced?
- Are new capabilities added without breaking backwards compatibility?
- Does the feature provide measurable value (shortening development cycle/reducing infrastructure costs/adding security)?
The answer doesn’t need to be yes to all of the questions, but if it’s no to any of them then there needs to be some other good justification for prioritising it as a feature.
Discovering platform needs
Pain point analysis
One of the simplest ways to discover what you should be adding to the platform is pain point analysis. This is as straightforward as surveying all of your developers to find out what they struggle with on the existing platform. These findings might conflict with metric-based prioritisation, but one of the biggest challenges is gaining developer trust. The emotional response to the platform is perhaps the key success driver within an organisation.
This can be monitored with a variant of a net promoter score. Survey developers quarterly on “How likely are you to choose this platform for your next project?” and look at the direction of the response scores. If they start going down then that’s a big red flag that developers are looking for other platforms.
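A minimal sketch of tracking that trend, assuming the survey uses the standard net promoter scale of 0 to 10 (promoters score 9 or 10, detractors 0 to 6). The quarterly answers are made-up illustration data, and treating “every quarter lower than the last” as the red flag is just one possible rule.

```python
def net_promoter_score(responses: list[int]) -> float:
    """Standard NPS on 0-10 answers: % promoters (9-10) minus % detractors (0-6)."""
    if not responses:
        raise ValueError("no responses")
    promoters = sum(1 for r in responses if r >= 9)
    detractors = sum(1 for r in responses if r <= 6)
    return 100.0 * (promoters - detractors) / len(responses)


def is_declining(scores: list[float]) -> bool:
    """True when each score is lower than the one before it (needs two or more scores)."""
    return len(scores) >= 2 and all(
        later < earlier for earlier, later in zip(scores, scores[1:])
    )


# Hypothetical survey data: one list of 0-10 answers per quarter, oldest first.
quarters = {
    "Q1": [9, 10, 8, 7, 9, 6],
    "Q2": [9, 8, 7, 7, 6, 5],
    "Q3": [8, 7, 6, 5, 6, 4],
}
scores = [net_promoter_score(answers) for answers in quarters.values()]
if is_declining(scores):
    print("Red flag: developer sentiment is falling quarter on quarter")
```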
Developer workflow analysis
The key platform metric for the business is how fast developers are moving from ideation to delivery. The platform should be removing any steps that slow this process down. A product manager should be sitting with other product teams to analyse the journey from a developer starting on an issue to deploying. Measurements should be set up that track this duration, within the context of the type of feature being delivered.
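One way such a measurement could be set up is sketched below: it takes start and deploy timestamps per piece of work and reports the median duration for each feature type. The record layout and the feature type names are assumptions for illustration; in practice the timestamps would come from whatever issue tracker and deployment tooling is in use.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median

# Hypothetical records: (feature type, work started, deployed).
deliveries = [
    ("api-endpoint", datetime(2024, 3, 1, 9), datetime(2024, 3, 2, 9)),
    ("api-endpoint", datetime(2024, 3, 4, 9), datetime(2024, 3, 7, 9)),
    ("new-service", datetime(2024, 3, 1, 9), datetime(2024, 3, 11, 9)),
]


def median_lead_time_hours(records):
    """Group start-to-deploy durations by feature type and return the median hours for each."""
    by_type = defaultdict(list)
    for feature_type, started, deployed in records:
        by_type[feature_type].append((deployed - started).total_seconds() / 3600)
    return {feature_type: median(hours) for feature_type, hours in by_type.items()}


for feature_type, hours in median_lead_time_hours(deliveries).items():
    print(f"{feature_type}: {hours:.1f} hours")
```

Grouping by feature type keeps a ten-day new service from being compared directly with a one-day endpoint change, which is the context the measurement needs.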
New developer onboarding flow
If it takes a new developer weeks to get to the point of deploying on the platform then a huge amount of productivity is potentially being lost. This is particularly critical in times of rapid scale-up. The time from a new developer starting employment to deploying their first application should be measured, and ideally the product manager should be shadowing developers in each joining cohort.
Platform availability is perhaps the most straightforward of the sources for features. The platform needs to be available, so any feature that improves that, within the scope of defined SLOs, should be put into the backlog. The more difficult task is to understand when to prioritise more emotion-based features ahead of metric-driven availability features. It’s very easy to fall back onto a path where it’s easy to measure the outcome, but what’s the point in a highly available platform if nobody wants to use it?
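One possible heuristic for that trade-off, borrowed from common error budget practice and not something prescribed here, is to look at how much of the downtime allowed by the SLO has been used. The sketch below assumes an availability SLO expressed as a fraction (0.999 for 99.9%) over a 30-day window; the downtime figure is invented for illustration.

```python
def error_budget_minutes(slo: float, period_days: int = 30) -> float:
    """Downtime the SLO permits over the period, e.g. 99.9% over 30 days is about 43.2 minutes."""
    return (1.0 - slo) * period_days * 24 * 60


def budget_remaining_minutes(slo: float, downtime_minutes: float, period_days: int = 30) -> float:
    """Allowed downtime minus downtime already incurred; negative means the SLO is breached."""
    return error_budget_minutes(slo, period_days) - downtime_minutes


# Hypothetical figures: a 99.9% SLO with 10 minutes of downtime so far this period.
remaining = budget_remaining_minutes(slo=0.999, downtime_minutes=10.0)
if remaining > 0:
    print(f"{remaining:.1f} minutes of budget left: room for developer experience features")
else:
    print("SLO breached: availability work comes first")
```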
I’ve touched on some of the ways of measuring success above but success of the platform can be more broadly defined. Is it delivering value to the business? How value is defined is going to be slightly different for every business, but the one thing in common is that maximising return on investment from developers will be a key driver. If that is kept in mind then you’re on the road to success.