People APIs: the art of scaling a software organization

(Image: a 4-person team)

How do you increase the number of developers without losing much of the productivity you get when a team is just “a startup”? Is it even possible? If so, how do you do it? The answers are, somewhat counter-intuitively, “yes” and “you don’t”. How do we reconcile these two answers? Let’s explore!

The fallacy of “better communication”

One frequently repeated fallacy in growing software organizations is “we should be better at communicating!”.

The issue is almost never the quality or amount of communication, but rather the sheer, crippling overhead of communication when you have a large team and everyone needs to know everything that is going on. If you look at the image at the top of this post, you can see that a team of 4 people needs to maintain 6 relationships between people in total.

(Image: a 6-person team)

If you increase the number of people to 6, the number of relationships suddenly goes up to 15. Keep growing, and each added person brings as many new connections as the new team size minus one: a team of n people has n(n-1)/2 relationships.
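The arithmetic above can be sketched in a few lines of Python. This is only an illustration that assumes every person has to maintain a relationship with every other person on the team; the function name `relationships` is made up for this example.

```python
def relationships(team_size: int) -> int:
    """Count the distinct person-to-person relationships in a team
    where everyone needs to stay in touch with everyone else."""
    if team_size < 0:
        raise ValueError("team_size must be non-negative")
    # Each of the n people pairs with the other n - 1; halve to avoid
    # counting every pair twice.
    return team_size * (team_size - 1) // 2


if __name__ == "__main__":
    for size in (4, 6, 8, 12, 24):
        print(f"{size:>2} people -> {relationships(size):>3} relationships")
```

Going from 6 to 7 people adds 6 new relationships, which is the "new team size minus one" growth described above.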

You almost never need more or hard-to-quantify “better” communication. The solution is quite simply to have less communication and less dependence on everyone needing to know about everything.

How exactly do we achieve this? We need a few things:

Small, autonomous teams

The first thing we need is small teams, where we can easily maintain all the relationships within the team required to function effectively, while communicating well. The mythical Amazon “Two-Pizza Team”, a team that can be fed by two large pizzas, is a good yardstick. In other words, 4-8 people, preferably no more.

Within this size, communication will be manageable, and it will be easy for everyone to understand exactly what is going on and who is doing what.
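To get a rough feel for how much splitting into small teams helps, here is a back-of-the-envelope sketch. It rests on two simplifying assumptions that are mine, not a rule: the organization divides evenly into teams, and each pair of teams needs at most one shared channel between them. The function names are invented for this example.

```python
from math import comb


def channels_single_team(people: int) -> int:
    """Everyone talks to everyone: one channel per pair of people."""
    return comb(people, 2)


def channels_split(people: int, team_size: int) -> int:
    """Channels when the same people are split into equal small teams.

    Assumption: inside a team everyone talks to everyone, and each pair
    of teams shares a single channel (a worst case where every team
    depends on every other team).
    """
    if team_size <= 0 or people % team_size != 0:
        raise ValueError("people must divide evenly into teams of team_size")
    teams = people // team_size
    within_teams = teams * comb(team_size, 2)
    between_teams = comb(teams, 2)
    return within_teams + between_teams


if __name__ == "__main__":
    print("one team of 24:  ", channels_single_team(24))
    print("four teams of 6: ", channels_split(24, 6))
```

Under these assumptions, 24 people in one team need 276 channels, while four teams of 6 need 66, and no individual has to maintain more than a handful of them.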

The second part is autonomy. This is where most organizations fail. For a team to be truly autonomous, it must:

Be given problems to solve, not solutions to implement.
Be empowered to solve those problems as they best see fit, without being second-guessed.
Have the skills and resources within the team to solve the problems with minimal dependence on the rest of the organization.

The last two points may make some people uneasy. How do we coordinate things that simply cannot fit within a team, or that affect several teams?

How do we determine team boundaries?

This is probably the easiest question to answer but, in practice, the hardest thing to get right: team boundaries should align with natural business domains, for which Domain-Driven Design is a useful tool. Conway’s Law says that systems end up mirroring the communication structure of the organization that builds them, so this should absolutely not be left to chance, or worse, HR.

Let’s not mince words about team structure and domains: the lines along which you organize teams will effectively determine your technical architecture as well, whether you want them to or not.

Organizational APIs, both technical and social

Now comes the hard part: in the few cases where teams do depend on each other, how do we ensure this runs smoothly, effectively, and causes as few blockers between teams as possible?

We need two types of Organizational “APIs”: social and technical.

Technical

The technical “APIs” are effectively the overall architecture of your software: the guiding principles of how services from different teams interact, share data, and so on.

What are the standard data protocols? JSON? Avro? Protobuf?

How do services from different teams interact? Do they use synchronous REST or RPC? Do they use asynchronous messaging? Events, Commands or both? If any or all, when do they use which style of communication?

The answers to these questions could either be decided by engineering leadership alone, or grow organically with engineering leadership facilitating. The important part is that the questions are considered at this level of abstraction, and not lower, where they would infringe upon team autonomy.
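To make the idea of a technical “API” a little more concrete, here is a minimal sketch of what one agreed-upon contract between teams might look like, assuming the organization has settled on JSON and asynchronous events. Everything specific here is hypothetical: the `OrderPlaced` event, its fields, and the `schema_version` convention are invented purely for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class OrderPlaced:
    """Hypothetical event published by one team and consumed by others."""
    schema_version: int
    order_id: str
    customer_id: str
    total_cents: int


def encode(event: OrderPlaced) -> str:
    """Serialize the event to the agreed wire format (JSON, by assumption)."""
    payload = {"type": "OrderPlaced", **asdict(event)}
    return json.dumps(payload, sort_keys=True)


def decode(message: str) -> OrderPlaced:
    """Parse a message, rejecting anything that breaks the contract.

    A missing field raises KeyError; a wrong type or version raises ValueError.
    """
    data = json.loads(message)
    if data.get("type") != "OrderPlaced":
        raise ValueError(f"unexpected event type: {data.get('type')!r}")
    if data.get("schema_version") != 1:
        raise ValueError(f"unsupported schema version: {data.get('schema_version')!r}")
    return OrderPlaced(
        schema_version=data["schema_version"],
        order_id=data["order_id"],
        customer_id=data["customer_id"],
        total_cents=data["total_cents"],
    )


if __name__ == "__main__":
    wire = encode(OrderPlaced(1, "order-1", "customer-9", 1250))
    print(wire)
    print(decode(wire))
```

The point is less the code than the boundary: the publishing team is free to change anything behind the contract, and consuming teams only need to know the contract, not the team.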

Social

How do teams interact when they have dependencies or requirements upon each other? What are the rules of engagement? This will look very different from organization to organization, but it is an aspect that needs both consideration and facilitation to arrive at some ground rules. There are a few guiding principles to help you find your answers:

Minimize dependencies and communication between teams.
Minimize the amount of time one team is blocked by another.
Minimize the amount of disruption from external teams on internal team priorities.

The last two points in particular are hard to reconcile, and there are no hard and fast answers to this conflict. But there may be a hint in the next section.

Duplication of effort and teams as startups & service providers

Should we worry about teams duplicating the effort of other teams? Probably not the first time, maybe the second time, definitely the third time. If we care about team autonomy, we should not mandate that teams use any centralized services, nor should we mandate that they use the work of other teams. We might want to recommend it, as a Golden Path: “If you do this, you have this support available. If you do something else, you can, but you’re on your own”.

How would this play out in practice? The simplest analogy is that of an internal market within a company: once a problem arises enough times, you might put a team on it, and offer their work to others. But as in a real market, their services must be compelling enough for other teams to adopt them out of choice, not compulsion.

Simply put:

the cost of duplicated effort is less than the cost of lost team autonomy.

Thus duplicated effort is the lesser of two evils. Most of the time.

Wrapping up

Where does all this leave us?

Team autonomy reigns supreme. Autonomy requires the ability to do the work, and it requires bringing dependencies on other parts of the organization as close to zero as possible.

Where dependencies exist, they need to be managed. This is done by reducing the number of relationships and communication lines that need to be maintained, so as to keep communication complexity to a minimum.

The tools we should use for this are Domain-Driven Design and an “API” approach: having well set-out rules and principles for the points of interaction between teams, both socially and technically.

Nothing should be set in stone, but having some fundamental principles in place to begin the journey will help immensely. Why don't we call those principles People APIs?