Data Mesh, Team Topologies and the convergence of important ideas
The eagle-eyed among our readers might note that there are several similarities between the Data Mesh and Team Topologies. In this article, we will explore these similarities and discuss how they can be used together to create a more effective data organisation.
Firstly, Team Topologies is the idea that if we want to build effective software systems, and effective organisations to build them, we must pay attention not only to technical architecture but also to the impact of Conway's Law: how we structure our organisation and its communication modes. We won't go into every detail here; instead, we recommend reading the book of the same name, or visiting the Team Topologies website.
The second idea, Data Mesh, lies very close to both Team Topologies and the next idea: it organises data ownership around business domains, with domain teams publishing their data as products.
The third influential idea, present in both Team Topologies and Data Mesh, is Domain-Driven Design ("DDD"). DDD is a set of principles and patterns that help us build software systems that are more closely aligned with the business domain. This matters because such systems are more flexible and easier to change. Again, we won't go into every detail; instead, we recommend reading the book of the same name, or visiting the DDD website.
A fourth idea, one that anyone squinting at Data Mesh might think of, is Event Sourcing. We won't go into depth in this post, but it should be fairly obvious that event sourcing is one possible, and fairly good, fit for the input and output ports of a data product in a Data Mesh.
There is a clear convergence of ideas here, under slightly different names and banners. But what does this convergence mean for us? How can we use these ideas together to build better data and software organisations?
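To make the fit between event sourcing and data product ports a little more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the names `Event`, `OrdersDataProduct`, `input_port` and `output_port`, the "orders" domain and the event kinds are hypothetical and do not come from Data Mesh, Team Topologies or any library. It only shows the general shape: the input port appends immutable events to a log, the output port lets consumers read that log from an offset, and any derived view is rebuilt by replaying the events.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class Event:
    """An immutable record of something that happened in the domain."""
    kind: str
    payload: dict[str, Any]


@dataclass
class OrdersDataProduct:
    """Hypothetical data product owned by an 'orders' domain team."""
    # Append-only event log; an in-memory list stands in for real storage.
    log: list[Event] = field(default_factory=list, init=False)

    def input_port(self, event: Event) -> None:
        """Input port: append an incoming event; nothing is updated in place."""
        self.log.append(event)

    def output_port(self, since: int = 0) -> list[Event]:
        """Output port: return a copy of the events from offset `since` onwards."""
        return self.log[since:]

    def net_amount_by_customer(self) -> dict[str, int]:
        """Derived view, rebuilt from scratch by replaying the whole log."""
        totals: dict[str, int] = {}
        for event in self.log:
            customer = event.payload["customer"]
            amount = event.payload["amount"]
            if event.kind == "OrderPlaced":
                totals[customer] = totals.get(customer, 0) + amount
            elif event.kind == "OrderRefunded":
                totals[customer] = totals.get(customer, 0) - amount
        return totals


product = OrdersDataProduct()
product.input_port(Event("OrderPlaced", {"customer": "acme", "amount": 100}))
product.input_port(Event("OrderPlaced", {"customer": "globex", "amount": 40}))
product.input_port(Event("OrderRefunded", {"customer": "acme", "amount": 25}))

print(product.net_amount_by_customer())
print(len(product.output_port(since=2)))
```

Because the log is the source of truth, a downstream consumer could build its own view from the output port without the owning team having to anticipate it. A real implementation would need durable storage, schemas and access control, none of which this sketch attempts.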
Convergence of Data Engineering & Software Engineering
The first obvious conclusion we can draw is that the convergence of these ideas is a sign of the convergence of Data Engineering and Software Engineering. What works in one field clearly works in the other. That we are reaching the same conclusions over time, backed by evidence, and only labelling them slightly differently, should surprise no one.
The separation between Data Engineering and Software Engineering is tenuous at best to begin with: the principles of backend and distributed-systems software engineering should be quite similar to those of data engineering. The separation mostly comes from the divergence in problem domains, and from the fact that Data Engineering has frequently drawn people with an analytics or data science background rather than a software engineering background. This is changing, however, as the field matures and there is more cross-pollination in both ideas and people's backgrounds.
It is our firm belief that effective domain teams will need to draw on skills from software, ops and data alike. Sometimes these skills will be provided by different people; sometimes the same individuals will have multiple skills to varying degrees of depth. The important thing is that the skills are present in the team, so that it does not have to depend heavily on other teams for them.
Convergence of Operational & Analytical infrastructure and data
One issue in organisations with traditional central data teams is that those teams frequently run an entirely different infrastructure stack from the rest of the organisation. This is a problem because the central data team cannot leverage the same tools and infrastructure as the rest of the organisation, and vice versa. The result is a lot of friction and duplicated effort.
The merging of operational and analytical expertise into domain teams is likely to lead to a convergence of infrastructure and data stacks as well. Neither stack will be eradicated completely; rather, both will become better integrated and more interoperable, while the underlying raw infrastructure (storage, compute) converges more completely. The last thing the world needs is yet another compute and cluster orchestrator.
Implications: multi-functional teams, multi-functional people
The obvious implication has already been stated: multi-functional teams. The second implication is that an organisation that embraces this convergence, and more data-driven decision-making, needs multi-functional people too. We need to help people develop skills in multiple areas. They do not need deep expertise in all of them; one is quite enough. But we need managers, product people and software engineers to be data literate, and we need analysts and data engineers to be software engineering literate. The point is not that everyone needs to know everything, just that they know enough about their colleagues' areas of expertise to communicate effectively across specialisations, and to understand the consequences of their decisions for their co-workers and other parts of the organisation.
Conclusion
Domain-Driven Design, Team Topologies, Data Mesh and DevOps are complementary concepts with areas of overlap. Anyone looking to implement Data Mesh would be foolish not to look at the other concepts first, and see how they can inform the implementation. Team Topologies in particular has many useful insights and tools that will help shape the organisation and communication modes necessary for the success of a Data Mesh initiative.
The fact that these ideas overlap, and are converging on the same conclusions at about the same point in time, is not a fluke: let's view it as independent validation of the ideas, and a sign that we are on the right track.