Joe Wood

Quantitative Software Architecture – From FinTech to FinOps

There’s a recurring debate in the tech community comparing the advantages of Microservice Architectures with a Monolithic approach to software architecture. Taken to the extreme, a microservice approach would have a separate team for every module, each responsible for its own technology, scalability, agility, and operational stability. The other extreme would be a single large system with a single backend database, everyone working in one large code repository. Is this really a one-dimensional problem, trading complexity against flexibility? Or is there another option?

Microservices Solve for Organizational Complexity

In Andrei Taranchenko’s post Death by a Thousand Microservices, he writes:

Trying to shove a distributed topology into your company’s structure is a noble effort, but it almost always backfires. It’s a common approach to break up a problem into smaller pieces and then solve those one by one. So, the thinking goes, if you break up one service into multiple ones, everything becomes easier.

Microservices decompose software architecture, just as functions decompose a monolithic program. With these services comes the data they manage, living on separate islands behind each service’s API wall. If you want to join that data together, you need to coordinate across your service teams. Imagine a customers service, an orders service and a products service, each owning its own data: even a simple report like the top customers by orders placed now spans team boundaries.

What if the teams are too busy to add new service calls for every reporting use-case? One approach to this join problem is to query the smallest dataset first, then query the next largest once for each entity, as sketched below. That approach kind of works, but it’s a lot of single-entity querying, and it’s far from an efficient design. We’re not really using the power of relational database engines at all. We’ve traded organizational friction for poorer performance and much more complexity.
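To make that cost concrete, here’s a minimal Python sketch of the entity-by-entity approach. The two service endpoints and their response shapes are hypothetical, not any real API:

```python
# Cross-service "join": the customers and orders services each own their
# data, so a "top customers by orders placed" report has to stitch
# results together over the network.
import requests

CUSTOMERS_API = "https://customers.internal/api"  # hypothetical endpoints
ORDERS_API = "https://orders.internal/api"

def top_customers_by_orders(limit: int = 10) -> list[tuple[str, int]]:
    # 1. Query the smallest dataset first: every customer.
    customers = requests.get(f"{CUSTOMERS_API}/customers").json()

    # 2. Then one call per entity against the larger dataset --
    #    the classic N+1 pattern that a database join would avoid.
    counts = []
    for customer in customers:
        orders = requests.get(
            f"{ORDERS_API}/orders",
            params={"customer_id": customer["id"]},
        ).json()
        counts.append((customer["name"], len(orders)))

    return sorted(counts, key=lambda pair: pair[1], reverse=True)[:limit]
```

Every customer costs an extra network round-trip, and the aggregation happens in application code rather than in a database engine.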

The Monolith Solves for Small-Scale Simplicity

So maybe the Monolith isn’t so bad. One big database would turn the above example into a simple join and aggregate query – no problem! Relational databases are built for exactly this kind of relationship-heavy workload. But what about rogue queries? A missing join clause can wreak havoc on database performance. What about technology choices for backend, languages, and frontend? We’re back to the same old issues that made microservices so appealing.
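For comparison, here’s the same report inside the monolith’s single database, sketched against an in-memory SQLite database (the schema and rows are illustrative only):

```python
# One join-and-aggregate query does all the work the service calls did.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         customer_id INTEGER REFERENCES customers(id));
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO orders (customer_id) VALUES (1), (1), (2);
""")

top_customers = db.execute("""
    SELECT c.name, COUNT(o.id) AS order_count
    FROM customers c
    JOIN orders o ON o.customer_id = c.id  -- the condition a rogue
    GROUP BY c.name                        -- query might forget
    ORDER BY order_count DESC
    LIMIT 10
""").fetchall()

print(top_customers)  # [('Acme', 2), ('Globex', 1)]
```

The database engine does the joining and aggregation for us, but the comment above hints at the rogue-query risk: forget the join condition and the same engine will happily compute a cross product.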

A monolith is fine at solving the problems facing start-ups, but with growth it quickly becomes a hindrance to scaling, in both performance and organizational size. For teams that are distributed geographically, or working remotely, that ceiling could be hit more quickly than before.

How the Data Mesh Solves the Data Dichotomy

There’s an underlying challenge here: data likes to be co-located to be queried efficiently, but functionality and responsibility are better separated and encapsulated in order to scale. Ben Stopford from Confluent calls this the Data Dichotomy:

The underlying issue is that data and services don’t sing too sweetly together. On one side, encapsulation encourages us to hide data; decoupling services from one another so they can continue to change and grow. … But on the other side, we need the freedom to slice and dice shared data like any other dataset.

The only way to resolve this dichotomy is to establish a data architecture. Data can reside locally in different services or domains across multiple systems, as long as each dataset has a clear source of truth and a governance contract. Zhamak Dehghani coined the term Data Mesh for this approach. Her series of articles at Thoughtworks explains how a distributed, domain-driven approach can bring the concepts of data lakes to autonomous teams.

Put simply: copying data between teams is OK if the data is read-only, kept up to date, and security policies are respected.
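As an illustration of what those rules might look like once written down, here’s a hypothetical contract record in Python. The field names are invented for this sketch, not taken from any particular governance tool:

```python
# A dataset's "governance contract", purely illustrative: each shared copy
# declares who owns the source of truth, how fresh the copy must be kept,
# and who is allowed to read it.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class DataContract:
    dataset: str                   # e.g. "orders"
    owning_domain: str             # the single source of truth
    read_only_for_consumers: bool  # copies must never be written to
    max_staleness_seconds: int     # freshness the owner commits to
    allowed_consumers: list[str] = field(default_factory=list)

orders_contract = DataContract(
    dataset="orders",
    owning_domain="orders-team",
    read_only_for_consumers=True,
    max_staleness_seconds=300,     # kept fresh within five minutes
    allowed_consumers=["customers-team", "products-team"],
)
```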

Consider the above example – what if the customer team had a copy of the Orders data, so that they could calculate the top customers by orders placed? What if the products team had a copy of Customer and Order data, so that they could report on which products were selling better in each market? This solves our data dichotomy (data is now local), and it solves our scaling issue (separate autonomous teams and no monolithic design).
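Here’s a minimal sketch of the mechanics, assuming the orders domain publishes change events (the event shape and names are hypothetical). The customer team applies them to a local, read-only copy and can then run its report as a plain local query:

```python
# Maintain a local replica of the orders dataset from a change feed.
import sqlite3

local_db = sqlite3.connect(":memory:")
local_db.execute("""
    CREATE TABLE orders_copy (
        id INTEGER PRIMARY KEY,  -- the key from the owning domain
        customer_id INTEGER,
        status TEXT
    )
""")

def apply_order_event(event: dict) -> None:
    """Upsert one change event from the orders domain's feed."""
    local_db.execute(
        """
        INSERT INTO orders_copy (id, customer_id, status)
        VALUES (:id, :customer_id, :status)
        ON CONFLICT(id) DO UPDATE SET
            customer_id = excluded.customer_id,
            status = excluded.status
        """,
        event,
    )

# Events would normally arrive from a log or stream (e.g. Kafka);
# here we replay a couple by hand.
for event in [
    {"id": 1, "customer_id": 1, "status": "placed"},
    {"id": 2, "customer_id": 1, "status": "shipped"},
]:
    apply_order_event(event)

# The reporting query is now an ordinary local query - no service calls.
print(local_db.execute(
    "SELECT customer_id, COUNT(*) FROM orders_copy GROUP BY customer_id"
).fetchall())  # [(1, 2)]
```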

Growing from a monolith to a data mesh is far easier than incrementally incorporating microservices. Data that originated in the monolith’s one big database can gradually be replaced with copies fed from remote domains owned by other teams. Elaborate SQL queries joining multiple datasets together can remain unchanged; they now operate on copies of the data, kept fresh and in-policy under a governance framework.

So, we’re left with a Data Engineering problem, where managed copies of data propagate through the organization. Tools for data governance, lineage, schema evolution, access control and data quality are rapidly maturing. Like Confluent above, other major data platforms are recognizing this pattern and offering their own support. There’s growing support in the Open Source community too, building on established tools like dbt with projects like Airbyte, Meltano and Mage.

Organizational and performance scaling can often be solved with a data architecture approach, especially for data-oriented companies. It’s easy to fall into the trap of thinking that software architecture is a services problem, when it’s the data we should have been thinking about all along.

Switching away from WordPress and starting anew here, using write.as. Tooting at joewood@writing.exchange