By Sam Theisens on 20 May 2023
In March 2023, Amazon published a blog post detailing how they had managed to reduce the cost of their audio-video monitoring service by 90%. The key to this reduction was migrating from a distributed, microservice architecture to a monolith. The blog post went viral, prompting some software industry celebrities to question the entire concept of microservices.
So, does this mean microservices are fundamentally flawed? Should we all migrate back to monoliths? No, and definitely not, I would say. Instead, my takeaways from this article are:
In the next section, I will share one of our own experiences, not entirely different from the Amazon example.
Considering that all the functionality in the Amazon case belongs to the same domain, it arguably does not even serve as a case against improper use of microservices, but rather as a case against misuse of distributed computing.
Now let's look at an example of misused distributed computing at Vandebron.
For utility companies, accurately predicting both electricity consumption and production is crucial. Failing to do so can result in blackouts or overproduction, both of which are very costly. Vandebron is a unique utility company in that the electricity our customers consume is produced by a large number of relatively small-scale producers, who generate electricity using wind turbines or solar panels. The sheer number and the weather-dependent nature of these producers make it very hard to predict electricity generation accurately.
To make these predictions, we use a machine learning model that is trained on historical production data and forecasts from the national weather institute. As you can imagine, this is a computationally intensive task involving large amounts of data. Fortunately, we have tooling in place that allows us to distribute computations across a cluster of machines if a task is too large for a single machine to handle.
However, here's the catch: the fact that we can distribute computations does not mean that we should. Initially, it seemed that we couldn't analyze the weather data quickly enough for the estimation of our production to still be a prediction rather than a postdiction. We decided to distribute the computation of the weather data over a cluster of machines. This worked, but it made our software more complex and Jeff Bezos even richer than he already was.
Upon closer inspection, we found an extreme inefficiency in our code. It turned out that we were repeatedly reading the entire weather dataset into memory, for every single "pixel". After removing this performance bug, the entire analysis could easily be done on a single machine.
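The bug followed a common pattern: an expensive I/O operation hidden inside a per-item loop. The sketch below is purely illustrative, with invented names and data shapes rather than our actual code, but it shows the shape of the inefficiency and the fix.

```python
def load_weather_dataset():
    """Stand-in for reading the full weather dataset from disk (expensive)."""
    return [[float(x + 100 * y) for x in range(100)] for y in range(100)]

def analyze_slow(pixels):
    """The bug: reloads the entire dataset once per pixel."""
    results = []
    for (x, y) in pixels:
        data = load_weather_dataset()   # full dataset read inside the loop
        results.append(data[y][x] * 2)  # some per-pixel computation
    return results

def analyze_fast(pixels):
    """The fix: read the dataset once and reuse it for every pixel."""
    data = load_weather_dataset()       # single read, outside the loop
    return [data[y][x] * 2 for (x, y) in pixels]
```

The fix turns an I/O cost proportional to pixels × dataset size into one proportional to the dataset size alone, which is why the whole analysis suddenly fit on one machine.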
So if microservices aren't about performance, what are they about? If I had to sum it up in one sentence, it would be:
Microservices are a way to scale your organization
There is a lot of detail hiding in that sentence, which I can't unpack within the scope of this article. If you're interested in what microservices have meant for us, I would recommend watching the presentation below.
At Vandebron, we jumped onto the "microservice bandwagon" circa 2019. This wasn't a decision made on a whim. We had seen a few industry trends come and go, so we first read up and did our own analysis. We found that the concept of microservices held promise, but also knew that they would come at a cost.
These are some of the dangers we identified and what we did to mitigate them.
| Danger | Mitigation |
| --- | --- |
| A stagnating architecture | Compile- and unit-test-time detection of breaking changes |
| Complicated and error-prone deployments | Modular CI/CD pipelines |
| Team siloization | A single repository (AKA monorepo) for all microservices, and a discussion platform for cross-domain and cross-team concerns |
| Duplication of code | Shared in-house libraries for common functionality |
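To make the first mitigation concrete: in a monorepo, services can share a typed contract module, so a breaking change to a message schema fails a consumer's unit tests at CI time instead of surfacing at runtime. The sketch below is a hypothetical illustration, not our actual code; all names are invented.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ProductionForecast:
    """Hypothetical message schema shared between two services in the monorepo."""
    site_id: str
    kilowatt_hours: float

def serialize(forecast: ProductionForecast) -> dict:
    """Producer-side serialization of the shared contract."""
    return {"site_id": forecast.site_id, "kilowatt_hours": forecast.kilowatt_hours}

def test_contract_has_expected_fields():
    """Consumer-side test pinning the fields it depends on. If the producer
    team renames or removes a field, this fails in CI, not in production."""
    payload = serialize(ProductionForecast(site_id="windpark-1", kilowatt_hours=12.5))
    assert set(payload) == {"site_id", "kilowatt_hours"}
```

In a statically typed language the compiler catches many of these breaks for free; the unit test adds the same safety net for serialized boundaries that the type checker cannot see.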
The following presentation, given to students of VU University Amsterdam, explains how we implemented some of these mitigations and what we learned from them.