There are some in our industry who can read a schema and instantly see the joins.
They dream in sets, write SQL queries routinely and reach for window functions without hesitation.
And then, there are the others...
For them, the problem isn’t laziness, or stubbornness, or lack of training.
Tragically, they were born this way.
Does breaking systems into smaller parts you can understand individually really make them easier to manage and scale?
Today, we explore the pitfalls of this obsession and draw lessons from nature and other
fields of engineering. As always, this will be a controversial take - if it wasn't, why would
you bother reading this instead of some LLM generated nonsense on LinkedIn?
In my previous post - I introduced the idea that you can think of database queries
as a series of loops. Let me take this ideas even further - introducing more complex database concepts in terms more
familiar to computer programmers.
Databases can do this cool thing called "aggregation". You may think of it like GROUP BY. You will have seen queries
like this:
SELECT x, SUM(v) AS agg_y
FROM a
GROUP BY x
What is going on here? And how can we think of this like a loop?
Aggregate or GROUP BY?
To establish some terminology: You will hear me refer to GROUP BY as "aggregation" in the following passages.
We will call SUM(v) above an "aggregate function" and we will call x a "grouping column". We will refer to the
entire loop as the "aggregation loop"
When designing database schemas, you may find yourself needing to create a property bag table that is in a 1-n
relationship with an entity table. It isn't always viable to just add more columns to a table because you may not know
all the properties you want in advance. The problem with these property bags is that they can be hard to query.
In this blog entry, we will explore the property bag data modelling use case and provide some tricks that will make it
very simple to query those bags.