Today, we are going to talk about estimating filter selectivity and how to fail at it.
Two queries are similar enough that we will analyse them together: Q12 and Q14.
Like listening to a song from the '80s, it is good to be back to TPC-H.
I have been busy doing work with Floe.
If you are a regular reader of this blog, I think you will like this article I wrote recently:
In my previous post - I introduced the idea that you can think of database queries
as a series of loops. Let me take this ideas even further - introducing more complex database concepts in terms more
familiar to computer programmers.
Databases can do this cool thing called "aggregation". You may think of it like GROUP BY. You will have seen queries
like this:
SELECT x, SUM(v) AS agg_y
FROM a
GROUP BY x
What is going on here? And how can we think of this like a loop?
Aggregate or GROUP BY?
To establish some terminology: You will hear me refer to GROUP BY as "aggregation" in the following passages.
We will call SUM(v) above an "aggregate function" and we will call x a "grouping column". We will refer to the
entire loop as the "aggregation loop"
When designing database schemas, you may find yourself needing to create a property bag table that is in a 1-n
relationship with an entity table. It isn't always viable to just add more columns to a table because you may not know
all the properties you want in advance. The problem with these property bags is that they can be hard to query.
In this blog entry, we will explore the property bag data modelling use case and provide some tricks that will make it
very simple to query those bags.