The Database Doctor
Musing about Databases

Latest Posts

TPC series - TPC-H Query 5 - Transitive Closure and Join Order Dependencies

Welcome back to the TPC-H analysis. If this is your first time, I highly recommend that you visit the previous blogs in the series first. They're here (and I look forward to seeing you in a...

TPC series - TPC-H Query 4 - Semi Join and Uniqueness

Today we are looking at Q04, which on the surface is similar to Q17. Like Q17, Q04 has a correlated subquery that can be de-correlated using a join. But sometimes, a regular INNER JOIN is...

SQL Deficiency Syndrome: Born without Joins

There are some in our industry who can read a schema and instantly see the joins. They dream in sets, write SQL queries routinely and reach for window functions without hesitation. And a...

TPC-H series - TPC-H Query 3 - Join Ordering and Heap Sorting

I want to teach you an important skill that will serve you well as a database specialist. One blog entry is not going to be enough, but here is my goal: When you look at an SQL query in the you...

Modern CMake for C++ Projects

A useful guide to help you get started with CMake

TPC series - TPC-H Query 2 and 17 - De-correlation

The great promise databases make to programmers is: "Tell me what you want and I will figure out the fastest way to do it." A database is a computer science engine — it knows and...

Joins are NOT Expensive! - Raw Reading

When talking about Data Lakes and how people access them - we must address some of the misconceptions that made them popular in the first place. One of the largest misconceptions is are I...

Introducing the TPC series - TPC-H Query 1: Column Storage and Local Aggregation

After the wonderful feedback on the previous blog about Iceberg - it is now time to switch gears. Databases are more than row storage engines. They are algorithm machines, helping that...

Iceberg, The Right Idea - The Wrong Spec - Part 2 of 2: The Spec

Let us finally look at what is so wrong with the Iceberg spec and why this simply isn't a serious attempt at solving the metadata problem of large Data Lakes. In the first part of this I took...

Iceberg, The Right Idea - The Wrong Spec - Part 1 of 2: History

Iceberg: The great unifying vision finally allowing us to escape the vendor lock-in of our database engines. One table and metadata format to find them ... And in the darkness bind I the...