The Database Doctor
Musing about Databases

Tag: performance

TPC-H series - TPC-H Query 3 - Join Ordering and Heap Sorting

I want to teach you an important skill that will serve you well as a database specialist. One blog entry is not going to be enough, but here is my goal: When you look at an SQL query in the you...

TPC series - TPC-H Query 2 and 17 - De-correlation

The great promise databases make to programmers is: "Tell me what you want and I will figure out the fastest way to do it." A database is a computer science engine — it knows and...

Joins are NOT Expensive! - Part 1: Raw Reading

When talking about Data Lakes and how people access them - we must address some of the misconceptions that made them popular in the first place. One of the largest misconceptions is are I...

Introducing the TPC series - TPC-H Query 1: Column Storage and Local Aggregation

After the wonderful feedback on the previous blog about Iceberg - it is now time to switch gears. Databases are more than row storage engines. They are algorithm machines, helping that...

Testing is Hard and we often use the wrong Incentives

I have been spending a lot of time thinking about testing and reviewing testing lately. At a superficial level - testing looks simple: Write test matrix, code tests, run tests, learn we...

Why are Databases so Hard to Make? Part 2 - Logging to Disk

Transaction logs. Why are they so important and why are they so hard to make?

Why are Databases so Hard to Make? Part 1 - CPU usage

In our previous blogs, we have visited the idea that "databases are just loops". At this point, my dear readers may rightfully ask: "if those databases are indeed just -...

Databases are Just Loops - Part 3: Row and Batch execution

Our database journey makes a brief stop. We need to appreciate an important design decision every database must make: Should I use row or batch execution? Depending on the database - or...

Databases are just Loops - Part 2: GROUP BY

In my previous post - I introduced the idea that you can think of database queries as a series of loops. Let me take this idea even further - introducing more complex database concepts in...