The Database Doctor
Musing about Databases

Posts

Cover image for Why are Databases so Hard to Make? Part 4 - Digging up Graves
Why are Databases so Hard to Make? Part 4 - Digging up Graves

In my last post about high speed DML, I talked how it is possible to modify tables at the kind of speeds that a modern SSD can deliver. I sketched an outline of an algorithm that can easily us...

Cover image for Why are Databases so Hard to Make? Part 3 - High Speed DML
Why are Databases so Hard to Make? Part 3 - High Speed DML

After a brief intermezzo about testing (read about my thoughts here: Testing is Hard and we often use the wrong Incentives) - it is time to continue our journey together to where we will A...

Cover image for Testing is Hard  and we often use the wrong Incentives
Testing is Hard and we often use the wrong Incentives

I have been spending a lot of time thinking about testing and reviewing testing lately. At a superficial level - testing looks simple: Write test matrix, code tests, run tests, learn we...

Cover image for Why are Databases so Hard to Make? Part 2 - Logging to Disk
Why are Databases so Hard to Make? Part 2 - Logging to Disk

Transaction logs. Why are they so important and why are they so hard to make?

Cover image for Why are Databases so Hard to Make? Part 1 - CPU usage
Why are Databases so Hard to Make? Part 1 - CPU usage

In our previous blogs, we have visited the idea that "databases are just loops". At this point, my dear readers may rightfully ask: "if those database are indeed just -...

Cover image for Databases are Just Loops - Part 3: Row and Batch execution
Databases are Just Loops - Part 3: Row and Batch execution

Our database journey makes a brief stop. We need to appreciate an important design decision every database must make: Should I use row or batch execution? Depending on the database - or...

Cover image for Databases are just Loops - Part 2: GROUP BY
Databases are just Loops - Part 2: GROUP BY

In my previous post - I introduced the idea that you can think of database queries as a series of loops. Let me take this ideas even further - introducing more complex database concepts in...

Cover image for Databases are just Loops - Part 1: Joins
Databases are just Loops - Part 1: Joins

I decided to write this post in response to my recent discussions with Matt Martin. Matt and I have been sparring lately over software performance. During one of these discussions, the...

Cover image for Finding things in Data Models with Property Bags
Finding things in Data Models with Property Bags

When designing database schemas, you may find yourself needing to create a property bag table that is in a 1-n relationship with an entity table. It isn't always viable to just add more a...

Cover image for LAST_VALUE and FIRST_VALUE Window Aggregate tricks
LAST_VALUE and FIRST_VALUE Window Aggregate tricks

When writing code for analytical databases, there comes a point where you learn that window aggregates often allow you to express complex joins in new ways. Sometimes, they even allow...

Cover image for Row or Column based Storage?
Row or Column based Storage?

These days, columnar storage formats are getting a lot more attention in relational databases. Parquet, with its superior compression, is quickly taking over from CSV formats. SAP a...

Cover image for What is a good database key?
What is a good database key?

A central value add of modeling databases for analytical purposes and speed is that you can restore the sanity, often lost in the source system, that comes from using good keys in each it...

Cover image for What is Cost in a Query Planner?
What is Cost in a Query Planner?

If you ever looked at query plan from a SQL databases - you are likely to have come into something called "cost". You may even have heard that most advanced database use based a...

Cover image for Join Elimination in Query Planners
Join Elimination in Query Planners

Most relational databases are both by human SQL query writers and tools that automatically generate queries. Query tools frequently add joins into the queries, even when those joins...

Synchronisation in .NET – Part 4: Partitioned Locks

In this final instalment of the synchronisation series, we will look at fully scalable solutions to the problem first stated in Part 1: adding monitoring to a high speed app where that 4...

Synchronisation in .NET – Part 3: Spinlocks and Interlocks/Atomics

In the previous instalments (Part 1 and Part 2) of this series, we have drawn some conclusions about both .NET itself and CPU architectures. Here is what we know so far: In this blog I will...

Synchronisation in .NET – Part 2: Racy Data Structures, Padding and False Sharing

In the previous blog post we saw how the lock() statement in .NET scales very poorly when there is a contention on a data structure. It was clear that a performance logging framework that...

Synchronisation in .NET– Part 1: lock(), Dictionaries and Arrays

As part of our tuning efforts at Livedrive, I ran into a deceptively simple problem that beautifully illustrates some of the scale principles I have been teaching to the SQL Server for I...