Database Doctor
Writing on databases, performance, and engineering.

Posts with tag: architecture

Database Services and Disaggregation

Iceberg and Parquet, for all their flaws, have shown us a fascinating path forward for the database industry: Disaggregation. Apache Arrow is quickly moving us in the direction of common interchange formats inside the database and on the wire.

It's now possible to imagine a future where databases aren't single systems from one vendor, but made by combining multiple components, from different contributors, into a single coherent system.

This idea isn't new, and I claim no credit for observing it. But I'd like to share my perspective on it - since that's what I do here.

Read More...

Joins are NOT Expensive! - Raw Reading

When talking about Data Lakes and how people access them - we must address some of the misconceptions that made them popular in the first place.

One of the largest misconceptions is this: "Joins are expensive". It is apparently believed (as I discovered from discussions on LinkedIn) that it costs less CPU to turn your data model into a flat table and serve that to users than it does to join every time you access the table.

It is certainly true that object stores allow nearly infinite I/O (although at a very high CPU cost to handle the HTTP overhead) and nearly infinite storage. But, is there really such a thing as "sacrificing disk space to save the CPU cost of joins?"

Today, let us put this to the test.

Read More...

Greed vs Bravery Based Engineering

It is difficult to find words that accurately describe the cruelty, selfishness and outright evil on display from the White House these days. The guiding principle of Gordon Gekko: "Greed.. is GOOD", has finally reached a crescendo. As long as you are greedy - you can be above the law, and the sickness of our society is reflected in the mentally deranged leaders we elect.

I think we must accept that it is possible to have a coherent world interpretation through the lens of "Might makes Right". It's the way sociopaths view others, and it is the system that dictators will have you accept. In such a world view, "trust" simply does not exist - except in the short-term wielding of power to create fear. In a "might makes right" world - every transaction is a zero-sum game. To have a winner, there must be a loser.

You can choose to submit yourself to might - or you can fight back. Today, I want to talk about a few bugs in the "might is right" based world view that we can exploit as engineers. Hopefully, we can grapple together with how we can collectively dignify our species again.

Read More...

Coupling, Complexity, and Coding

Why is the IT industry obsessed with decoupling?

Does breaking systems into smaller parts you can understand individually really make them easier to manage and scale?

Today, we explore the pitfalls of this obsession and draw lessons from nature and other fields of engineering. As always, this will be a controversial take - if it wasn't, why would you bother reading this instead of some LLM-generated nonsense on LinkedIn?

Read More...

Making Decent Python Libraries - Part 1

Python has now infected computer science departments and data analysts across the planet. The resulting ecosystem is a mess of libraries that are often poorly designed or outright harmful.

Recently, I have had to write a few libraries of my own and this has taught me a lot about what makes a good Python library. In this series of blog entries, I will share this knowledge and tell you about the lessons I learned so you don't have to suffer through them.

Read More...