Database Doctor
Writing on databases, performance, and engineering.

Posts with tag: data-lakes


Joins are NOT Expensive! - Raw Reading

When talking about Data Lakes and how people access them, we must address some of the misconceptions that made them popular in the first place.

One of the largest misconceptions is this: "Joins are expensive". It is apparently believed (as I discovered from discussions on LinkedIn) that it costs less CPU to turn your data model into a flat table and serve that to users than it does to join every time you access the data.

It is certainly true that object stores allow nearly infinite I/O (although at a very high CPU cost to handle the HTTP overhead) and nearly infinite storage. But is there really such a thing as "sacrificing disk space to save the CPU cost of joins"?

Today, let us put this to the test.
