Row or Column based Storage?

These days, columnar storage formats are getting a lot more attention in relational databases. Parquet, with its superior compression, is quickly taking over from CSV formats. SAP Hana, Vertica, Yellowbrick, Databricks, SQL Azure and many others all promote columnar representations as the best option for analytical workloads. It is not hard to see why. But … Read more

What is Cost in a Query Planner?

If you ever looked at query plan from a SQL databases – you are likely to have come into something called “cost”. You may even have heard that most advanced database use “cost based optimisers” But what exactly is cost and how does the database use it? SQL Query optimisation – Introduction The goal of … Read more

Join Elimination in Query Planners

Most relational databases are both by human SQL query writers and tools that automatically generate queries. Query tools frequently add joins into the queries, even when those joins are not needed. This results in unnecessary query slowdowns and increase CPU usage. For example, consider the classic Star Schema model with a fact table in the … Read more