Column vs Row database
By Garren Smith
Database Column Store Row Store
Rough Notes for now…
So, broadly speaking, row-oriented databases are more efficient for queries where you want to read all the columns from a single row.
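- Column-oriented databases can accommodate differently shaped data and dynamic columns, since each column is stored separately
- Fast for aggregations because a query can scan a whole column without reading the rest of each row
- Often doesn't need indexes, because the way columns are stored means they basically act as indexes already
- Great for tables with lots of columns, since a query only reads the columns it references
- More performant with large bulk updates than with many small single-row writes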
- Row-oriented databases store all of a row's values together, so they are more efficient for queries that read all the columns from a single row
- They use indexes to speed up lookups, which is similar in effect to the column format, but with the added overhead of each index entry pointing back to the primary data
- Better with smaller tables and frequent single-row updates
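- Cassandra is often described as column-oriented, though strictly it is a wide-column store rather than a columnar analytics database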
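
To make the trade-off in the notes above concrete, here is a minimal Python sketch of the two layouts. It is only an in-memory illustration, not how any real database stores data on disk. The table, its column names and the helper functions are all hypothetical.

```python
# Hypothetical toy table; names and values are made up for illustration.
rows = [
    {"id": 1, "name": "ada", "age": 36, "city": "London"},
    {"id": 2, "name": "bob", "age": 41, "city": "Cape Town"},
    {"id": 3, "name": "cy", "age": 29, "city": "Berlin"},
]

# Row-oriented layout: each record's values sit together.
# Field order is (id, name, age, city).
row_store = [tuple(r.values()) for r in rows]

# Column-oriented layout: each column's values sit together.
column_store = {key: [r[key] for r in rows] for key in rows[0]}


def read_row(store, index):
    """Row store: one lookup returns the whole record."""
    return store[index]


def read_row_from_columns(store, index):
    """Column store: the record is stitched together from every column."""
    return tuple(column[index] for column in store.values())


def average_age_rows(store):
    """Row store: every full record is visited to pull out one field."""
    return sum(record[2] for record in store) / len(store)


def average_age_columns(store):
    """Column store: only the 'age' column is scanned."""
    ages = store["age"]
    return sum(ages) / len(ages)
```

Both layouts give the same answers; the difference is how much data each operation has to touch. `read_row` is a single lookup in the row layout but touches every column in the column layout. The average is the reverse: it scans one list in the column layout but walks every full record in the row layout.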
References
- https://www.honeycomb.io/blog/why-observability-requires-distributed-column-store
- https://www.polarsignals.com/blog/posts/2022/05/04/introducing-arcticdb
- https://help.sap.com/docs/SAP_HANA_PLATFORM/6b94445c94ae495c83a19646e7c3fd56/bd2e9b88bb571014b5b7a628fca2a132.html
- https://www.scattered-thoughts.net/writing/a-shallow-survey-of-olap-and-htap-query-engines/
- https://www.tinybird.co/blog-posts/when-to-use-columnar-database (this is good)
- https://news.ycombinator.com/item?id=7846779