A curated list of database news from authoritative sources

July 23, 2024

July 22, 2024

Optimizing aggregation in the Vitess query planner

The Vitess query planner takes multiple passes over a query plan to optimize it as much as possible before execution. A recent tricky bug report led to an improvement in how the optimizer functions.

An Interesting Optimization

Introduction: I recently encountered an intriguing bug. A user reported that their query was causing vtgate to fetch a large amount of data, sometimes resulting in an Out Of Memory (OOM) error. For a deeper understanding of grouping and aggregations on Vitess, I recommend reading this prior blog post. The Query: The problematic query was: select sum(user.type) from user join user_extra on user.team_id = user_extra.id group by user_extra.id order by user_extra.id; The planner was unable to delegate aggregation to MySQL, leading to the fetching of a significant amount of data for aggregation.
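To make the cost difference concrete, here is a minimal Python sketch of the general idea of aggregating before a join instead of after it. It is not Vitess code and does not describe how the Vitess planner is implemented; the function names and the in-memory row format are hypothetical, chosen only to mirror the columns in the query above.

```python
from collections import defaultdict


def join_then_aggregate(users, user_extras):
    """Join every matching row pair first, then sum per user_extra.id.

    This mirrors the expensive shape: every joined row has to be
    materialized before any aggregation happens.
    """
    totals = defaultdict(int)
    for u in users:
        for e in user_extras:
            if u["team_id"] == e["id"]:
                totals[e["id"]] += u["type"]
    return dict(sorted(totals.items()))


def aggregate_then_join(users, user_extras):
    """Pre-aggregate each side, then join the much smaller results.

    Hypothetical illustration: one partial sum per team_id and one row
    count per user_extra.id are enough to reproduce the same totals.
    """
    partial_sums = defaultdict(int)
    for u in users:
        partial_sums[u["team_id"]] += u["type"]

    extra_counts = defaultdict(int)
    for e in user_extras:
        extra_counts[e["id"]] += 1

    # Each user row pairs with every user_extra row sharing its key, so the
    # partial sum is multiplied by the number of matching user_extra rows.
    return {
        key: partial_sums[key] * extra_counts[key]
        for key in sorted(extra_counts)
        if key in partial_sums
    }


users = [
    {"team_id": 1, "type": 2},
    {"team_id": 1, "type": 3},
    {"team_id": 2, "type": 5},
    {"team_id": 3, "type": 7},
]
user_extras = [{"id": 1}, {"id": 2}, {"id": 4}]

print(join_then_aggregate(users, user_extras))
print(aggregate_then_join(users, user_extras))
```

Both functions return the same totals, but the second one only ever combines one value per key from each side, which is the kind of saving that comes from letting the database aggregate before rows are fetched.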

July 19, 2024

July 16, 2024

Why German Strings are Everywhere

German Strings

Strings are conceptually very simple: a string is essentially just a sequence of characters, right? Why, then, does every programming language have its own slightly different string implementation? It turns out that there is a lot more to a string than “just a sequence of characters”.

We’re no different and built our own custom string type that is highly optimized for data processing. Even though we didn’t expect it when we first wrote about it in our inaugural Umbra research paper, a lot of new systems adopted our format. It is now implemented in DuckDB, Apache Arrow, Polars, and Facebook Velox.

July 11, 2024

Supabase Security Suite

Learn how to use range columns in Postgres to simplify time-based queries and add constraints to prevent overlaps.

July 10, 2024

Introducing Rate Limiting for Tinybird APIs

Today, we introduce Rate Limiting for Tinybird API Endpoints. With this new feature, you can limit how often your users can fetch Tinybird APIs on a per-endpoint or per-user basis.

Dealing with large tables

Large databases often have a small number of very large tables that make scaling difficult. How can you scale with these while keeping your database performant? This article covers three techniques.

July 08, 2024