a curated list of database news from authoritative sources

January 16, 2022

Are you sure you want to use MMAP in your database management system?

Many database management systems carefully manage disk I/O operations and explicitly cache pages in main memory. Operating systems implement a page cache to speed up recurring disk accesses as well, and even allow transparent access to disk files through the mmap system call. Why, then, do most database systems implement their own I/O handling and caching components when the OS provides these features through mmap? Andrew Pavlo, Andrew Crotty, and I tried to answer this question in a CIDR 2022 paper. It is quite a contentious question, as the Hacker News discussion of the paper shows.

The paper argues that using mmap in database systems is almost always a bad idea. To implement transactions and crash recovery on top of mmap, the DBMS has to write any change out-of-place because there is no way to prevent the write-back of a particular page. This makes it impossible to implement classical ARIES-style transactions. Furthermore, data access through mmap can take anywhere from a handful of nanoseconds (if the data is in the CPU cache) to milliseconds (if it is on disk). If a page is not cached, it will be read through a synchronous page fault, and there is no interface for asynchronous I/O. I/O errors, meanwhile, are communicated through signals rather than a local error code. These problems are caused by mmap's interface, which is too high-level and does not give the database system enough control.
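
To make the interface contrast concrete, here is a minimal sketch (not from the paper; it assumes the libc crate and a pre-existing, non-empty data.bin): with mmap, a plain memory load can hide a synchronous disk read, while explicit I/O keeps the read, its buffer, and its errors under the DBMS's control.

use std::fs::File;
use std::os::unix::io::AsRawFd;

fn main() -> std::io::Result<()> {
    let file = File::open("data.bin")?;
    let len = file.metadata()?.len() as usize;

    // Transparent access: the kernel maps the file into our address
    // space. The dereference below may trigger a synchronous page
    // fault (a hidden disk read), and I/O errors arrive as SIGBUS.
    let ptr = unsafe {
        libc::mmap(
            std::ptr::null_mut(),
            len,
            libc::PROT_READ,
            libc::MAP_SHARED,
            file.as_raw_fd(),
            0,
        )
    };
    assert_ne!(ptr, libc::MAP_FAILED);
    let first = unsafe { *(ptr as *const u8) }; // may block on I/O

    // Explicit access: the DBMS decides when the read happens and
    // into which buffer, and errors come back as a return code.
    let mut buf = vec![0u8; 4096];
    let n = unsafe {
        libc::pread(file.as_raw_fd(), buf.as_mut_ptr().cast(), buf.len(), 0)
    };
    println!("first byte: {}, pread returned: {}", first, n);

    unsafe { libc::munmap(ptr, len) };
    Ok(())
}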

In addition to discussing these interface problems, the paper also shows that Linux's page cache and mmap implementation cannot achieve the bandwidth of modern storage devices. A single PCIe 4.0 NVMe SSD can read over 6 GB/s, and upcoming PCIe 5.0 SSDs will almost double this number. To achieve this performance, one needs to schedule hundreds or even thousands (if one has multiple SSDs) of concurrent I/O requests. Doing this synchronously by starting hundreds of threads does not work well. Other kernel-level performance issues are single-threaded page eviction and TLB shootdowns. Overall, this is an example of OS evolution lagging behind hardware evolution.

The OS has one big advantage over the DBMS, though: it has control over the page table. Once a page is mapped, accessing it is transparent and as fast as ordinary memory. Any manually implemented buffer manager, in contrast, involves some form of indirection, which incurs overhead. Pointer swizzling as implemented in LeanStore and Umbra is a fast alternative, but it is also more difficult to implement than a traditional buffer manager and only supports tree-like data structures. Therefore, an interesting question is whether it would be possible to have an mmap-like interface with more control and better performance. Generally, I believe this kind of research at the boundary between operating systems and database systems should be more common.

 

January 11, 2022

Vitess Schema Tracking

What is Schema Tracking? In a distributed relational database system like Vitess, a central component is responsible for serving queries across multiple shards; in Vitess, that component is VTGate. One of the challenges this component faces is staying aware of the underlying SQL schema, since that awareness facilitates query planning. Table schemas are stored in MySQL’s information_schema, meaning that they live in a VTTablet’s MySQL instance and not in VTGate.

January 05, 2022

The year in books: 11 to recommend in 2021

Last year (2021) I finished 17 books, a five-year low. But that's ok! 4 fiction and 13 non-fiction. Another 30 started but not finished.

Non-fiction

It seems I was pretty focused on business history books and history of tech. The 8 non-fiction books I liked the most:

The rest

Fiction

The 3 fiction books I liked the most:

The rest

2022

This year I'm interested in continuing to find good business books and good books on the history of tech. I'm also getting into more American history to make up for all the years of not paying attention in high school.

I'm continuing to try to find good memoirs and fiction by non-English authors.

December 28, 2021

Writing a minimal Lua implementation with a virtual machine from scratch in Rust

By the end of this guide we'll have a minimal, working implementation of a small part of Lua from scratch. It will be able to run the following program (among others):

function fib(n)
   if n < 2 then
      return n;
   end

   local n1 = fib(n-1);
   local n2 = fib(n-2);
   return n1 + n2;
end

print(fib(30));

This is my second project in Rust, and it's only the third time I've invented an instruction set, so don't take my style as gospel. However, I have found some Rust parsing tutorials overly complex, so I'm hoping you'll find this one simpler.

All source code is available on GitHub.

Entrypoint

Running cargo init will generate the necessary boilerplate. In src/main.rs we'll accept a file name from the command line, perform lexical analysis to retrieve all tokens from the file, perform grammar analysis on the tokens to build a tree structure, compile the tree to a linear set of virtual machine instructions, and finally interpret those instructions.

mod eval;
mod lex;
mod parse;

use std::env;
use std::fs;

fn main() {
    let args: Vec<String> = env::args().collect();
    let contents = fs::read_to_string(&args[1]).expect("Could not read file");

    let raw: Vec<char> = contents.chars().collect();

    let tokens = match lex::lex(&raw) {
        Ok(tokens) => tokens,
        Err(msg) => panic!("{}", msg),
    };

    let ast = match parse::parse(&raw, tokens) {
        Ok(ast) => ast,
        Err(msg) => panic!("{}", msg),
    };

    let pgrm = eval::compile(&raw, ast);

    eval::eval(pgrm);
}
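
Once the remaining modules are written, this entrypoint is enough to run a Lua file end to end. Assuming the fib program above is saved as fib.lua (a file name I'm picking for illustration), running it should print fib(30), which is 832040:

$ cargo run fib.lua
832040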

Easy peasy. Now let's implement lex.

Lexical analysis

Lexical analysis drops whitespace (Lua is not whitespace-sensitive) and chunks all source code characters into their smallest meaningful pieces: commas, numbers, identifiers, keywords, and so on.

In order to have useful error messages, we'll keep track of state in the file with a Location struct that implements increment and debug.

This goes in src/lex.rs.

#[derive(Copy, Clone, Debug)]
pub struct Location {
    col: i32,
    line: i32,
    index: usize,
}

The increment function will update line and column numbers as well as the current index in the file.

impl Location {
    fn increment(&self, newline: bool) -> Location {
        if newline {
            Location {
                index: self.index + 1,
                col: 0,
                line: self.line + 1,
            }
        } else {
            Location {
                index: self.index + 1,
                col: self.col + 1,
                line: self.line,
            }
        }
    }

And the debug function will dump the current line with a textual pointer to the current column, along with a message.

    pub fn debug<S: Into<String>>(&self, raw: &[char], msg: S) -> String {
        let mut line = 0;
        let mut line_str = String::new();
        // Find the whole line of original source
        for c in raw {
            if *c == '\n' {
                line += 1;

                // Done discovering line in question
                if !line_str.is_empty() {
                    break;
                }

                continue;
            }

            if self.line == line {
                line_str.push_str(&c.to_string());
            }
        }

        let space = " ".repeat(self.col as usize);
        format!("{}\n\n{}\n{}^ Near here", msg.into(), line_str, space)
    }
}
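
For example, here is roughly what debug's output looks like for a stray $ character (an illustrative input; the exact shape comes from the format string above):

Unrecognized character while lexing:

local n1 = $fib(n-1);
           ^ Near here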

The smallest individual unit after lexical analysis is a token, which is either a keyword, number, identifier, operator, or syntax. (This implementation is clearly skipping lots of real Lua syntax like strings.)

#[derive(Debug, PartialEq, Eq, Clone)]
pub enum TokenKind {
    Identifier,
    Syntax,
    Keyword,
    Number,
    Operator,
}

#[derive(Debug, Clone)]
pub struct Token {
    pub value: String,
    pub kind: TokenKind,
    pub loc: Location,
}

The top-level lex function will iterate over the file and call a lex helper for each kind of token, returning an array of all tokens on success. In between lexing it will "eat whitespace".

pub fn lex(s: &[char]) -> Result<Vec<Token>, String> {
    let mut loc = Location {
        col: 0,
        index: 0,
        line: 0,
    };
    let size = s.len();
    let mut tokens: Vec<Token> = vec![];

    let lexers = [
        lex_keyword,
        lex_identifier,
        lex_number,
        lex_syntax,
        lex_operator,
    ];
    'outer: while loc.index < size {
        loc = eat_whitespace(s, loc);
        if loc.index == size {
            break;
        }

        for lexer in lexers {
            let res = lexer(s, loc);
            if let Some((t, next_loc)) = res {
                loc = next_loc;
                tokens.push(t);
                continue 'outer;
            }
        }

        return Err(loc.debug(s, "Unrecognized character while lexing:"));
    }

    Ok(tokens)
}

Whitespace

Eating whitespace is just incrementing the location while we see a space, tab, newline, etc.

fn eat_whitespace(raw: &[char], initial_loc: Location) -> Location {
    let mut c = raw[initial_loc.index];
    let mut next_loc = initial_loc;
    while [' ', '\n', '\r', '\t'].contains(&c) {
        next_loc = next_loc.increment(c == '\n');
        if next_loc.index == raw.len() {
            break;
        }
        c = raw[next_loc.index];
    }

    next_loc
}

Numbers

Lexing numbers iterates through the source starting at a position until it stops seeing decimal digits (this implementation only supports integers).

fn lex_number(raw: &[char], initial_loc: Location) -> Option<(Token, Location)> {
    let mut ident = String::new();
    let mut next_loc = initial_loc;
    let mut c = raw[initial_loc.index];
    while c.is_digit(10) {
        ident.push_str(&c.to_string());
        next_loc = next_loc.increment(false);
        c = raw[next_loc.index];
    }

If there are no digits in the string then this is not a number.

    if !ident.is_empty() {
        Some((
            Token {
                value: ident,
                loc: initial_loc,
                kind: TokenKind::Number,
            },
            next_loc,
        ))
    } else {
        None
    }
}

Identifiers

Identifiers are any collection of alphabet characters, numbers, and underscores.

fn lex_identifier(raw: &[char], initial_loc: Location) -> Option<(Token, Location)> {
    let mut ident = String::new();
    let mut next_loc = initial_loc;
    let mut c = raw[initial_loc.index];
    while c.is_alphanumeric() || c == '_' {
        ident.push_str(&c.to_string());
        next_loc = next_loc.increment(false);
        c = raw[next_loc.index];
    }

But they cannot start with a number.

    // First character must not be a digit
    if !ident.is_empty() && !ident.chars().next().unwrap().is_digit(10) {
        Some((
            Token {
                value: ident,
                loc: initial_loc,
                kind: TokenKind::Identifier,
            },
            next_loc,
        ))
    } else {
        None
    }
}

Keywords

Keywords are alphabetical like identifiers, but they cannot be reused by the user as variable names.

fn lex_keyword(raw: &[char], initial_loc: Location) -> Option<(Token, Location)> {
    let syntax = ["function", "end", "if", "then", "local", "return"];

    let mut next_loc = initial_loc;
    let mut value = String::new();
    'outer: for possible_syntax in syntax {
        let mut c = raw[initial_loc.index];
        next_loc = initial_loc;
        while c.is_alphanumeric() || c == '_' {
            value.push_str(&c.to_string());
            next_loc = next_loc.increment(false);
            c = raw[next_loc.index];

            let n = next_loc.index - initial_loc.index;
            // Guard the slice: the candidate may grow longer than the
            // keyword itself (e.g. lexing `iffy` while checking `if`).
            if n > possible_syntax.len() || value != possible_syntax[..n] {
                value = String::new();
                continue 'outer;
            }
        }

        // Not a complete match
        if value.len() < possible_syntax.len() {
            value = String::new();
            continue;
        }

        // If it got to this point it found a match, so exit early.
        // We don't need a longest match.
        break;
    }

    if value.is_empty() {
        return None;
    }

Aside from matching against the list of strings, we have to make sure there is a complete match. For example, function1 is not a keyword; it's a valid identifier. Whereas function 1 is a valid pair of tokens (the keyword function and the number 1), even if it isn't valid Lua grammar.

    // If the next character would be part of a valid identifier, then
    // this is not a keyword.
    if next_loc.index < raw.len() - 1 {
        let next_c = raw[next_loc.index];
        if next_c.is_alphanumeric() || next_c == '_' {
            return None;
        }
    }

    Some((
        Token {
            value,
            loc: initial_loc,
            kind: TokenKind::Keyword,
        },
        next_loc,
    ))
}

Syntax

Syntax (in this context) is just language junk that isn't an operator: things like commas, parentheses, and semicolons.

fn lex_syntax(raw: &[char], initial_loc: Location) -> Option<(Token, Location)> {
    let syntax = [";", "=", "(", ")", ","];

    for possible_syntax in syntax {
        let c = raw[initial_loc.index];
        let next_loc = initial_loc.increment(false);
        // TODO: this won't work with multiple-character syntax bits like >= or ==
        if possible_syntax == c.to_string() {
            return Some((
                Token {
                    value: possible_syntax.to_string(),
                    loc: initial_loc,
                    kind: TokenKind::Syntax,
                },
                next_loc,
            ));
        }
    }

    None
}

Operators

Operators are things like plus, minus, and less-than symbols. Operators are syntax too, but it helps us later on to break them out into a separate kind of token.

fn lex_operator(raw: &[char], initial_loc: Location) -> Option<(Token, Location)> {
    let operators = ["+", "-", "<"];

    for possible_syntax in operators {
        let c = raw[initial_loc.index];
        let next_loc = initial_loc.increment(false);
        // TODO: this won't work with multiple-character operators like >= or ==
        if possible_syntax == c.to_string() {
            return Some((
                Token {
                    value: possible_syntax.to_string(),
                    loc: initial_loc,
                    kind: TokenKind::Operator,
                },
                next_loc,
            ));
        }
    }

    None
}

And now we're all done lexing!
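
If you want to sanity-check the lexer before moving on, here's a small test (my addition; it only uses what we've defined above) that can go at the bottom of src/lex.rs and run with cargo test:

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn lexes_a_simple_call() {
        let raw: Vec<char> = "print(fib(30));".chars().collect();
        let tokens = lex(&raw).unwrap();
        // Each token's value, in order of appearance
        let values: Vec<String> = tokens.iter().map(|t| t.value.clone()).collect();
        assert_eq!(values, vec!["print", "(", "fib", "(", "30", ")", ")", ";"]);
    }
}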

Grammar analysis

Parsing finds grammatical (tree) patterns in the flat list of tokens. The resulting structure is called a syntax tree or abstract syntax tree (AST).

The boring part is defining the tree. Generally speaking (and specifically for this project), the syntax tree is a list of statements. A statement can be a function declaration, an if statement, a return statement, a local declaration, or an expression statement.

This goes in src/parse.rs.

#[derive(Debug)]
pub enum Statement {
    Expression(Expression),
    If(If),
    FunctionDeclaration(FunctionDeclaration),
    Return(Return),
    Local(Local),
}

pub type Ast = Vec<Statement>;

There's almost nothing special at all about the rest of the tree definitions.

#[derive(Debug)]
pub enum Literal {
    Identifier(Token),
    Number(Token),
}

#[derive(Debug)]
pub struct FunctionCall {
    pub name: Token,
    pub arguments: Vec<Expression>,
}

#[derive(Debug)]
pub struct BinaryOperation {
    pub operator: Token,
    pub left: Box<Expression>,
    pub right: Box<Expression>,
}

#[derive(Debug)]
pub enum Expression {
    FunctionCall(FunctionCall),
    BinaryOperation(BinaryOperation),
    Literal(Literal),
}

#[derive(Debug)]
pub struct FunctionDeclaration {
    pub name: Token,
    pub parameters: Vec<Token>,
    pub body: Vec<Statement>,
}

#[derive(Debug)]
pub struct If {
    pub test: Expression,
    pub body: Vec<Statement>,
}

#[derive(Debug)]
pub struct Local {
    pub name: Token,
    pub expression: Expression,
}

#[derive(Debug)]
pub struct Return {
    pub expression: Expression,
}

And that's it for the AST!

Some helpers

Lastly, before the fun part, we'll define a few helpers for validating each kind of token.

fn expect_keyword(tokens: &[Token], index: usize, value: &str) -> bool {
    if index >= tokens.len() {
        return false;
    }

    let t = tokens[index].clone();
    t.kind == TokenKind::Keyword && t.value == value
}

fn expect_syntax(tokens: &[Token], index: usize, value: &str) -> bool {
    if index >= tokens.len() {
        return false;
    }

    let t = tokens[index].clone();
    t.kind == TokenKind::Syntax && t.value == value
}

fn expect_identifier(tokens: &[Token], index: usize) -> bool {
    if index >= tokens.len() {
        return false;
    }

    let t = tokens[index].clone();
    t.kind == TokenKind::Identifier
}

Now on to the fun part, actually detecting these trees!

Top-level parse

The top-level parse function and its major helper, parse_statement, dispatch very much like the top-level lex function: for each statement in the file we look for function declarations, if statements, return statements, local declarations, and expression statements.

fn parse_statement(raw: &[char], tokens: &[Token], index: usize) -> Option<(Statement, usize)> {
    let parsers = [
        parse_if,
        parse_expression_statement,
        parse_return,
        parse_function,
        parse_local,
    ];
    for parser in parsers {
        let res = parser(raw, tokens, index);
        if res.is_some() {
            return res;
        }
    }

    None
}

pub fn parse(raw: &[char], tokens: Vec<Token>) -> Result<Ast, String> {
    let mut ast = vec![];
    let mut index = 0;
    let ntokens = tokens.len();
    while index < ntokens {
        let res = parse_statement(raw, &tokens, index);
        if let Some((stmt, next_index)) = res {
            index = next_index;
            ast.push(stmt);
            continue;
        }

        return Err(tokens[index].loc.debug(raw, "Invalid token while parsing:"));
    }

    Ok(ast)
}
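
And, similarly, once all the statement parsers referenced above are in place, a small end-to-end test (again my addition, not part of the walkthrough) at the bottom of src/parse.rs can confirm that lexing and parsing compose:

#[cfg(test)]
mod tests {
    use super::*;
    use crate::lex;

    #[test]
    fn parses_an_expression_statement() {
        let raw: Vec<char> = "print(1);".chars().collect();
        let tokens = lex::lex(&raw).unwrap();
        let ast = parse(&raw, tokens).unwrap();
        assert_eq!(ast.len(), 1);
        // A program like print(1); should parse as a function call
        // wrapped in an expression statement.
        assert!(matches!(
            ast[0],
            Statement::Expression(Expression::FunctionCall(_))
        ));
    }
}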

Expression statements

Expression statements are just a wrapper to satisfy the Rust type system. This parser calls parse_expression (which we'll define shortly), expects a semicolon afterward, and wraps the expression in a statement.

fn parse_expression_statement(
    raw: &[char],
    tokens: &[Token],
    index: usize,
) -> Option<(Statement, usize)> {
    let mut next_index = index;
    let res = parse_expression(raw, tokens, next_index)?;

    let (expr, next_next_index) = res;
    next_index = next_next_index;
    if !expect_syntax(tokens, next_index, ";") {
        println!(
            "{}",
            tokens[next_index]
                .loc
                .debug(raw, "Expected semicolon after expression:")
        );
        return None;
    }

    next_index += 1; // Skip past semicolon

    Some((Statement::Expression(expr), next_index))
}

Expressions

Expressions in this minimal Lua are only function calls, literals (numbers and identifiers), or binary operations. To keep things very simple, binary operations cannot be combined, so instead of 1 + 2 + 3 we'd need to write local tmp1 = 1 + 2; local tmp2 = tmp1 + 3; and so on.

fn parse_expression(raw: &[char], tokens: &[Token], index: usize) -> Option<(Expression, usize)> {
    if index >= tokens.len() {
        return None;
    }

    let t = tokens[index].clone();
    let left = match t.kind {
        TokenKind::Number => Expression::Literal(Literal::Number(t)),
        TokenKind::Identifier => Expression::Literal(Literal::Identifier(t)),
        _ => {
            return None;
        }
    };

If what follows the first literal is an open parenthesis, then we try to parse a function call.

    let mut next_index = index + 1;
    if expect_syntax(tokens, next_index, "(") {
        next_index += 1; // Skip past open paren

        // Function call
        let mut arguments: Vec<Expression> = vec![];

We need to call parse_expression recursively for every possible argument passed to the function.

        while !expect_syntax(tokens, next_index, ")") {
            // Arguments after the first must be preceded by a comma
            if !arguments.is_empty() {
                if !expect_syntax(tokens, next_index, ",") {
                    println!(
                        "{}",
                        tokens[next_index]
                            .loc
                            .debug(raw, "Expected comma between function call arguments:")
                    );
                    return None;
                }

                next_index += 1; // Skip past comma
            }

            let res = parse_expression(raw, tokens, next_index);
            if let Some((arg, next_next_index)) = res {
                next_index = next_next_index;
                arguments.push(arg);
            } else {
                println!(
                    "{}",
                    tokens[next_index]
                        .loc
                        .debug(raw, "Expected valid expression in function call arguments:")
                );
                return None;
            }
        }

        next_index += 1; // Skip past closing paren

        return Some((
            Expression::FunctionCall(FunctionCall {
                name: tokens[index].clone(),
                arguments,
            }),
            next_index,
        ));
    }

Otherwise, if there isn't an opening parenthesis, then we could be parsing either a literal expression or a binary operation. If the token that follows is an operator token, we know it's a binary operation.

    // Might be a literal expression
    if next_index >= tokens.len() || tokens[next_index].clone().kind != TokenKind::Operator {
        return Some((left, next_index));
    }

    // Otherwise is a binary operation
    let op = tokens[next_index].clone();
    next_index += 1; // Skip past op

    if next_index >= tokens.len() {
        println!(
            "{}",
            // There is no token at next_index here, so point at the
            // operator's location instead.
            op.loc
                .debug(raw, "Expected valid right hand side binary operand:")
        );
        return None;
    }

    let rtoken = tokens[next_index].clone();

It is at this point that we could (but won't) call parse_expression recursively. I don't want to deal with operator precedence right now, so we'll just require that the right-hand side is another literal.

    let right = match rtoken.kind {
        TokenKind::Number => Expression::Literal(Literal::Number(rtoken)),
        TokenKind::Identifier => Expression::Literal(Literal::Identifier(rtoken)),
        _ => {
            println!(
                "{}",
                rtoken
                    .loc
                    .debug(raw, "Expected valid right hand side binary operand:")
            );
            return None;
        }
    };
    next_index += 1; // Skip past right hand operand

    Some((
        Expression::BinaryOperation(BinaryOperation {
            left: Box::new(left),
            right: Box::new(right),
            operator: op,
        }),
        next_index,
    ))
}