Introduction

L is a rust-inspired language written for educational purposes. This document contains a language reference as well as some details regarding implementation.

Lexical Syntax

Keywords

Lexer

  • fn
  • struct
  • enum
  • if
  • else
  • true
  • false
  • let
  • pub
  • return
  • continue
  • break
  • mut
  • for
  • in

Identifiers

Lexer

[_a-zA-Z][a-zA-Z0-9]*

Comments

Lexer

line-comment: //

block-comment : /* .. */

L Programs

Syntax

<program> ::= <item>*

An L program consists of a collection of items where one of them is the main function.

Items

Syntax

<item> ::= <vis> <item-kind>

<item-kind>
    ::= <function>
      | <struct-decl>

Functions

Syntax:

<fn> ::= fn <ident> <generics>? (<params>?) <return-type> ? <block-expr>

<params> ::= <param> ( , <param> )* ,?

<param> ::= <expr> : <type>

<return-type> ::= -> <type>

Examples

fn identity<T>(t: T) -> T {
    t
}
fn naive_fib(n: number) -> number {
    if n < 2 { n } else { naive_fib(n-1) + naive_fib(n-2) }
}

ADT Declarations

Type parameters

Statements

Syntax:

<stmt> ::= <let-stmt> | <expr-stmt>

Let Statement

Syntax

<let-stmt> ::= let <pattern> ( : <type> ) ( = <expr> ) ;

Let statements introduce a new name into the current scope (i.e. in the enclosing block). The type annotation and initializer are syntactically optional.

Expression Statements

Syntax:

<expr-stmt> ::= <expr> ;

An expression statement is an expression which has its value suppressed and always evaluate to a unit type. They are used only for their side effects.

Expressions

Syntax

<expr> ::=
      | <literal-expr>
      | <path-expr>
      | <group-expr>
      | <block-expr>
      | <operator-expr>
      | <tuple-expr>
      | <return-expr>
      | <closure-expr>
      | <call-expr>
      | <conditional-expr>
      | <struct-expr>

Lvalues and Rvalues

Lvalues

Rvalues

Literal Expressions

Path Expressions

Group Expressions

Syntax

<group-expr> ::= ( <expr> )

Block Expressions

Syntax

<block-expr> ::= {
      <stmt> *
      <expr> ?
}

A block expression is a control flow construct for sequencing multiple statements with an optional final expression. A block also introduces a new lexical scope where variable declarations are visible only from after the let statement that declares it until the end of the block.

Each statement is executed sequentially and the type of the entire block is the type of the final expression, or () if the expression is omitted.

Note how the differentiating factor between an expression statement and an expression is the trailing semicolon.

let x: int = { 5 };

let x: () = { 5; };

Operator Expressions

Syntax

<operator-expr> ::=
      | <assignment-expr>

Assignment Expressions

Syntax

<assignment-expr> ::= <expr> = <expr>

The expression on the left of the = must be an lvalue. Following in the style of C, the result of an assignment expression is the rhs.

// assigns `x` to 5
// main also evaluates to 5
fn main() -> int {
    let x = 0;
    x = 5
}

Tuple Expressions

Syntax

<tuple-expr> ::= ( <elements>? )

<elements> ::= ( <expr> , ) <expr>?

Note to disambiguate a singleton tuple from a grouping, you should add a trailing comma.

Return Expressions

Syntax

<return-expr> ::= return <expr> ?

Closure Expressions

Call Expressions

Conditional Expressions

Struct Expressions

Patterns

Syntax

<pattern> ::=
      | <identifier-pattern>
      | <wildcard-pattern>

Identifier Pattern

Wildcard Pattern

The Type System

Types

Syntax

<type> ::=
      | <primitive-type>
      | <parenthesized-type>
      | <function-type>

Primitive Types

Syntax

<primitive-type> ::= bool | number | char | !

bool - true or false

number (internally a IEEE 754 double-precision floating point number)

char - character type, unimplemented

! - the uninhabited never type

Parenthesized Types

Syntax

<parenthesized-type> ::= ( <type> )

Function Types

Syntax

<function-type> ::= fn( <types>? ) <return-type>?

<types> ::= <type> ( , <type>)

<return-type> ::= -> <type>

Examples

let f: fn();
let g: fn(bool) -> bool;
let h: fn(num) -> fn(bool, bool) -> bool;

Implementation

The implementation of L is heavily inspired and influenced by rustc. Many implementation and design ideas are taken from rust and in particular, the syntax is intentionally nearly identical.

The full source code can be found here

Passes

The L compiler is currently a pass based compiler. The passes are currently approximately as follows:

  • Lexical Analysis: source code -> token stream
  • Parsing: token stream -> abstract syntax tree
  • AST Lowering: AST -> IR
  • Typechecking: IR -> Typed IR (TIR)
  • TIR Lowering: TIR -> Midlevel IR (MIR)
  • MIR Lowering: MIR -> LLVM IR
  • Code Generation: LLVM IR -> Output

Lexical Analysis

The lexer breaks the input source (&str) into a sequence of tokens. The lexer uses a simple library that is also utilised in the rustc compiler which returns an iterator of TokenKinds. This is then transformed into my representation of Tokens and returned as a Vector.

Importantly, the lexer maintains the positions of each token in the source code by keeping track of the Span. A Span is nothing more than the start (inclusive) and end (exclusive) index of the token in the input string.

The relevant source code can be found in the lexer module.

Syntactic Analysis (Parsing)

Conceptually, the parser takes the stream of Tokens generated by the lexer and constructs an abstract syntax tree.

Relevant source code can be found in the parser module.