Syntax Reference

Grammar

The grammar is LL(1).

block = open-block, block-body, close-block ;
block-body = stmt, [{ terminator, stmt }] ;
stmt = assignment | expr ;
assignment = var, [":", expr], "=", expr ;
expr = "if", expr, block, [ "else", block ]
     | "loop", [ label ], control-vars, block
     | "next", [ label ]
     | "exit", [ label ], expr
     | "return", expr
     (* these expressions can be used as the LHS of *)
     (* a function application or binary operator. *)
     | "(", expr, ")", expr-cont
     | unop, expr, expr-cont
     | string, expr-cont
     | num, expr-cont
     | var, expr-cont
     ;
(* an optional binary operator or function application *)
expr-cont = [ binop, expr | expr ] ;
control-vars = [ control-var, [{ ",", control-var }] ] ;
control-var = assignment | var ;

Lexemes

If you use {, }, and ; for open-block, close-block, and terminator, then the lexer is regular. If you use indentation-sensitive syntax, then lexing is context-sensitive.

open-block = "{" | ":", ? indentation-based ? ;
close-block = "}" | ? indentation-based ? ;
terminator = ";" | ? indentation-based ? ;
unop = "-" | "~" | "!" ;
      (* arithmetic *)
binop = "+" | "-" | "*" | "/" | "%"
      (* bitwise *)
      | "&" | "|" | "^" | "<<" | ">>" | ">>>"
      (* logical *)
      | "=" | ">" | "<" | ">=" | "<=" | "!="
      (* types *)
      | ":" | "->"
      ;
num = ["-"], { decimal-digit | "," }, ["#", { digit | "," }] ;
string = '"', [{ -('"' | newline) }], '"' ;
label = "'", identifier ;
identifier = alpha, [{ alphanumeric | "_" }] ;

alpha = ? 'A'..'Z' | 'a'..'z' ? ;
decimal-digit = ? '0'..'9' ? ;
alphanumeric = decimal-digit | alpha ;
digit = alphanumeric ;
newline = "\r" | "\n" ;
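
As a rough illustration, the identifier, label, and string rules above map directly onto regular expressions. The Python sketch below is not part of pass-lang: the names `IDENTIFIER`, `LABEL`, `STRING`, and `is_lexeme` are hypothetical, and it assumes ASCII input, as the alpha and decimal-digit rules suggest.

```python
import re

# identifier = alpha, [{ alphanumeric | "_" }]
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
# label = "'", identifier
LABEL = re.compile(r"'[A-Za-z][A-Za-z0-9_]*")
# string = '"', any characters except '"' and newline, '"'
# (the grammar defines no escape sequences, so none are handled here)
STRING = re.compile(r'"[^"\r\n]*"')


def is_lexeme(pattern: re.Pattern, text: str) -> bool:
    """Return True when the whole of `text` matches `pattern`."""
    return pattern.fullmatch(text) is not None
```

For example, `is_lexeme(IDENTIFIER, "some_variable")` holds, while `is_lexeme(IDENTIFIER, "_x")` does not, because an identifier must start with a letter.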

A number is a series of base-10 digits by default. To use a different base, write base#digits, e.g. 2#100101 or 16#DEADBEEF.
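
To make the base#digits notation concrete, here is a minimal Python sketch that converts a number lexeme to an integer. It is not the pass-lang implementation. The function name `parse_num` is hypothetical, and the sketch assumes that the commas permitted by the num rule are digit-group separators with no effect on the value, which the grammar allows but does not state.

```python
def parse_num(text: str) -> int:
    """Convert a pass-lang number lexeme such as '16#DEADBEEF' to an int."""
    negative = text.startswith("-")
    body = text[1:] if negative else text
    # Assumption: commas are only visual separators, so they are dropped.
    body = body.replace(",", "")
    if "#" in body:
        base_text, digits = body.split("#", 1)
        base = int(base_text, 10)  # the base itself is written in base 10
    else:
        base, digits = 10, body
    # int() raises ValueError for an unsupported base or an invalid digit.
    value = int(digits, base)
    return -value if negative else value
```

Python's `int` accepts bases 2 through 36, which matches a digit alphabet of `0`..`9` plus letters. It is also more permissive than the grammar in some respects (for instance, it tolerates underscores between digits), so a real lexer would validate the lexeme first.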

Blocks & Terminators

With indentation-based syntax, block and terminator tokens are emitted according to these rules:

  1. A terminator is emitted when a line has the same indentation level as a previous line in the same block:

    Example:

    if x:
        x = 10
        print "hello, world!"
        y = 3
    
  2. A block is opened when : occurs at the end of a line.

    The indentation level for the block will be the indentation level of the following line.

    Example:

    loop:
        pass
    
  3. A block is closed when the indentation level of a line is less than the indentation level of a block.

    Example:

    loop:
        ...
    ...
    
  4. If a new indentation level is introduced and it is not the start of a block, then it is the continuation of an expression, not a new block.

    Within a continuation, line breaks and any further nested indentation levels are ignored, and no block or terminator tokens are emitted.

    Example:

    some_variable = some long expression
                  + some other long expression
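
The four rules above can be sketched as a small layout pass. The Python below is a simplified illustration, not the actual pass-lang lexer: the function name `layout_tokens` and the token names `OPEN`, `CLOSE`, and `TERM` are hypothetical, each statement is kept as one string rather than being tokenized, indentation is measured as a plain count of leading whitespace characters, and blocks still open at the end of input are assumed to be closed there.

```python
from typing import List


def layout_tokens(source: str) -> List[str]:
    """Turn indentation into OPEN, CLOSE, and TERM markers between statements."""
    tokens: List[str] = []
    levels: List[int] = []  # indentation widths of the enclosing blocks
    opening = False         # did the previous line end with ':'?
    for raw in source.splitlines():
        text = raw.strip()
        if not text:
            continue  # whitespace-only lines carry no layout information
        indent = len(raw) - len(raw.lstrip())
        continuation = False
        if not levels:
            levels.append(indent)  # the first statement fixes the outer level
        elif opening:
            # Rule 2: the block's level is the level of the following line.
            if indent <= levels[-1]:
                raise ValueError("block body must be indented")
            tokens.append("OPEN")
            levels.append(indent)
        elif indent > levels[-1]:
            continuation = True  # Rule 4: deeper line without ':' before it
        else:
            # Rule 3: close every block that is deeper than this line.
            while len(levels) > 1 and indent < levels[-1]:
                levels.pop()
                tokens.append("CLOSE")
            if indent != levels[-1]:
                raise ValueError("indentation matches no enclosing block")
            tokens.append("TERM")  # Rule 1: same level as the block
        opening = text.endswith(":")
        body = text[:-1].rstrip() if opening else text
        if continuation:
            tokens[-1] += " " + body  # joined to the statement, no new tokens
        else:
            tokens.append(body)
    tokens.extend("CLOSE" for _ in levels[1:])  # assumed: close at end of input
    return tokens
```

Under these assumptions, the source `if x:`, `    x = 10`, `    y = 3`, `z = 1` yields `if x`, `OPEN`, `x = 10`, `TERM`, `y = 3`, `CLOSE`, `TERM`, `z = 1`, which matches the block-body rule `stmt, [{ terminator, stmt }]`.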
    

Indentation Levels

The rules for indentation levels:

  1. All whitespace on an empty line is ignored.

    Example:

        ...
    <TAB>    <TAB>   // very bad indentation, but no error
        ...
    
  2. All additional indentation on a line is combined into one indentation level.

    Example:

    if x:
                         ... // *one* level deeper!
    
  3. All tabs on a line must precede all spaces. (One level of mixed indentation is allowed.)

    Good:

    <TAB><TAB>if x:
    <TAB><TAB>    ...
    

    Good:

    loop:
    <TAB>  if x:
    <TAB>      ....
    

    Bad:

    <TAB>    if x:
    <TAB>    <TAB>...
    
  4. All indentation must match preceding lines (except for new indentation levels):

    Good:

    <TAB>if x:
    <TAB>   ...
    <TAB>   ... // same as previous line
    <TAB>...    // matches indentation of `if x`
    

    Bad:

    <TAB>... // this line used a tab...
         ... // but this line used a space, even if it looks the same
    

    Bad:

    if x:
          ...
       ...     // this line doesn't match the level of the previous line *or* the `if`
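
Rules 3 and 4 compare the actual whitespace characters, not the visual width. A minimal Python sketch of both checks follows. The function names are hypothetical, and the sketch assumes that a deeper level must extend the current level's whitespace as a prefix, which is consistent with the examples above but not stated outright.

```python
import re
from typing import List


def tabs_precede_spaces(indent: str) -> bool:
    """Rule 3: zero or more tabs followed by zero or more spaces."""
    return re.fullmatch(r"\t* *", indent) is not None


def classify(indent: str, levels: List[str]) -> str:
    """Compare a line's leading whitespace with the open levels.

    `levels` holds the whitespace of each enclosing level, outermost first,
    and must not be empty.
    """
    current = levels[-1]
    if indent == current:
        return "same"
    if indent.startswith(current):
        return "deeper"     # rule 2: any extra indentation is one new level
    if indent in levels:
        return "shallower"  # matches an enclosing level character for character
    return "mismatch"       # rule 4 violation
```

With these definitions, `classify("     ", ["\t"])` returns `"mismatch"`: five spaces may look like a tab, but they are not the same characters.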
    

Operators

The full list of operators is specified in "Lexemes".

Operator precedence:

  • Unary operators always have greatest precedence.
  • Arithmetic operators: * = / = % > + = -
  • Bitwise operators: & > (| ? ^)
  • Logical operators: (= ? != ? > ? < ? >= ? <=) > all arithmetic or bitwise operators
  • Type operators: : > all other binary operators

Operator associativity:

  • left-associative: *, /, +, -, &, |, ^, :
  • right-associative: ->
  • non-associative: =, !=, >, <, >=, <=

If two operators are not related by precedence (either marked ? or not specified), they cannot be used in the same expression without grouping them with parentheses.
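
Because precedence is only a partial relation here, a lookup has three outcomes plus "unrelated". The Python sketch below encodes the list above as written. It is an illustration, not the pass-lang implementation, and it makes these assumptions: the function names are hypothetical, the shift operators count as bitwise operators (as they are grouped in "Lexemes"), relations are not extended transitively, and an operator compared with itself is reported as equal, leaving the outcome to associativity.

```python
from typing import Optional

ARITH_HIGH = {"*", "/", "%"}
ARITH_LOW = {"+", "-"}
BITWISE = {"&", "|", "^", "<<", ">>", ">>>"}
COMPARISON = {"=", "!=", ">", "<", ">=", "<="}


def binds_tighter(a: str, b: str) -> bool:
    """True when the precedence list states that `a` outranks `b`."""
    if a == ":":
        return b != ":"  # ':' outranks all other binary operators
    if a in COMPARISON:
        return b in ARITH_HIGH | ARITH_LOW | BITWISE
    if a in ARITH_HIGH:
        return b in ARITH_LOW
    if a == "&":
        return b in {"|", "^"}
    return False


def relate(a: str, b: str) -> Optional[str]:
    """Return '>', '<', '=', or None when parentheses are required."""
    if a == b or {a, b} <= ARITH_HIGH or {a, b} <= ARITH_LOW:
        return "="
    if binds_tighter(a, b):
        return ">"
    if binds_tighter(b, a):
        return "<"
    return None
```

A parser built on such a table would reject `a | b ^ c` (since `relate("|", "^")` is `None`) and accept `(a | b) ^ c`.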

There are no ternary, postfix, or mixfix operators.

User-defined operators are not allowed (at least not for now).