Syntax Reference
Grammar
The grammar is LL(1).
block = open-block, block-body, close-block ;
block-body = stmt, [{ terminator, stmt }] ;
stmt = assignment | expr ;
assignment = var, [":", expr], "=", expr ;
expr = "if", expr, block, [ "else", block ]
| "loop", [ label ], control-vars, block
| "next", [ label ]
| "exit", [ label ], expr
| "return", expr
(* these expressions can be used as the LHS of *)
(* a function application or binary operator. *)
| "(", expr, ")", expr-cont
| unop, expr, expr-cont
| string, expr-cont
| num, expr-cont
| var, expr-cont
;
(* an optional binary operator or function application *)
expr-cont = [ binop, expr | expr ] ;
control-vars = [ control-var, [{ ",", control-var }] ] ;
control-var = assignment | var ;
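The expr-cont rule is what lets one expression absorb either a binary operator or a function argument. As a rough sketch of how that plays out, the Python fragment below parses only the last five expr alternatives (parenthesized, unary, string, num, and var forms) from an already-split token list. The function names, the tuple-shaped tree, the "apply" tag, and the choice to read "-" after an operand as a binary operator are assumptions made for illustration and are not part of the grammar.

```python
BINOPS = {"+", "-", "*", "/", "%", "&", "|", "^", "<<", ">>", ">>>",
          "=", ">", "<", ">=", "<=", "!=", ":", "->"}
UNOPS = {"-", "~", "!"}


def parse_expr(tokens, pos=0):
    """Parse one expr starting at tokens[pos]; return (tree, next_pos)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok == "(":
        # "(", expr, ")", expr-cont
        inner, pos = parse_expr(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        head, pos = inner, pos + 1
    elif tok in UNOPS:
        # unop, expr, expr-cont: the operand is a full expr
        operand, pos = parse_expr(tokens, pos + 1)
        head = (tok, operand)
    elif tok in BINOPS or tok == ")":
        raise SyntaxError(f"unexpected token {tok!r}")
    else:
        # string, num or var; keywords such as "if" are not handled here
        head, pos = tok, pos + 1
    return parse_cont(tokens, pos, head)


def parse_cont(tokens, pos, head):
    """expr-cont = [ binop, expr | expr ]"""
    if pos >= len(tokens) or tokens[pos] == ")":
        return head, pos  # empty continuation
    tok = tokens[pos]
    if tok in BINOPS:
        # Assumption: "-" in this position is the binary operator.
        rhs, pos = parse_expr(tokens, pos + 1)
        return (tok, head, rhs), pos
    # Anything else that starts an expr is a function argument.
    arg, pos = parse_expr(tokens, pos)
    return ("apply", head, arg), pos
```

Under these assumptions the tree always nests to the right, because the grammar itself does not encode precedence; the rules in "Operators" would presumably be applied in a later pass.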
Lexemes
If you use "{", "}", and ";" for open-block, close-block, and terminator, then the lexer is regular. If you use indentation-sensitive syntax, then lexing is context-sensitive.
open-block = "{" | ":", ? indentation-based ? ;
close-block = "}" | ? indentation-based ? ;
terminator = ";" | ? indentation-based ? ;
unop = "-" | "~" | "!" ;
(* arithmetic *)
binop = "+" | "-" | "*" | "/" | "%"
(* bitwise *)
| "&" | "|" | "^" | "<<" | ">>" | ">>>"
(* logical *)
| "=" | ">" | "<" | ">=" | "<=" | "!="
(* types *)
| ":" | "->"
;
num = ["-"], { decimal-digit | "," }, ["#", { digit | "," }] ;
string = '"', [{ -('"' | newline) }], '"' ;
label = "'", identifier ;
identifier = alpha, [{ alphanumeric | "_" }] ;
alpha = ? 'A'..'Z' | 'a'..'z' ? ;
decimal-digit = ? '0'..'9' ? ;
alphanumeric = decimal-digit | alpha ;
digit = alphanumeric ;
newline = "\r" | "\n" ;
A number is a series of base 10 digits by default.
You may use a different base using the syntax base#digits, e.g. 2#100101, 16#DEADBEEF.
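To make the base#digits form concrete, here is a minimal Python sketch that converts a num lexeme to an integer. The name parse_num is hypothetical. Treating "," purely as a visual separator, reading letters case-insensitively, and limiting the base to 2..36 are assumptions; the grammar only says that "," may appear among the digits.

```python
import string

# Digit values for bases up to 36: 0-9, then letters.
# Case-insensitive letters and the base-36 cap are assumptions.
DIGIT_VALUES = {ch: i for i, ch in enumerate(string.digits + string.ascii_lowercase)}


def parse_num(text: str) -> int:
    """Parse a literal such as 42, 1,000, 2#100101 or 16#DEADBEEF."""
    negative = text.startswith("-")
    body = text[1:] if negative else text
    # Split on the first "#"; without one, the whole body is base-10 digits.
    head, sep, tail = body.partition("#")
    if sep:
        base = int(head.replace(",", ""))  # the base itself is written in decimal
        digits = tail
    else:
        base, digits = 10, head
    if not 2 <= base <= 36:
        raise ValueError(f"unsupported base: {base}")
    digits = digits.replace(",", "")  # assumed: "," is only a separator
    if not digits:
        raise ValueError("a number needs at least one digit")
    value = 0
    for ch in digits.lower():
        d = DIGIT_VALUES.get(ch)
        if d is None or d >= base:
            raise ValueError(f"digit {ch!r} is not valid in base {base}")
        value = value * base + d
    return -value if negative else value
```

With this sketch, the two examples from the text come out as 37 and 3735928559 respectively.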
Blocks & Terminators
The rules for blocks and terminators:
- A terminator is emitted when the indentation level is the same as a previous indentation level in a block.
Example:
    if x:
        x = 10
        print "hello, world!"
    y = 3
- A block is opened when ":" occurs at the end of a line. The indentation level for the block will be the indentation level of the following line.
Example:
    loop:
        pass
- A block is closed when the indentation level of a line is less than the indentation level of the block.
Example:
    loop:
        ...
    ...
- If a new indentation level is introduced and it is not the start of a block, then it is the continuation of an expression, not a new block. Within the continuation, all line breaks and nested indentation levels are ignored and no tokens are emitted.
Example:
    some_variable = some long expression
        + some other long expression
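The four rules above can be sketched as a small layout pass that rewrites indentation-based lines into the brace form, emitting "{", "}", and ";" where the rules call for open-block, close-block, and terminator. This is only an illustration in Python: the name layout is hypothetical, indentation is measured as a plain count of leading whitespace characters, the top level is assumed to start at column 0, and comments are not stripped.

```python
def layout(lines):
    """Turn indentation-based lines into a flat list with "{", "}" and ";" markers."""
    out = []
    levels = [0]      # indentation of each open block (top level assumed at column 0)
    opening = False   # True when the previous line ended with ":"
    for raw in lines:
        if not raw.strip():
            continue  # blank lines carry no layout information
        indent = len(raw) - len(raw.lstrip(" \t"))
        text = raw.strip()
        if opening:
            # The line after ":" fixes the indentation level of the new block.
            if indent <= levels[-1]:
                raise ValueError("a block needs an indented body")
            levels.append(indent)
            out.append("{")
            opening = False
        elif indent > levels[-1]:
            pass  # continuation of the previous expression: no tokens emitted
        else:
            # Close every block that is indented deeper than this line.
            while indent < levels[-1]:
                levels.pop()
                out.append("}")
            if indent != levels[-1]:
                raise ValueError("dedent does not match any enclosing block")
            if out:
                out.append(";")  # same level as an earlier line: terminator
        if text.endswith(":"):
            text = text[:-1].rstrip()
            opening = True
        out.append(text)
    if opening:
        raise ValueError("a block needs an indented body")
    out.extend("}" for _ in levels[1:])  # close blocks still open at end of input
    return out
```

For instance, layout(["loop:", "    pass"]) gives ["loop", "{", "pass", "}"] under these assumptions.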
Indentation Levels
The rules for indentation levels:
- All whitespace on an empty line is ignored.
Example:
    ...
    <TAB>  <TAB>  // very bad indentation, but no error
    ...
- All additional indentation on a line is combined into one indentation level.
Example:
    if x:
                ... // *one* level deeper!
- All tabs on a line must precede all spaces. (One level of mixed indentation is allowed.)
Good:
    <TAB><TAB>if x:
    <TAB><TAB>    ...
Good:
    loop:
    <TAB>  if x:
    <TAB>    ....
Bad:
    <TAB>  if x:
    <TAB>  <TAB>...
- All indentation must match preceding lines (except for new indentation levels).
Good:
    <TAB>if x:
    <TAB>    ...
    <TAB>    ... // same as previous line
    <TAB>... // matches indentation of `if x`
Bad:
    <TAB>... // this line used a tab...
        ... // but this line used a space, even if it looks the same
Bad:
    if x:
        ...
      ... // this line doesn't match the level of the previous line *or* the `if`
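A checker for these indentation rules might look like the following Python sketch. It compares the literal leading whitespace of each line against a stack of enclosing levels, so a tab and a run of spaces never match even if they look the same. The name check_indentation, the error messages, and the (line number, message) result shape are assumptions made for illustration.

```python
import re

# Zero or more tabs followed by zero or more spaces.
TABS_THEN_SPACES = re.compile(r"\t* *")


def check_indentation(lines):
    """Return a list of (line_number, message) for lines that break the rules."""
    errors = []
    stack = [""]  # literal indentation string of each enclosing level
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # whitespace on an empty line is ignored
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if not TABS_THEN_SPACES.fullmatch(indent):
            errors.append((number, "tabs must precede spaces"))
            continue
        if indent == stack[-1]:
            continue  # same level as the previous line
        if indent.startswith(stack[-1]):
            # Any amount of extra indentation counts as one new level.
            stack.append(indent)
            continue
        if indent in stack:
            # Dedent to an enclosing level: drop the deeper levels.
            del stack[stack.index(indent) + 1:]
        else:
            errors.append((number, "indentation does not match any preceding level"))
    return errors
```

Run against the "Good" and "Bad" examples above (with <TAB> replaced by a real tab character), this sketch accepts the former and reports the offending line of each of the latter.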
Operators
The full list of operators is specified in "Lexemes".
Operator precedence:
- Unary operators always have greatest precedence.
- Arithmetic operators: "*" = "/" = "%" > "+" = "-"
- Bitwise operators: "&" > ("|" ? "^")
- Logical operators: ("=" ? "!=" ? ">" ? "<" ? ">=" ? "<=") > all arithmetic or bitwise operators
- Type operators: ":" > all other binary operators
Operator associativity:
- left-associative: "*", "/", "+", "-", "&", "|", "^", ":"
- right-associative: "->"
- non-associative: "=", "!=", ">", "<", ">=", "<="
If two operators are not related by a precedence (either "?" or not specified), then they cannot be used in the same expression without grouping using parentheses.
There are no ternary, postfix, or mixfix operators.
User-defined operators are not allowed (at least not for now).
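Because precedence here is a partial order and not a single ranking, a lookup has to be able to answer "unrelated" as well as greater, less, or equal. The Python sketch below encodes the binary-operator precedence list literally, including the stated rule that logical operators outrank arithmetic and bitwise ones. The function names are hypothetical, and counting the shift operators as bitwise (following the grouping in "Lexemes") is an assumption.

```python
ARITHMETIC = {"*": 2, "/": 2, "%": 2, "+": 1, "-": 1}  # higher number binds tighter
BITWISE = {"&", "|", "^", "<<", ">>", ">>>"}           # shifts included by assumption
LOGICAL = {"=", "!=", ">", "<", ">=", "<="}


def binds_tighter(a, b):
    """True if binary operator a has strictly greater precedence than b."""
    if a == ":":
        return b != ":"  # ":" outranks all other binary operators
    if a in LOGICAL:
        return b in ARITHMETIC or b in BITWISE
    if a in ARITHMETIC and b in ARITHMETIC:
        return ARITHMETIC[a] > ARITHMETIC[b]
    if a == "&":
        return b in ("|", "^")
    return False


def same_precedence(a, b):
    """True if a and b are the same operator or share an arithmetic tier."""
    if a == b:
        return True
    return a in ARITHMETIC and b in ARITHMETIC and ARITHMETIC[a] == ARITHMETIC[b]


def relation(a, b):
    """Return ">", "<", "=", or None when the pair must be parenthesized."""
    if same_precedence(a, b):
        return "="
    if binds_tighter(a, b):
        return ">"
    if binds_tighter(b, a):
        return "<"
    return None
```

A parser built on this sketch would reject an expression such as a + b & c, since relation("+", "&") is None, and ask for parentheses instead. Associativity would still need a separate check for pairs that compare as "=".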