Commit Graph

21 Commits (master)

Author SHA1 Message Date
James T. Martin 86fea958f4
(Algebraic language) Fibonacci sequence works. 2023-07-28 20:03:10 -07:00
James T. Martin 40f88918ef
WIP on new algebraic language, going to change directions again. 2023-07-28 14:09:23 -07:00
James T. Martin 1863d420b6
WIP IR (broken) 2022-12-03 16:49:30 -08:00
James T. Martin 894c10903c
Old, incomplete, broken stuff from a month ago. 2022-11-30 13:04:03 -08:00
James T. Martin d9edbab10c
Fix calculation of labels and jump destinations.
Previously, all calculations regarding stack depth and label depth
were broken. *All* of them. Off-by-ones, the logic was wrong, etc.
Variables are still screwed up, but I'm *almost* there, where
"there" is being able to generate functioning programs, hopefully?
2022-10-19 15:42:04 -07:00
James T. Martin 68ce32a6df
Slightly unfuck codegen for relative jumps and blocks.
Still pretty horribly broken. I was generating displacements
backwards, failing to emit stuff like the beginning of
loops, the logic for `if` jumps was backwards, etc.
I was also completely forgetting to increment file_here
for appends. I don't remember what all I did; check the diff.
2022-10-19 12:21:46 -07:00
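
For reference, the displacement bug described in 68ce32a6df can be sketched as follows. This is a minimal illustration, not the code in lang.c. It assumes the target is x86-64 and uses the near `jmp rel32` encoding (opcode 0xE9), whose displacement is measured from the end of the jump instruction. `struct emitter`, `emit_u8`, `emit_u32le`, `rel32`, and `emit_jmp` are hypothetical names; `here` stands in for the role `file_here` appears to play.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical output buffer. `here` is the current write offset,
   standing in for what the commit message calls file_here. */
struct emitter {
    uint8_t buf[64];
    size_t here;
};

/* Append one byte and advance the write offset. Forgetting the
   increment makes every later offset (and displacement) wrong. */
static void emit_u8(struct emitter *e, uint8_t byte) {
    assert(e->here < sizeof e->buf);
    e->buf[e->here++] = byte;
}

/* Append a 32-bit value in little-endian byte order. */
static void emit_u32le(struct emitter *e, uint32_t v) {
    for (int i = 0; i < 4; i++) {
        emit_u8(e, (uint8_t)(v >> (8 * i)));
    }
}

/* Displacement from `next` (offset of the instruction following the
   jump) to `target`. Swapping the operands gives the "backwards"
   displacement. */
static int32_t rel32(size_t next, size_t target) {
    return (int32_t)((int64_t)target - (int64_t)next);
}

/* Emit `jmp rel32` to a target offset that is already known. */
static void emit_jmp(struct emitter *e, size_t target) {
    size_t next = e->here + 5; /* 1 opcode byte + 4 displacement bytes */
    emit_u8(e, 0xE9);
    emit_u32le(e, (uint32_t)rel32(next, target));
}
```

A backward jump to the head of a loop therefore gets a negative displacement, and a forward jump a positive one.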
James T. Martin 1383484e06
No assertions fail during codegen (lang.c) for test file.
The generated executable is still incorrect
for reasons which I have not yet investigated,
but this is still a step forward.
2022-10-17 11:45:38 -07:00
James T. Martin f951e8ce08
Small refactor of indentation lexer. 2022-10-17 10:02:43 -07:00
James T. Martin 8808c41250
Factor out executable format handling into a new file. 2022-09-10 22:06:21 -07:00
James T. Martin 3fe367675a
Removed radix#int syntax, added keywords to lexer. 2022-09-10 14:58:22 -07:00
James T. Martin 4c4ebeecfc
Fixed bug: use references to mutate the stack instead of values. 2022-09-10 09:07:38 -07:00
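
The class of bug named in 4c4ebeecfc can be sketched like this. The actual code is not shown in this log, so this is only a guess at the general shape: in C, a struct passed by value is copied, so mutations inside the callee are lost. `struct stack`, `push_by_value`, and `push_by_ref` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-size stack, for illustration only. */
struct stack {
    int items[16];
    size_t depth;
};

/* Buggy shape: `s` is a copy, so the caller's stack never changes. */
static void push_by_value(struct stack s, int v) {
    s.items[s.depth++] = v;
}

/* Fixed shape: `s` points at the caller's stack, so the push sticks. */
static void push_by_ref(struct stack *s, int v) {
    assert(s->depth < sizeof s->items / sizeof s->items[0]);
    s->items[s->depth++] = v;
}
```

After `push_by_value(s, 1)` the caller's `depth` is still 0; after `push_by_ref(&s, 1)` it is 1.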
James T. Martin fcd61f6c5f
Wrote IR gen!! (Literally untested, though.)
Next stages are to wire it into the parser so I can test it,
to implement operator precedence so exprs actually exist,
and then implement operators and builtins until I can
start writing basic programs.
2022-09-08 21:07:40 -07:00
James T. Martin 9b41081c71
Indentation-sensitive syntax! 2022-09-08 16:02:30 -07:00
James T. Martin 8b251bd1d6
Added documentation for the upcoming indentation-sensitive syntax. 2022-09-08 11:40:10 -07:00
James T. Martin bce39fdc22
Greatly simplify lexer thanks to new knowledge of lookahead.
Now I know that the parser is LL(1) and the lexer also only
needs one-character lookahead, which allows me to dramatically
simplify the interface for IO input, and improve the interface
to the lexer. Even if I did want unbounded peek, I'd want
the interface to be `peek(off)`, not that awful buffer.

I intend to use the new lexer interface to make the parser
states more stateful, and potentially read multiple tokens
in a row. Then, states would only be needed for recursive
structures, without the awkward intermediate states like
ST_LABEL which exists only to let me burn a token.

I also removed the nonsense related to base 64 parsing,
because it was unclear how to handle it, and the advantages
of having it weren't clear. I kept up to base 36, but honestly
I might want to consider getting rid of everything but decimal,
hex, and binary anyway. I'm not sure if I'd want to keep using
the current syntax for the radix either.
2022-09-07 23:02:15 -07:00
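
The one-character-lookahead idea from bce39fdc22 can be sketched as below. This is an assumption-laden illustration, not the project's lexer: the input is a NUL-terminated string rather than the project's IO layer, and `struct input`, `in_peek`, `in_advance`, `lex_next`, and the token kinds are all hypothetical. The point is that the whole lexer needs only "look at the next character" and "consume it", with no lookahead buffer.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical input source: a NUL-terminated string plus a cursor. */
struct input {
    const char *src;
    size_t pos;
};

#define IN_EOF (-1)

/* Look at the next character without consuming it. */
static int in_peek(const struct input *in) {
    unsigned char c = (unsigned char)in->src[in->pos];
    return c != 0 ? (int)c : IN_EOF;
}

/* Consume the character that in_peek() just reported. */
static void in_advance(struct input *in) {
    if (in->src[in->pos] != '\0') in->pos++;
}

enum tok { TOK_EOF, TOK_INT, TOK_IDENT, TOK_PUNCT };

static int is_space_char(int c) { return c != IN_EOF && isspace(c); }
static int is_digit_char(int c) { return c != IN_EOF && isdigit(c); }
static int is_ident_char(int c) { return c != IN_EOF && (isalnum(c) || c == '_'); }

/* Read one token. Every decision is made from a single peeked
   character; *start and *len report the token's span in the input. */
static enum tok lex_next(struct input *in, size_t *start, size_t *len) {
    enum tok kind;
    int c;

    while (is_space_char(in_peek(in))) in_advance(in);
    *start = in->pos;
    c = in_peek(in);

    if (c == IN_EOF) {
        kind = TOK_EOF;
    } else if (is_digit_char(c)) {
        kind = TOK_INT;
        while (is_digit_char(in_peek(in))) in_advance(in);
    } else if (is_ident_char(c)) {
        kind = TOK_IDENT;
        while (is_ident_char(in_peek(in))) in_advance(in);
    } else {
        kind = TOK_PUNCT; /* single-character punctuation */
        in_advance(in);
    }

    *len = in->pos - *start;
    return kind;
}
```

With this interface, lexing `x1 = 42` yields an identifier, a punctuation token, an integer, and then end of input.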
James T. Martin d7c0eef7ae
Implemented parser! Recognition only, no output.
Also no top-level declarations or operator precedence.

The syntax is LL(1). LL syntax seems necessary because
our codegen requires emitting certain code (e.g. entering control)
prior to any codegen inside that context, whereas something like
LR would presumably parse the inner expression before recognizing
the control structure. There may be some way to work around this;
I don't know, I'm not a parsing expert.

Certain parts of the syntax are wonky, e.g. juxtaposition as
function application means a missing semicolon can give confusing
results. I suspect indentation-sensitive syntax would work
more nicely, and intend to implement it some time in the future.
2022-09-07 20:42:37 -07:00
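
The reasoning in d7c0eef7ae about emitting code on entry to a control structure can be illustrated with a toy recursive-descent parser. It is only a sketch under stated assumptions: the grammar (`block := '{' item* '}'`, `item := 'x' | block`), the textual "instructions", and the names `struct parser`, `emit`, and `parse_block` are all made up for this example and are not the project's syntax or codegen. It shows the top-down side only: the parser commits to a block on seeing its first token, so it can emit the entry code before parsing anything inside.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct parser {
    const char *src; /* remaining input; *src is the one-token lookahead */
    char out[128];   /* emitted "code", kept as text for illustration */
    size_t out_len;
    int ok;          /* cleared on a syntax error or output overflow */
};

/* Append text to the output buffer. */
static void emit(struct parser *p, const char *text) {
    size_t n = strlen(text);
    if (p->out_len + n >= sizeof p->out) {
        p->ok = 0;
        return;
    }
    memcpy(p->out + p->out_len, text, n + 1);
    p->out_len += n;
}

/* block := '{' item* '}'      item := 'x' | block */
static void parse_block(struct parser *p) {
    if (*p->src != '{') {
        p->ok = 0;
        return;
    }
    p->src++;
    emit(p, "enter "); /* emitted before any code for the block body */

    while (p->ok && *p->src != '}') {
        if (*p->src == 'x') {
            p->src++;
            emit(p, "op ");
        } else if (*p->src == '{') {
            parse_block(p);
        } else {
            p->ok = 0; /* unexpected character or end of input */
        }
    }
    if (!p->ok) return;

    p->src++; /* consume '}' */
    emit(p, "leave ");
}
```

Parsing `{x{x}x}` emits each `enter` before the body it encloses, in a single pass and without building a tree first.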
James T. Martin 162683d63e
Hacked together a god-awful hand-written lexer. 2022-09-07 11:07:05 -07:00
James T. Martin 46640b6204
Remove gratuitous platform-specific IO.
I don't need some fancy atomic output file updating or
posix_fadvise. I removed all platform-specific code
except for a single `chmod`.

That's not to say there's no advantage to atomically
reading or writing files, but for this project, the first
rule needs to be KISS. It's premature optimization and
overengineering.
2022-09-06 23:16:23 -07:00
James T. Martin 57aa667000
Completely rewrite stack management.
Now we always use the stack instead of keeping a TOS register.
This is very inefficient, but I'll worry about register
allocation later.

The new block model is inspired by x86's `enter` and `leave`
instructions. I intend to support nested procedures at some point
in the future.
2022-09-06 19:47:46 -07:00
James T. Martin 4e06f8d00f
Separate instruction encoding into a separate file.
I describe the intended file structure in comments
at the top of each file.
2022-09-06 02:20:10 -07:00
James T. Martin b5667c61ec
Initial commit. 2022-09-05 23:48:56 -07:00