Make the parser more forgiving, re-add `let`, update README.

2020-11-02 15:59:35 -08:00 · 2020-11-02 15:59:35 -08:00 · 3b2dd67fe7
parent 0d821ccce1
commit 3b2dd67fe7
8 changed files with 74 additions and 111 deletions
--- a/README.md
+++ b/README.md
@ -1,109 +1,32 @@
 # James Martin's Lambda Calculus
-This project is a tool for me to learn about implementing various concepts from programming language theory, in particular those that relate to the lambda calculus.
-I also hope to equip it with enough useful features that it is a usable for real programming tasks, at least as a toy.
-Ideally, this will also be a useful tool for *others* to learn about the lambda calculus through experimentation.
+This is a simple implementation of the untyped lambda calculus
+with an emphasis on clear, readable Haskell code.

 ## Usage
 Type in your expression at the prompt: `>> `.
-The result of the evaluation of that expression will then be printed out.
-Exit the prompt with `Ctrl-C` (or however else you kill a program in your terminal).
-
-Bound variables will be printed followed by a number representing the number of binders
-between it and its definition for disambiguation.
+The expression will be evaluated to normal form and then printed.
+Exit the prompt with `Ctrl-c` (or equivalent).

 ### Example session
 ```
->> let D = (\x. x x) in let F = (\f. f (f y)) in D (F ())
-y y
->> let T = (\f x. f (f x)) in (\f x. T (T (T (T T))) f x) () y
+>> let D = (\x. x x) in let F = (\f. f (f y)) in D (F (\x. x))
+(y y)
+>> let T = (\f x. f (f x)) in (\f x. T (T (T (T T))) f x) (\x. x) y
 y
->> \x. \y. y x
-\x. \y. y:0 x:1
+>> (\x y z. x y) y
+(\y'. (\z. (y y')))
 >> ^C
 ```

 ## Notation
 [Conventional Lambda Calculus notation applies](https://en.wikipedia.org/wiki/Lambda_calculus_definition#Notation),
-with the exception that variable names are mmultiple characters long,
-and `\` is used in lieu of `λ` for convenience.
+with the exception that variable names are multiple characters long,
+`\` is used in lieu of `λ` to make it easier to type,
+and spaces are used to separate variables rather than commas.

 * Variable names are alphanumeric, beginning with a letter.
 * Outermost parentheses may be dropped: `M N` is equivalent to `(M N)`.
 * Applications are left-associative: `M N P` may be written instead of `((M N) P)`.
-* The body of an abstraction extends as far right as possible: `\x. M N` means `\x.(M N)` and not ``(\x. M) N`.
+* The body of an abstraction or let expression extends as far right as possible: `\x. M N` means `\x.(M N)` and not `(\x. M) N`.
 * A sequence of abstractions may be contracted: `\foo. \bar. \baz. N` may be abbreviated as `\foo bar baz. N`.
-* Variables may be bound using let expressions: `let x = N in M` abbreviates `(\x. N) M`.
-
-### Violations of convention
-* I use spaces to separate variables in abstractions instead of commas because I think it looks better.
-
-### Additional extensions to notation
-Since `\x. x` is the left identity of applications and application syntax is left-associative,
-I (syntactically) permit unary and nullary applications so that `()` is `\x. x`, and `(x)` is `x`.
-
-On the same principle, the syntax of a lambda of no variables `\. e` is `e`.
-
-## Roadmap
-### Complete
-* Type systems:
-  * Untyped
-* Representations:
-  * The syntax tree
-  * Reverse de Bruijn
-* Syntax:
-  * Basic syntax
-  * Let expressions
-* Evaluation strategies:
-  * Lazy (call-by-name to normal form)
-
-### Planned
-Not all of these will necessarily (or even probably) be implemented.
-This is more-or-less a wishlist of things I'd like to try to implement some day.
-* Built-ins:
-  * Integers
-  * Characters
-  * Strings
-  * Lists
-* Type systems:
-  * all of the systems of the Lambda Cube
-  * Hindley-Milner
-  * and the calculus of (co)inductive constructions
-    * and something based on cubical TT
-    * and something with universe polymorphism
-    * and something with insanely dependent types
-    * and support for tactics
-  * and something with non-trivial subtyping
-  * and something with row polymorphism
-  * and something with typeclasses/constraints
-  * and something with irrelevance (runtime, true irrelevance, prop)
-  * and something with iso/equirecursive types?
-  * (classical?) linear types
-    * something with lifetimes, like Rust
-    * something that would work on a quantum computer, at least in theory
-    * something with proof nets?
-* Macros, fexprs
-* (Delimited) continuations
-  * Something based on lambda-mu?
-* Effects:
-  * A (co)effects system.
-  * Call-by-push-value.
-* Representations:
-  * A more conservative syntax tree that would allow for better error messages
-* Evaluation strategies:
-  * The evaluation strategies documented by Thierry(?)
-    * Full laziness
-    * Complete laziness
-  * Optimal
-* Syntax:
-  * Top-level definitions
-  * Type annotations
-  * `let*`, `letrec`
-  * Pretty-printing mode.
-  * Indentation-based syntax.
-* Features:
-  * A better REPL (e.g. the ability to edit the line buffer)
-  * The ability to import external files
-    * A good module system?
-  * The ability to choose the type system or evaluation strategy
-  * Better error messages for parsing and typechecking
-  * Reduction stepping
+* Variables may be bound using let expressions: `let x = N in M` is syntactic sugar for `(\x. N) M`.
--- a/app/Main.hs
+++ b/app/Main.hs
@ -4,7 +4,7 @@ module Main where
 import Control.Monad (forever)
 import Data.Text
 import qualified Data.Text.IO as TIO
-import LambdaCalculus.Expression (lazyEval)
+import LambdaCalculus.Expression (eagerEval)
 import LambdaCalculus.Parser (parseExpression)
 import System.IO (hFlush, stdout)

@ -17,4 +17,4 @@ prompt text = do
 main :: IO ()
 main = forever $ parseExpression <$> prompt ">> " >>= \case
  Left parseError -> putStrLn $ "Parse error: " ++ show parseError
-  Right expr -> print $ lazyEval expr
+  Right expr -> print $ eagerEval expr
--- a/package.yaml
+++ b/package.yaml
@ -4,8 +4,8 @@ github:              "jamestmartin/lambda-calculus"
 license:             GPL-3
 author:              "James Martin"
 maintainer:          "james@jtmar.me"
-copyright:           "2019 James Martin"
-synopsis:            "Implementations of various Lambda Calculus evaluators and type systems."
+copyright:           "2019-2020 James Martin"
+synopsis:            "A simple implementation of the lambda calculus."
 category:            LambdaCalculus
 description:         Please see the README on GitHub at <https://github.com/jamestmartin/lambda-calculus#readme>

@ -16,7 +16,7 @@ default-extensions:
 - OverloadedStrings

 dependencies:
- base >= 4.12 && < 5
+- base >= 4.13 && < 5
 - parsec >= 3.1 && < 4
 - text >= 1.2 && < 2
 - text-show >= 3.8 && < 4
--- a/src/LambdaCalculus/Expression.hs
+++ b/src/LambdaCalculus/Expression.hs
@ -19,7 +19,7 @@ data Expression
 instance TextShow Expression where
  showb (Variable var) = fromText var
  showb (Application ef ex) = "(" <> showb ef <> " " <> showb ex <> ")"
-  showb (Abstraction var body) = "(^" <> fromText var <> "." <> showb body <> ")"
+  showb (Abstraction var body) = "(\\" <> fromText var <> ". " <> showb body <> ")"

 instance Show Expression where
  show = T.unpack . showt
--- a/src/LambdaCalculus/Parser.hs
+++ b/src/LambdaCalculus/Parser.hs
@ -12,28 +12,51 @@ spaces :: Parser ()
 spaces = void $ many1 space

 variableName :: Parser Text
-variableName = T.pack <$> many1 letter
+variableName = do
+  notFollowedBy anyKeyword
+  T.pack <$> many1 letter
+  where anyKeyword = choice $ map keyword keywords
+        keywords = ["let", "in"]
+
+-- | Match an exact string which is not just a substring
+-- of a larger variable name.
+keyword :: Text -> Parser ()
+keyword kwd = try $ do
+  void $ string (T.unpack kwd)
+  notFollowedBy letter

 variable :: Parser Expression
 variable = Variable <$> variableName

 application :: Parser Expression
-application = foldl1 Application <$> sepBy1 applicationTerm spaces
+application = foldl1 Application <$> sepEndBy1 applicationTerm spaces
  where applicationTerm :: Parser Expression
-        applicationTerm = variable <|> abstraction <|> grouping
+        applicationTerm = variable <|> abstraction <|> let_ <|> grouping
          where grouping :: Parser Expression
                grouping = between (char '(') (char ')') expression

 abstraction :: Parser Expression
 abstraction = do
-  char '^'
-  names <- sepBy1 variableName spaces
+  char '\\'
+  optional spaces
+  names <- sepEndBy1 variableName spaces
  char '.'
+  optional spaces
  body <- expression
  pure $ foldr Abstraction body names

+let_ :: Parser Expression
+let_ = do
+  keyword "let"
+  name <- between spaces (optional spaces) variableName
+  char '='
+  value <- expression
+  keyword "in"
+  body <- expression
+  pure $ Application (Abstraction name body) value
+
 expression :: Parser Expression
-expression = abstraction <|> application <|> variable
+expression = optional spaces *> (abstraction <|> let_ <|> application <|> variable) <* optional spaces

 parseExpression :: Text -> Either ParseError Expression
-parseExpression code = parse (expression <* eof) "input" code
+parseExpression = parse (expression <* eof) "input"
--- a/stack.yaml
+++ b/stack.yaml
@ -1,3 +1,3 @@
-resolver: lts-14.17
+resolver: lts-16.20
 packages:
 - .
--- a/stack.yaml.lock
+++ b/stack.yaml.lock
@ -6,7 +6,7 @@
 packages: []
 snapshots:
 - completed:
-    size: 524799
-    url: https://raw.githubusercontent.com/commercialhaskell/stackage-snapshots/master/lts/14/17.yaml
-    sha256: 1d72b33c0fc048e23f4f18fd76a6ad79dd1d8a3c054644098a71a09855e40c7c
-  original: lts-14.17
+    size: 532177
+    url: https://raw.githubusercontent.com/commercialhaskell/stackage-snapshots/master/lts/16/20.yaml
+    sha256: 0e14ba5603f01e8496e8984fd84b545a012ca723f51a098c6c9d3694e404dc6d
+  original: lts-16.20
--- a/test/Spec.hs
+++ b/test/Spec.hs
@ -15,12 +15,19 @@ instance Arbitrary Expression where
 instance Arbitrary T.Text where
  arbitrary = T.pack <$> listOf1 (elements $ ['A'..'Z'] ++ ['a'..'z'])

+-- These are terms which have complex reduction steps and
+-- are likely to catch bugs in the substitution function, if there are any.
+-- However, they don't have any particular computational *meaning*,
+-- so the names for them are somewhat arbitrary.
+
+-- This should evaluate to `y y`.
 dfi :: Expression
 dfi = Application d (Application f i)
  where d = Abstraction "x" $ Application (Variable "x") (Variable "x")
        f = Abstraction "f" $ Application (Variable "f") (Application (Variable "f") (Variable "y"))
        i = Abstraction "x" $ Variable "x"

+-- This should evalaute to `y`.
 ttttt :: Expression
 ttttt = Application (Application (Application f t) (Abstraction "x" (Variable "x"))) (Variable "y")
  where t = Abstraction "f" $ Abstraction "x" $
@ -47,10 +54,20 @@ main = defaultMain $
    ]
  , testGroup "Parser tests"
    [ testGroup "Unit tests"
-      [ testCase "identity" $ parseExpression "^x.x" @?= Right (Abstraction "x" $ Variable "x")
+      [ testCase "identity" $ parseExpression "\\x.x" @?= Right (Abstraction "x" $ Variable "x")
+      -- This syntax is forbidden because it interacts poorly with other syntax, e.g. `let x=in` becoming a valid program.
+      --, testCase "nullary application" $ parseExpression "()" @?= Right (Abstraction "x" $ Variable "x")
+      , testCase "unary application" $ parseExpression "(x)" @?= Right (Variable "x")
      , testCase "application shorthand" $ parseExpression "a b c d" @?= Right (Application (Application (Application (Variable "a") (Variable "b")) (Variable "c")) (Variable "d"))
-      , testCase "ttttt" $ parseExpression "(^T f x.(T (T (T (T T)))) f x) (^f x.f (f x)) (^x.x) y"
+      , testCase "let" $ parseExpression "let x = \\y.y in x" @?= Right (Application (Abstraction "x" (Variable "x")) (Abstraction "y" (Variable "y")))
+      , testCase "ttttt" $ parseExpression "(\\T f x.(T (T (T (T T)))) f x) (\\f x.f (f x)) (\\x.x) y"
          @?= Right ttttt
+      , testGroup "Redundant whitespace"
+        [ testCase "around variable" $ parseExpression " x " @?= Right (Variable "x")
+        , testCase "around lambda" $ parseExpression " \\ x   y . x " @?= Right (Abstraction "x" $ Abstraction "y" $ Variable "x")
+        , testCase "around application" $ parseExpression " ( x   (y ) ) " @?= Right (Application (Variable "x") (Variable "y"))
+        , testCase "around let" $ parseExpression "  let x=(y)in x  " @?= Right (Application (Abstraction "x" (Variable "x")) (Variable "y"))
+        ]
      ]
    , testProperty "parseExpression is the left inverse of show" prop_parseExpression_inverse
    ]