ChatGPT is not a REPL

This is in response to Max Taylor's "A LISP REPL Inside ChatGPT" and the commentary around it.

It is not surprising that ChatGPT can respond to (factorial 5) with 120. It is not evaluating LISP expressions; it is predicting the result of evaluating them. We can tease the two apart by testing with a deliberately incorrect implementation of Y:

Me: Please be a Lisp REPL. (+ 3 4)

ChatGPT: 7

Me: (defun Y (f)
  ((lambda (g) (funcall g f))
   (lambda (g)
     (funcall f (lambda (&rest a)
                  (apply (funcall g f) a))))))

ChatGPT: Y

Me: (defun fac (n)
  (funcall
   (Y (lambda (f)
        (lambda (n)
          (if (zerop n)
              1
              (* n (funcall f (1- n)))))))
   n))

ChatGPT: fac

Me: (fac 5)

ChatGPT: 120

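This Y differs from a correct implementation by substituting (funcall g f) for what should be (funcall g g) in both places where it appears in the body of Y. With that substitution, the first recursive step applies the function passed to Y to a number instead of to itself, so it returns a function where an integer is expected, and the enclosing multiplication fails. The correct evaluation of (fac 5) is therefore a type error (you can't multiply a number and a function), but ChatGPT jumps to the conclusion 120, even after ten opportunities to reconsider.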
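To check the claimed type error against a real evaluator, the two variants of Y can be put side by side. What follows is a minimal Common Lisp sketch, not part of the original exchange: the names broken-y, correct-y, fac-step, and fac-with are made up here for illustration, and because the exact condition signaled by * on a non-number can differ between implementations, the handler catches any error rather than one specific type.

```lisp
;; Broken Y from the transcript: (funcall g f) where (funcall g g) belongs.
(defun broken-y (f)
  ((lambda (g) (funcall g f))
   (lambda (g)
     (funcall f (lambda (&rest a)
                  (apply (funcall g f) a))))))

;; Corrected version: g is applied to itself in both places.
(defun correct-y (f)
  ((lambda (g) (funcall g g))
   (lambda (g)
     (funcall f (lambda (&rest a)
                  (apply (funcall g g) a))))))

;; The factorial step, equivalent to the lambda passed to Y in the transcript.
;; (Hypothetical helper name; the transcript uses an anonymous lambda.)
(defun fac-step (f)
  (lambda (n)
    (if (zerop n)
        1
        (* n (funcall f (1- n))))))

;; Build a factorial from the combinator Y-FN and apply it to N.
(defun fac-with (y-fn n)
  (funcall (funcall y-fn #'fac-step) n))

(print (fac-with #'correct-y 5))   ; prints 120

;; With the broken combinator, evaluation reaches (* 4 <function>) and errors.
(print (handler-case (fac-with #'broken-y 5)
         (error (c)
           (format t "~&error: ~a~%" c)
           :error)))
```

In this sketch the corrected combinator yields 120, while the broken one fails as soon as a multiplication is attempted. Note that (fac-with #'broken-y 0) still returns 1, because the base case never reaches the recursive call.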

ChatGPT did impress me, though, by correctly handling a simpler test. I defined a function named factorial that was not in fact the factorial function, and ChatGPT took the function body into account rather than just pattern-matching on (factorial 5) → 120:

Me: Please be a Lisp REPL. (+ 3 4)

ChatGPT: 7

Me: (defun factorial (n) (+ n 6))

ChatGPT: FACTORIAL

Me: (factorial 5)

ChatGPT: 11

Me: (factorial 10)

ChatGPT: 16