Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LET rewriter #650

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open

LET rewriter #650

wants to merge 5 commits into from

Conversation

jpellegrini
Copy link
Contributor

@jpellegrini jpellegrini commented Jun 1, 2024

Basically, nested LETs and LET*s are joined into a single LET*.

This is because going through several ENTER-LET instructions is slower than a single ENTER-LET-STAR.

We also try to turn LETs into LET*s when possible.

When do we not do this?

  1. When the bindings of one LET shadows the bindings of the other
(LET ((a 1))
  (LET ((a 2)) body1) body2)

We obviously cannot turn this into

(LET* ((a 1)
       (a 2))
  body1
  body2)
  1. When one individual binding of the inner LET makes a reference to a symbol defined previously in it.
(LET ((a 1)
      (x (1+ a)) ;; a is NOT the one defined above!

We cannot turn this into

(LET* ((a 1)
       (x (1+ a))) ;; Oops, we now refer the the 'a' defined above!
  1. One of the LETs is a named LET.

  2. We do not rewrite SRFI-5 LETs.

  3. Finally: the expansion of LETREC is never optimized. This is because when we do so we break test 1.2 of the "R5RS Pitfalls" suite.

How is it done?

  • Item (5) above is attained by changing 'let' into '%let' in the expansion of LETREC (compiler.stk)
  • The rest is done by a handful of small procedures in comprewrite.stk, along with a new source rewriter for LET.

@jpellegrini
Copy link
Contributor Author

Some timings:

First, an extreme case -- six nested LETs

(let ((x #f))
  (time
   (repeat 30_000_000
           (let ((a 1))
             (let ((b 2))
               (let ((c 3))
                 (let ((d 4))
                   (let ((e 5))
                     (let ((f 6))
                   (set! x a)))))))))
  x)
;; 2210.84 ms

(let ((x #f))
  (time
   (repeat 30_000_000
           (let* ((a 1)
                  (b 2)
                  (c 3)
                  (d 4)
                  (e 5)
                  (f 6))
                 (set! x a))))
  x)
;; 1201.03 ms

Now the simpler case: two nested LETs. Here the difference is really not large, but LET* is still faster.

(let ((x #f))
  (time
   (repeat 30_000_000
           (let ((a 1))
             (let ((b 2))
               (set! x a)))))
  x)
;; 981.54 ms

(let ((x #f))
  (time
   (repeat 30_000_000
           (let* ((a 1)
                  (b 2))
             (set! x a))))
  x)
;; 920.21 ms

All timings are average of 20 repetitions, and only on an amd64 machine.

@jpellegrini
Copy link
Contributor Author

break test 1.2 of the "R5RS Pitfalls" suite.

Regarding this: we take the trouble to not optimize LETREC and insist in passing the test (!), but some mainstream Schemes do not pass that test:

  • Bigloo (actually eats up 100% CPU and does not finish)
  • Chibi (and Unsyntax)
  • Cyclone
  • Gauche
  • Racket

@jpellegrini
Copy link
Contributor Author

@egallesio this is ready -- passes all tests, and rewrites several LETs. It also seems to make the test suite run a tiny bit faster.

@jpellegrini jpellegrini changed the title Adds a LET rewriter LET rewriter Jun 1, 2024
Basically, nested LETs and LET*s are joined into a LET*.

This is because going through several ENTER-LET instructions is
slower than a single ENTER-LET-STAR.

When do we *not* do this?

1. When the bindings of one LET shadows the bindings of the other

(LET ((a 1))
  (LET ((a 2)) body1) body2)

We obviously cannot turn this into
(LET* ((a 2)
       (a 2))
  body1
  body2)

2. When one individual binding of one of the LETs defines a variable,
but also uses its name in subsequent definitions:

(LET ((a 1)
      (x (1+ a))) ...)  ;; a is NOT the one defined above!

We cannot turn this into

(LET* ((a 1)
       (x (1+ a))) ...) ;; Oops, we now refer the the 'a' defined above!

3. One of the LETs is a named LET.

4. Finally: the expanion of LETREC is never optimized. This is because
when we do so we break test 1.2 of the "R5RS Pitfalls" suite.

How is it done?

- Item (4) above is attained by changing 'let' into '%let' in the
  expansion of LETREC (compiler.stk)
- The rest is done by a handful of small procedures in
  'comprewrite.stk', along with a new source rewriter for LET.
@egallesio
Copy link
Owner

Hi @jpellegrini,

I had not really studied carefully your code, and not even tested it 😄
However, In general, I'm not sure that your rewriting is equivalent. Here is an example

(let ()
  (define b 1000)

  ;;; <---------->
  (let ((a 1))
    (let ((b 2))
      (set! b 100)
      (print b))
    (set! b 200)
    (print b))   ;; This one is the external b
  ;;; <---------->

  (print b))

==> 
100
200
200

If we rewrite the internal let we'll have

(let ()
  (define b 1000)

  ;;; <---------->
  (let* ((a 1)
         (b 2))
    (set! b 100)
    (print b)
    (set! b 200)
    (print b))   ;; This one is the internal b
  ;;; <---------->
  
  (print b))

==>
100
200
1000

That is, the rewriting is OK in the let, but if you have side effects, it is possible that we do not set the good variable. Intuitively, I think it is OK iff <body2> is empty.

@jpellegrini
Copy link
Contributor Author

Hi @egallesio - you're right! This is only valid if body2 is empty. Or, more precisely, if only one of the bodies of all LETs is non-empty... I can add that.

@jpellegrini
Copy link
Contributor Author

Okay, I have checked. Indeed, there is no way to guarantee that the LET being rewritten is safe. I'll turn this into draft, but I think it isn't feasible...

@jpellegrini jpellegrini marked this pull request as draft June 2, 2024 18:15
@jpellegrini
Copy link
Contributor Author

Okay, I have checked. Indeed, there is no way to guarantee that the LET being rewritten is safe. I'll turn this into draft, but I think it isn't feasible...

Although I have added some tests, including 3 nested LETs in the style you have suggested, and forbidding more than one non-empty body seems to pass them all. Anyway, maybe it's better to think about this more.

When a LET introduces more than one binding, and its bindings would
not introduce internal shadowing if turned into LET*, we turn the
LET into LET* (it's faster for several bindings).

Also reorganize the LET rewriter, and avoid rewriting SRFI-5 LETs
@jpellegrini
Copy link
Contributor Author

Hi @egallesio !
You're right - there can be nothing after the body of the inner LET. I have changed the patch, and also included tests.
I think it should work now...

@jpellegrini jpellegrini marked this pull request as ready for review June 3, 2024 19:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants