SIP-47 - Clause Interleaving #47

Merged
merged 8 commits into from
Oct 21, 2022
178 changes: 178 additions & 0 deletions content/clause-interleaving.md
@@ -0,0 +1,178 @@
---
layout: sip
title: SIP-47 Clause Interleaving

I'd like to mention once again that it should be "Parameter Clause Interleaving" as we're not doing that with if-clauses here.

Contributor Author

I appreciate your concern, however this has not caused confusion for anyone else, and there is precedent for omitting "parameter" from the name:
https://docs.scala-lang.org/scala3/reference/contextual/using-clauses.html
I will therefore not change the name for now, but I encourage affected others to comment on this potential problem

> I appreciate your concern, however this has not caused confusion for anyone else, and there is precedent for omitting "parameter" from the name: https://docs.scala-lang.org/scala3/reference/contextual/using-clauses.html I will therefore not change the name for now, but I encourage affected others to comment on this potential problem

it's not about omitting the word "parameter", it's about naming the type of clause. the word "using" names the clause in your other example.

stage: implementation
status: waiting-for-implementation
permalink: /sips/clause-interleaving.html
---

**By: Quentin Bernet and Guillaume Martres and Sébastien Doeraene**

## History

| Date | Version |
|---------------|-----------------------|
| May 5th 2022 | Initial Draft |
| Aug 17th 2022 | Formatting |
| Sep 22nd 2022 | Type Currying removed |

## Summary

We propose to generalize method signatures to allow any number of type parameter lists, interleaved with term parameter lists and using parameter lists. As a simple example, it would allow defining
~~~ scala
def pair[A](a: A)[B](b: B): (A, B) = (a, b)
~~~
Here is also a more complicated and contrived example that highlights all the possible interactions:
~~~ scala
def foo[A](using a: A)(b: List[A])[C <: a.type, D](cd: (C, D))[E]: Foo[A, C, D, E]
~~~
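As a rough illustration of how the simple `pair` signature above might be used, the following sketch shows call sites with inferred and explicit type arguments. It assumes a Scala 3 compiler that accepts interleaved clauses; the value names are purely illustrative.

```scala
// Assumes a compiler that supports interleaved clauses.
def pair[A](a: A)[B](b: B): (A, B) = (a, b)

// Both type clauses are inferred from the arguments.
val inferred = pair(1)("one")

// Each type clause is supplied right before the term clause it relates to.
val explicit = pair[Int](1)[String]("one")

// Only the first type clause is explicit; the literal 1 is typed as a Long.
val mixed = pair[Long](1)("one")
```

Each type argument list is written next to the term clause it constrains, and any of them can be left to inference independently of the others.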


## Motivation

We motivate the feature with the following use case:

* a `getOrElse` method for a heterogeneous key-value store, which is an instance of wanting a type parameter whose bounds are path-dependent on a term parameter.

### Heterogeneous key-value store
Consider an API for a heterogeneous key-value store, where keys know what type of value they must be associated with:
~~~ scala
trait Key:
  type Value

class Store:
  def get(key: Key): key.Value = …
  def put(key: Key)(value: => key.Value): Unit = …
~~~
We want to provide a method `getOrElse`, taking a default value to be used if the key is not present. Such a method could look like
~~~ scala
def getOrElse(key: Key)(default: => key.Value): key.Value = …
~~~
However, at the call site, this signature rejects any default value that is not a valid `key.Value`. This is a limitation compared to other `getOrElse`-style methods such as that of `Option`, which accept a default of any supertype of the element type.
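The contrast with `Option.getOrElse` can be sketched as follows. This is only an illustration: the `Port` key and the `Map`-backed body of `Store` are assumptions made for the example and are not part of the proposal. It compiles without the proposed feature.

```scala
import scala.collection.mutable

trait Key:
  type Value

// Hypothetical key, introduced only for this illustration.
object Port extends Key:
  type Value = Int

class Store:
  // Assumed backing storage; the proposal leaves the implementation elided.
  private val data = mutable.Map.empty[Key, Any]

  def put(key: Key)(value: key.Value): Unit = data.update(key, value)

  // The restrictive signature: the default must be exactly a `key.Value`.
  def getOrElse(key: Key)(default: => key.Value): key.Value =
    data.get(key) match
      case Some(v) => v.asInstanceOf[key.Value]
      case None    => default

val store = Store()

// Accepted: the default is a valid `Port.Value`.
val port: Int = store.getOrElse(Port)(8080)

// Rejected by the compiler: "unset" is a String, not a `Port.Value`.
// val bad = store.getOrElse(Port)("unset")

// `Option.getOrElse` is declared as `getOrElse[B >: A](default: => B): B`,
// so a default of another type is accepted and the result type is widened.
val absent: Option[Int] = None
val widened: Any = absent.getOrElse("unset")
```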

In current Scala, there is no way to define `Store.getOrElse` in a way that supports this use case. We may try to define it as
~~~ scala
def getOrElse[V >: key.Value](key: Key)(default: => V): V = …
~~~
but that is not valid because the declaration of `V` needs to refer to the path-dependent type `key.Value`, which is defined in a later parameter list.

We might also try to move the type parameter list after `key` to avoid that problem, as
~~~ scala
def getOrElse(key: Key)[V >: key.Value](default: => V): V = …
~~~
but that is also invalid because type parameter lists must always come first.

A workaround is to return an intermediate object with an `apply` method, as follows:
~~~ scala
class Store:
  final class StoreGetOrElse[K <: Key](val key: K):
    def apply[V >: key.Value](default: => V): V = …
  def getOrElse(key: Key): StoreGetOrElse[key.type] = StoreGetOrElse(key)
~~~
This definition provides the expected source API at call site, but it has several issues:
* It is more complex than expected, forcing a user looking at the API to navigate to the definition of `StoreGetOrElse` to make sense of it.
* It is inefficient, as an intermediate instance of `StoreGetOrElse` must be created for each call to `getOrElse`.
* It interacts poorly with overloading. Overloading resolution looks at clauses after the first one, but only within a single method signature. The workaround above is therefore ambiguous with any `def getOrElse(k: Key): ...`, whereas the proposed signature is not ambiguous with, for example, `def getOrElse(k: Key)[A, B](x: A, y: B)`.
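For reference, here is a minimal runnable sketch of the intermediate-object workaround. The `Map`-backed storage, the `put` method taking a by-value argument, and the `Port` key are assumptions added for illustration; only the shape of `StoreGetOrElse` and `getOrElse` comes from the text above.

```scala
import scala.collection.mutable

trait Key:
  type Value

// Hypothetical key, introduced only for this illustration.
object Port extends Key:
  type Value = Int

class Store:
  // Assumed backing storage, not specified by the proposal.
  private val data = mutable.Map.empty[Key, Any]

  def put(key: Key)(value: key.Value): Unit = data.update(key, value)

  // The intermediate object: one instance is allocated per `getOrElse` call.
  final class StoreGetOrElse[K <: Key](val key: K):
    def apply[V >: key.Value](default: => V): V =
      data.get(key) match
        case Some(v) => v.asInstanceOf[key.Value]
        case None    => default

  def getOrElse(key: Key): StoreGetOrElse[key.type] = StoreGetOrElse(key)

val store = Store()

// The call site reads like a two-clause method call, but it is really
// `store.getOrElse(Port).apply("unset")` on a freshly allocated object.
val beforePut: Any = store.getOrElse(Port)("unset")
```

The extra class and the per-call allocation are visible here, which is the cost the bullets above describe.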

Another workaround is to return a polymorphic function, for example:
~~~scala
def getOrElse(k: Key): [V >: k.Value] => (default: V) => V =
  [V] => (default: V) => ???
~~~
While again, this provides the expected API at call site, it also has issues:
* The behavior is not the same, as `default` has to be a by-value parameter
* The definition is hard to visually parse, as users are more used to methods (and it is our opinion this should remain so)
* The definition is cumbersome to write, especially if there are a lot of term parameters
* Methods containing curried type clauses like `def foo[A][B](x: B)` cannot be represented in this way, as a polymorphic function type always requires a term parameter list right after its type parameters.
* It is inefficient, as many closures must be created for each call to `getOrElse` (one per term clause to the right of the first non-initial type clause).
* Same problem as above with overloading

## Proposed solution
### High-level overview

To solve the above problems, we propose to generalize method signatures so that they can have multiple type parameter lists, interleaved with term parameter lists and using parameter lists.

For the heterogeneous key-value store example, this allows `getOrElse` to be defined as follows:
~~~ scala
def getOrElse(key: Key)[V >: key.Value](default: => V): V = …
~~~
It provides the best of all worlds:
* A convenient API at call site
* A single point of documentation
* Efficiency, since the method erases to a single JVM method with signature `getOrElse(Object,Object)Object`
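A minimal runnable sketch of the proposed signature is shown below. It assumes a Scala 3 compiler that accepts interleaved clauses. As before, the `Map`-backed storage, the by-value `put`, and the `Port` key are hypothetical and serve only to make the example self-contained.

```scala
import scala.collection.mutable

trait Key:
  type Value

// Hypothetical key, introduced only for this illustration.
object Port extends Key:
  type Value = Int

class Store:
  // Assumed backing storage, not specified by the proposal.
  private val data = mutable.Map.empty[Key, Any]

  def put(key: Key)(value: key.Value): Unit = data.update(key, value)

  // The type clause comes after `key`, so its bound can mention `key.Value`.
  def getOrElse(key: Key)[V >: key.Value](default: => V): V =
    data.get(key) match
      case Some(v) => v.asInstanceOf[key.Value]
      case None    => default

val store = Store()

// `V` is inferred; a default that is not an Int is now accepted.
val inferred: Any = store.getOrElse(Port)("unset")

// `V` can also be given explicitly, right after the clause it depends on.
val explicit = store.getOrElse(Port)[Any]("unset")
```

Compared with the workarounds, there is a single method definition and no intermediate object or closure at the call site.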

### Specification
We amend the syntax of def parameter clauses as follows:

~~~
DefDcl ::= DefSig ‘:’ Type
DefDef ::= DefSig [‘:’ Type] ‘=’ Expr
DefSig ::= id [DefParamClauses] [DefImplicitClause]
DefParamClauses ::= DefParamClauseChunk {DefParamClauseChunk}
DefParamClauseChunk ::= [DefTypeParamClause] TermOrUsingParamClause {TermOrUsingParamClause}

does this mean that this is not valid anymore:

  def foo[A](implicit a: A) = ...

Contributor Author

Oh, nicely spotted, my intent was for it to be allowed,
but it appears I made a mistake with the grammar

Since the implementation team (which is also myself) can make changes to the proposal, I believe this can fall under this umbrella
I will think about what's the clearest way to fix the grammar

Contributor Author

@Sporarum Sporarum Oct 26, 2022

This should fix it:

  DefDef                 ::=  DefSig [‘:’ Type] ‘=’ Expr
- DefSig                 ::=  id [DefParamClauses] [DefImplicitClause]
+ DefSig                 ::=  id [DefParamClauses] [DefTypeParamClause] [DefImplicitClause]
  DefParamClauses        ::=  DefParamClauseChunk {DefParamClauseChunk}
  DefParamClauseChunk    ::=  [DefTypeParamClause] TermOrUsingParamClause {TermOrUsingParamClause}

There is another quirk of this definition however, it allows to parse def f(a: Int)(b: Int) in two different ways (one vs two DefParamClauseChunk), is this an issue ?

Contributor

@julienrf julienrf Oct 27, 2022

I didn’t have a look at the complete grammar but what you propose does not look correct, what about the following?

DefDef                 ::=  DefSig [‘:’ Type] ‘=’ Expr
DefSig                 ::=  id [DefParamClauses]
DefParamClauses        ::=  DefParamClauseChunk {DefParamClauseChunk}
DefParamClauseChunk    ::=  DefTypeParamClause | TermOrUsingParamClause | DefImplicitClause

Alas, this allows two consecutive type clauses, or multiple implicit clauses.

here's another option:

DefSig                 ::=  id [DefParamClauses] [DefImplicitClause]
DefParamClauses        ::=  DefParamClauseChunkTypeOpt {DefParamClauseChunkTypeReq}
DefParamClauseChunkTypeOpt    ::=  [DefTypeParamClause] {TermOrUsingParamClause}
DefParamClauseChunkTypeReq    ::=  DefTypeParamClause {TermOrUsingParamClause}

ah, that doesn't work either... perhaps this:

DefSig                 ::=  id [DefParamClauses] [DefImplicitClause]
DefParamClauses        ::=  [DefTypeParamClause] {DefParamClauseChunk} [TermOrUsingParamClause]
DefParamClauseChunk    ::=  TermOrUsingParamClause {TermOrUsingParamClause} DefTypeParamClause

Contributor Author

@Sporarum Sporarum Nov 28, 2022

This should fix it:

  DefDef                 ::=  DefSig [‘:’ Type] ‘=’ Expr
- DefSig                 ::=  id [DefParamClauses] [DefImplicitClause]
+ DefSig                 ::=  id [DefParamClauses] [DefTypeParamClause] [DefImplicitClause]
  DefParamClauses        ::=  DefParamClauseChunk {DefParamClauseChunk}
  DefParamClauseChunk    ::=  [DefTypeParamClause] TermOrUsingParamClause {TermOrUsingParamClause}

There is another quirk of this definition however, it allows to parse def f(a: Int)(b: Int) in two different ways (one vs two DefParamClauseChunk), is this an issue ?

I didn’t have a look at the complete grammar but what you propose does not look correct

@julienrf Could you expand on this, I can't manage to find a failing example with the above syntax

Contributor Author

I talked about it some more with @julienrf and we came to a very easy and clean solution:
Use the pre-"no type currying" syntax, and add an end of line comment that says "and type clauses cannot be adjacent"

Contributor Author

@Sporarum Sporarum Nov 29, 2022

- DefSig                 ::=  id [DefParamClauses] [DefImplicitClause]
+ DefSig                 ::=  id [DefParamClauses] [DefImplicitClause]    -- and two DefTypeParamClause cannot be adjacent
- DefParamClauses        ::=  DefParamClauseChunk {DefParamClauseChunk}
- DefParamClauseChunk    ::=  [DefTypeParamClause] TermOrUsingParamClause {TermOrUsingParamClause}
- TermOrUsingParamClause ::=  DefTermParamClause
-                          |  UsingParamClause
+ DefParamClauses        ::= DefParamClause { DefParamClause }
+ DefParamClause         ::= DefTypeParamClause
+                          | DefTermParamClause
+                          |  UsingParamClause

I think this syntax is easier to understand than the formal one

@julienrf How should I go about updating the proposal ?

Contributor

Please open a PR in this repository with your changes.

TermOrUsingParamClause ::= DefTermParamClause
| UsingParamClause
DefTypeParamClause ::= [nl] ‘[’ DefTypeParam {‘,’ DefTypeParam} ‘]’
DefTypeParam ::= {Annotation} id [HkTypeParamClause] TypeParamBounds
DefTermParamClause ::= [nl] ‘(’ [DefTermParams] ‘)’
UsingParamClause ::= [nl] ‘(’ ‘using’ (DefTermParams | FunArgTypes) ‘)’
DefImplicitClause ::= [nl] ‘(’ ‘implicit’ DefTermParams ‘)’
Comment on lines +118 to +119

I'd also rename

TermOrUsingParamClause ::=  DefTermParamClause
                         |  UsingParamClause
UsingParamClause       ::=  [nl] ‘(’ ‘using’ (DefTermParams | FunArgTypes) ‘)’
DefImplicitClause      ::=  [nl] ‘(’ ‘implicit’ DefTermParams ‘)’

to

DefNonTypeParamClause  ::=  DefTermParamClause |  DefUsingParamClause
DefUsingParamClause    ::=  [nl] ‘(’ ‘using’ (DefTermParams | FunArgTypes) ‘)’
DefImplicitParamClause ::=  [nl] ‘(’ ‘implicit’ DefTermParams ‘)’

DefTermParams ::= DefTermParam {‘,’ DefTermParam}
DefTermParam ::= {Annotation} [‘inline’] Param
Param ::= id ‘:’ ParamType [‘=’ Expr]
~~~

The main rules of interest are `DefParamClauses` and `DefParamClauseChunk`, which now allow any number of type parameter clauses, term parameter clauses and using parameter clauses, in any order as long as there are no two adjacent type clauses.

Note that these rules are also used for the right-hand side of extension methods, so clause interleaving also applies to them.

It is worth pointing out that there can still only be at most one implicit parameter clause, which, if present, must be at the end.

The type system and semantics naturally generalize to these new method signatures.
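The rules above can be illustrated with a few signatures. This is a sketch that assumes a compiler supporting interleaved clauses; the method names (`largest`, `tagged`) are hypothetical, and the rejected forms are kept as comments so that the block still compiles.

```scala
// Accepted: two type clauses separated by a term clause, then a using clause.
def largest[A](xs: List[A])[B >: A](default: B)(using ord: Ordering[A]): B =
  if xs.isEmpty then default else xs.max

// Accepted: interleaving on the right-hand side of an extension method.
extension (x: Int)
  def tagged[A](label: A)[B](extra: B): (Int, A, B) = (x, label, extra)

// Rejected: two adjacent type clauses (type currying).
// def bad1[A][B](a: A, b: B): (A, B) = (a, b)

// Rejected: the implicit clause must be the last clause.
// def bad2(implicit n: Int)[A](a: A): A = a

val fromList = largest(List(3, 1, 2))(0)
val fromDefault = largest(List.empty[Int])("none")
val triple = 1.tagged("a")(true)
```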

### Restrictions

#### Type Currying
Type parameters cannot be curried, that is, two type clauses cannot be adjacent. Currying would allow partial type inference, and there is a significant concern that it would then become a recommended norm to _always_ curry type parameters.

Note however that, if absolutely necessary, it is still possible to curry type parameters as such: `def foo[A](using A =:= A)[B]`, since the implicit search for `A =:= A` should always succeed.
This is sufficiently unwieldy that it is unlikely the above becomes the norm.

#### Class Signatures
Class signatures are unchanged. Classes can still only have at most one type parameter list, which must come first. For example, the following definition is still invalid:
~~~ scala
class Pair[+A](val a: A)[+B](val b: B)
~~~
Class signatures already have limitations compared to def signatures. For example, they must have at least one term parameter list. There is therefore precedent for limiting their expressiveness compared to def parameter lists.

The rationale for this restriction is that classes also define associated types. It is unclear what the type of an instance of `Pair` with `A` and `B` should be. It could be defined as `Pair[A][B]`. That still leaves holes in terms of path-dependent types, as `B`'s definition could not depend on the path `a`. Allowing interleaved type parameters for class definitions is therefore restricted for now. It could be allowed with a follow-up proposal.

Note: As `apply` is a normal method, it is entirely possible to define a method `def apply[A](a: A)[B](b: B)` on `Pair`'s companion object, which allows creating instances with `Pair[Int](4)[Char]('c')`.
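A sketch of that companion-object approach is shown below, assuming a compiler that supports interleaved clauses. The class shape used here (a single type parameter list `[+A, +B]`) is an assumption chosen to satisfy the class-signature restriction; it is not prescribed by the proposal.

```scala
// The class itself keeps a single, leading type parameter list.
class Pair[+A, +B](val a: A, val b: B)

object Pair:
  // A normal method, so its type and term clauses may be interleaved.
  def apply[A](a: A)[B](b: B): Pair[A, B] = new Pair(a, b)

// Explicit type arguments, written next to the value they describe.
val p = Pair[Int](4)[Char]('c')

// Both type clauses can also be inferred.
val q = Pair("left")(2.0)
```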

#### LHS of extension methods
The left hand side of extension methods remains unchanged, since they only have one explicit term clause, and since the type parameters are very rarely passed explicitly, it is not as necessary to have multiple type clauses there.

Currently, Scala 2 can only call/override methods with at most one leading type parameter clause, which already forbids calling extension methods like `extension (x: Int) def bar[A](y: A)`, which desugars to `def bar(x: Int)[A](y: A)`. This proposal does not change this, so methods like `def foo[A](x: A)[B]` will not be callable from Scala 2.

### Compatibility
The proposal is expected to be backward source compatible. New signatures currently do not parse, and typing rules are unchanged for existing signatures.

Backward binary compatibility is straightforward.

Backward TASTy compatibility should be straightforward. The TASTy format is such that we can extend it to support interleaved type parameter lists without added complexity. If necessary, a version check can decide whether to read signatures in the new or old format. For typechecking, like for source compatibility, the typing rules are unchanged for signatures that were valid before.

Of course, libraries that choose to evolve their public API to take advantage of the new signatures may expose incompatibilities.

## Alternatives
The proposal is a natural generalization of method signatures.
We could have extended the proposal to type currying (allowing partial type inference), but have not due to the concerns mentioned in [Restrictions](#restrictions).
This might be the subject of a follow-up proposal, if the concerns can be addressed.

As discussed above, we may want to consider generalizing class parameter lists as well. However, we feel it is better to leave that extension to a follow-up proposal, if required.

## Related work
* Pre-SIP: https://contributors.scala-lang.org/t/clause-interweaving-allowing-def-f-t-x-t-u-y-u/5525
* An implementation of the proposal is available as a pull request at https://github.com/lampepfl/dotty/pull/14019

## FAQ
Currently empty.