Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add stream basic type #406

Closed
jclark opened this issue Jan 25, 2020 · 17 comments
Closed

Add stream basic type #406

jclark opened this issue Jan 25, 2020 · 17 comments
Assignees
Labels
enhancement Enhancement to language design lang Relates to the Ballerina language specification status/inprogress Fixes are in the process of being added
Milestone

Comments

@jclark
Copy link
Collaborator

jclark commented Jan 25, 2020

stream<T,E> is a new basic type representing a lazily constructed sequence of values of type T; it is iterable with error type E.

Define IteratorResult<T,E> to be record {| T value; |} | E?.

The stream:next(stream<T,E>) lang library function returns IteratorResult<T,E>. Other lang library functions as follows:

  • map - returns a stream and works lazily
  • filter - returns a stream and works lazily
  • foreach
  • reduce
  • iterator - returns Iterator<T,E>

The semantics of all the above are defined in terms of next. There is no length method.

Stream constructor has syntax like an anonymous function.

stream-constructor-expr := "stream" block-function-body

The return type of the function body block for a stream<T,E> is IteratorResult<T,E>.

This is part of #340. It depends on #402.

@jclark jclark added enhancement Enhancement to language design lang Relates to the Ballerina language specification labels Jan 25, 2020
@jclark jclark added this to the 2020R1 milestone Jan 25, 2020
@jclark jclark self-assigned this Jan 25, 2020
@grainier
Copy link

According to the spec;

stream-constructor-expr := "stream" function-body-block

So, in a situation where the stream is defined with var, how we suppose to infer the constraint type?

    int j = 0;

    var intStream = stream {
        j += 1;
        return j;
    };

However, if we take a function or a lambda, the return-type-descriptor of its function-signature gives the return-type. To keep that consistency and overcome above, can't we make the constraint type mandatory, after the "stream" keyword, similar to below?

stream-constructor-expr := "stream" < constraint-type > function-body-block

    int j = 0;

    var intStream = stream<int> {
        j += 1;
        return j;
    };

@jclark
Copy link
Collaborator Author

jclark commented Feb 10, 2020

You made a little mistake in your example: the body of stream is supposed to return the same kind of value as an iterator. So it should be return { value: j }; not return j;.

As currently designed:

int j = 0;

var intStream = stream {
      j += 1;
      return { value: j };
};

would be a compile-time error. This is similar to how arrow functions require a contextually expected type, e.g.

var func = x => x + 1;

is also a compile-time error. So you would have to write

int j = 0;

stream<int> intStream = stream {
        j += 1;
        return { value: j };
};

or (if for some reason you cannot specify a variable type)

int j = 0;

 var intStream = <stream<int>>stream {
        j += 1;
        return { value: j };
};

I recognize that this is not great, but

  • I don't like your suggested solution, because no existing constructor expression in the language uses a type parameter (a type in angle brackets)
  • using var is just a convenience; the normal case is that the variable has a type
  • using a stream constructor is a relatively advanced feature; most ordinary users will consume streams created by others

But I would welcome other ideas....

@jclark
Copy link
Collaborator Author

jclark commented Feb 21, 2020

We need to deal with the possibility that a stream will only be partially consumed, i.e. there is no call to the stream’s next method that returns nil or error. In this case, the stream does not have the opportunity to release resources (file handles, database connections) that it is using. We can provide a close operation on stream, but our stream constructor syntax does not provide a mechanism to specify code that runs on close. I think we may need to change our approach to stream constructor. Any thoughts @grainier?

@jclark
Copy link
Collaborator Author

jclark commented Feb 21, 2020

Another way to do the constructor would be for the spec to define an abstract basic type StreamProvider<T,E> which extends Iterator<T,E> with a close method. Then the stream constructor would be a stream(E) where E is an expression of type StreamProvider<T,E> or Iterator<T,E>. This would avoid the problem with var because the object would explicitly declare the function return types.

@jclark
Copy link
Collaborator Author

jclark commented Feb 23, 2020

We should also allow stream<T,*> just as we allow error<*>. With #426 this will allow e.g.

stream<Customer,*> = dbClient->query("SELECT xxx");

The * will allow the sql:Error to be inferred from the return type of query function, and the Customer will allow the typedesc parameter in the query method to default to a value of typedesc.

@gimantha
Copy link
Contributor

Another way to do the constructor would be for the spec to define an abstract basic type StreamProvider<T,E> which extends Iterator<T,E> with a close method. Then the stream constructor would be a stream(E) where E is an expression of type StreamProvider<T,E> or Iterator<T,E>. This would avoid the problem with var because the object would explicitly declare the function return types.

We have pushed the initial implementation of stream constructor and other stream related functions. In the current implementation, we desugar the stream constructor body to a lambda function and invoke that lambda function, every time the next function is called. So far this works fine, but as you mentioned, when the stream is partially consumed, we don't have a way to release the resources attached with the stream. For example, there is no API to close a stream returned from the new JDBC connector.
If we introduce this new basic type StreamProvider<T,E>, we can have two implementations; an internal implementation extending this StreamProvider<T,E> for stream constructor expressions and a different implementation extending StreamProvider<T,E> for JDBC connector. So this second implementation (which is for JDBC connector) can have its own close function to release the resources.

With this new basic type, will the current stream constructor syntax be changed to stream(StreamProvider<T,E> sp) ?

@jclark
Copy link
Collaborator Author

jclark commented Feb 24, 2020

The stream constructor syntax would become:

stream := "stream" "(" [expression] ")"

where the static type of expression would be required to belong to the built-in abstract object type StreamProvider<T,E> (note that the syntax StreamProvider<T,E> is not legal in Ballerina programs) or the built-in abstract object type Iterator<T,E>. If you leave out expression, you would get an empty stream (which would be of type stream<never,never>). If you supply an Iterator<T,E>, then the close on the stream would just null out the reference to the iterator.

I would expect, as you suggest, to have multiple implementations of a Java interface corresponding to StreamProvider.

As well as close, I suspect we may need to add methods relating to watermarks to support event streaming. This approach gives us the flexibility to do that.

@sameerajayasoma
Copy link
Contributor

sameerajayasoma commented Feb 25, 2020

I see StreamProvider<T, E> as an Iterator<T, E> that is closable.

If we introduce a new abstract object type called "Closable", we can use the term ClosableIterator instead of StreamProvider

    abstract object {
         public close() returns error?;
    }

Since we have a plan to introduce a mechanism to release resources (#427), defining a separate abstract object type will be useful.

@jclark
Copy link
Collaborator Author

jclark commented Feb 25, 2020

It's a spec-internal name, so it doesn't affect the language what it is called.

A Closeable abstract object type for #427 would need to have a __close method to be consistent.

@jclark
Copy link
Collaborator Author

jclark commented Feb 26, 2020

Another approach would be to define stream<T,E> as a predeclared, built-in name for an object type declared in lang.object using a variant of @typeParam. With this approach, new would be used to construct streams as normal with objects.

type Foo object {
    public function next() returns record {| xml value; |}|error? {
      ...
    }
}; 

function main() {
    Foo foo = new();
        var x = new stream<xml,error>(foo);
    // or
    stream<xml,error> x = new(foo);
}

lang.object would then include something along the lines of:

# Like @typeParam but scope is object type definition rather than function definition
# default for scope is "function"
@typeParam { scope: "object" }
type ErrorType error;

@typeParam { scope: "object" }
type ItemType any|error;

// This is for map and is method-scoped
@typeParam
type Type1 any|error;

type EmptyIterator object {
  public function next() returns () {
     return ();
  };
};

// This should probably not be public, i.e. users should stream<T,E>
// The binding from stream<T,E> to object:Stream is in the spec
type Stream object {
    private abstract object {
       public function next() returns record {| ItemType value; |}|ErrorType?;
    } impl;
    public function __init(abstract object { public function next() returns record {| ItemType value; |}|ErrorType?; } impl
                           = new EmptyIterator) {
       self.impl = impl;
    }
    public function __iterator() returns abstract object { public function next() returns record {| ItemType value; |}|ErrorType?; } {
       final var impl = self.impl;
       return new (object {
         public function next() returns record {| ItemType value; |}|ErrorType? {
	    return impl.next();
	 }
       });
    }
    public function next() returns record {| ItemType value; |}|ErrorType? {
      return self.impl.next();
    }
    public function 'map(function(ItemType val) returns Type1 func) returns stream<Type1,ErrorType> = external;
};

Phew!

@jclark
Copy link
Collaborator Author

jclark commented Feb 26, 2020

After talking to @hasithaa, we think making it an object type is doable, but requires compiler changes beyond what is realistic for 1.2. We accordingly suggest the following plan:

For 1.2:

  • stream is a distinct basic type; map, forEach etc functions for stream are defined in lang.stream module (similarly to lang.array)
  • we do not add special stream constructor syntax; instead stream reuses the current new-expr syntax, which is currently used for objects; the spec simply allows the type referenced in a new-expr to refer to a stream type as well as an object type
  • stream has preview status

For 1.3:

  • stream becomes a subtype of object
  • functions like map, forEach on streams become methods defined within the Stream object type

The change from 1.2 and 1.3 should not make a difference to most user code.

@sanjiva, @sameerajayasoma, @grainier What do you think?

@mohanvive
Copy link

@jclark @sameerajayasoma Then, shall we proceed with the above plan? There are some dependent tasks pending on this change.

@sanjiva
Copy link
Contributor

sanjiva commented Feb 26, 2020

Looks good .. +1. Lets go for it :).

@mohanvive
Copy link

Looks good .. +1. Lets go for it :).

Cool. We'll proceed with this..

jclark added a commit that referenced this issue Feb 26, 2020
jclark added a commit that referenced this issue Feb 26, 2020
Fix grammar for stream.
Add placeholder for explanation.
Add reference to lang.stream.
Rewrite stream.bal to match new stream design.
Part of #406.
@jclark jclark added the status/inprogress Fixes are in the process of being added label Feb 26, 2020
@jclark
Copy link
Collaborator Author

jclark commented Mar 10, 2020

@mohanvive @gimantha Changes for stream basic type are in 487b382. Please review and add any problems to this issue.

@grainier
Copy link

@jclark, if I understand correctly, according to the spec stream-type-parameters is optional;

stream-type-descriptor := stream [stream-type-parameters]

So, this means both implicit & explicit declaration of streams should support stream without a type parameter, correct?

type Foo object {
    int i = 0;
    public function next() returns record {| int value; |}? {
        self.i += 1;
        return { value: self.i };
    }
};

stream streamA = new(foo); // [1] implicit
var streamB = new stream(foo); // [2] explicit

Further, according to the spec;

A value belongs to a type stream (without the type-parameter) if it has basic type stream.

Here, can you please clarify a bit on what you meant by basic type stream? And, If the type-parameter is not present, what would be the value type?

@jclark
Copy link
Collaborator Author

jclark commented Mar 10, 2020

Basic type is defined in the spec.

Any stream value belongs to type stream, so it is the same as stream<any|error,error>, just as map is the same as map<any|error>.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Enhancement to language design lang Relates to the Ballerina language specification status/inprogress Fixes are in the process of being added
Projects
None yet
Development

No branches or pull requests

6 participants