Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Protobuf compilation: schema and protocol #66

Merged
merged 98 commits into from
Feb 18, 2019
Merged

Conversation

rafaparadela
Copy link
Member

@rafaparadela rafaparadela commented Feb 13, 2019

Given a proto file, we wanted to generate of Scala code, and this PR does this.

Included goodness:

  • Proto3 and proto2 compatible
  • Infers messages across imported protos
  • Infers services
  • Deleted ScalaPB dependency.
  • Infers nested types
  • Map types
  • Coproducts
  • Add unit test to prove the generation of scala code
  • Update the docs to show an example

This is the example:

Protobuf

Parsing nested .proto files and converting into Scala code

Given these proto files below:

author.proto

syntax = "proto3";
package com.acme;

message Author {
    string name = 1;
    string nick = 2;
}

book.proto

syntax = "proto3";
package com.acme;
import "author.proto";

message Book {
    reserved 4, 8;
    reserved 12 to 15;
    int64 isbn = 1;
    string title = 2;
    repeated Author author = 3;
    BindingType binding_type = 9;
}

message GetBookRequest {
    int64 isbn = 1;
}

message GetBookViaAuthor {
    Author author = 1;
}

message BookStore {
    string name = 1;
    map<int64, string> books = 2;
    repeated Genre genres = 3;

    oneof payment_method {
        int64 credit_card_number = 4;
        int32 cash = 5;
        string iou_note = 6;
        Book barter = 7;
    }
}

enum Genre {
    option allow_alias = true;
    UNKNOWN = 0;
    SCIENCE_FICTION = 1;
    SPECULATIVE_FICTION = 1;
    POETRY = 2;
    SCI_FI = 1;
}

enum BindingType {
    HARDCOVER = 0;
    PAPERBACK = 1;
}

service BookService {
    rpc GetBook (GetBookRequest) returns (Book) {}
    rpc GetBooksViaAuthor (GetBookViaAuthor) returns (stream Book) {}
    rpc GetGreatestBook (stream GetBookRequest) returns (Book) {}
    rpc GetBooks (stream GetBookRequest) returns (stream Book) {}
}

We can parse and convert them into Scala code as:

val source = ParseProto.ProtoSource("book.proto", "resources")
val nativeDescriptor: NativeFile = parseProto[IO].parse(source).unsafeRunSync()

val parseNative: NativeFile => Protocol[Mu[ProtobufF]] = Protocol.fromProto(_)

val parseProtocol: Protocol[Mu[ProtobufF]] => mu.Protocol[Mu[MuF]] =
{ p: Protocol[Mu[ProtobufF]] => mu.Protocol.fromProtobufProto(p) }

val printProtocol: mu.Protocol[Mu[MuF]] => String = { p: mu.Protocol[Mu[MuF]] =>
  higherkindness.skeuomorph.mu.print.proto.print(p)
}

(parseNative andThen parseProtocol andThen printProtocol)(nativeDescriptor))

It would generate:

package com.acme

object book {

  @message final case class Author(name: String, nick: String)
  @message final case class Book(isbn: Long, title: String, author: List[Author], binding_type: BindingType)
  @message final case class GetBookRequest(isbn: Long)
  @message final case class GetBookViaAuthor(author: Author)
  @message final case class BookStore(name: String, books: Map[Long, String], genres: List[Genre], payment_method: Cop[Long :: Int :: String :: Book:: TNil])

  sealed trait Genre
  object Genre {
    case object UNKNOWN extends Genre
    case object SCIENCE_FICTION extends Genre
    case object POETRY extends Genre
  }

  sealed trait BindingType
  object BindingType {
    case object HARDCOVER extends BindingType
    case object PAPERBACK extends BindingType
  }

  @service(Protobuf) trait BookService[F[_]] {
    def GetBook(req: GetBookRequest): F[Book]
    def GetBooksViaAuthor(req: GetBookViaAuthor): Stream[F, Book]
    def GetGreatestBook(req: Stream[F, GetBookRequest]): F[Book]
    def GetBooks(req: Stream[F, GetBookRequest]): Stream[F, Book]
  }
}

rlmark and others added 30 commits November 29, 2018 10:56
…ess/skeuomorph into protobuf_compilation

# Conflicts:
#	src/main/scala/protobuf/schema.scala
@AdrianRaFo
Copy link

Few questions:

sealed trait Genre
  object Genre {
    case object UNKNOWN extends Genre
    case object SCIENCE_FICTION extends Genre
    case object POETRY extends Genre
  }

Why not use scala enum instead?

How do you define an option on proto3?

Should the scala generated code have some imports?

Is this supporting bigdecimal?

@rafaparadela
Copy link
Member Author

@juanpedromoreno Interesting. I barely haven't changed the printer, but yes, I should have detected this issue. I'm on it.

@rafaparadela
Copy link
Member Author

rafaparadela commented Feb 13, 2019

@AdrianRaFo

  • Do we want to replace the ADT by scala Enumeration? @pepegar @juanpedromoreno
  • In proto3, all fields are "optional"
  • Yesterday at some point I was printing the imports in the Scala representation. But it was adding an extra level of complexity dealing with packages, so I decided to follow another approach (similar that Avro's) and I merged the dependent messages in one single protocol.
  def dependency: Printer[String] = konst("import ") *< string >* konst("._")
  • No BigDecimal at the moment.

@AdrianRaFo
Copy link

AdrianRaFo commented Feb 13, 2019

@rafaparadela I know all fields in proto3 are optional but if you are translating them to non-optional Scala fields, how actually can I set a field as optional? On proto2 is easy because you have the keyword optional but, on proto3, you haven't. Unless you're translating them as an Option, what could result a little ugly but more aligned what proto3 really offers.

@rafaparadela
Copy link
Member Author

@juanpedromoreno Solved printer of the operations at 769ff31

@rafaparadela
Copy link
Member Author

@AdrianRaFo The long debate about the lack of optional and required in proto 3, of course, affect our implementation. I don't think that we should, by default, translate all the fields into scala Options. As has been implemented, it supposes that all the fields are required and in the case that you want to have an Option in the scala result, you should set as a coproduct with only one value. Thanks to the optimization made by @paco, will be converted into an Option[T]. I hope this answers your question.

@AdrianRaFo
Copy link

Thanks, rafa just make sure to include into the doc 😉👏

@pepegar
Copy link
Member

pepegar commented Feb 14, 2019

I'd avoid the Enum, I prefer having a proper ADT. @rafaparadela

Copy link
Member

@pepegar pepegar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work @rafaparadela, thank you very much to push this forward! Also, thanks to all other people involved, @rlmark @oli-kitty, you rock!

@rafaparadela , I added a couple of comments and suggestions, none of them are a blocker though. If you prefer merging & addressing those later on, I'm OK with that too!

fields
.filter(t => t.hasOneofIndex && t.getOneofIndex == index)
.map(fromFieldDescriptorProto(_, source, files))
.collect { case b: NativeField => b })
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this b: NativeField is doing a ClassTag check at runtime. Can you change it with a pattern match like:

Suggested change
.collect { case b: NativeField => b })
.collect { case b@NativeField(_, _, _, _, _, _) => b })

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solved at 78581e0

.map(_.msg)
}

implicit class LabelOps(self: Label) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this implicit class private? I'd like to avoid polluting the namespace in case someone imports NativeField._.

Suggested change
implicit class LabelOps(self: Label) {
private implicit class LabelOps(self: Label) {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solved at 78581e0

def isRepeated: Boolean = self.name() == "LABEL_REPEATED"
}

implicit class JavaListOps[B](self: util.List[B]) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

Suggested change
implicit class JavaListOps[B](self: util.List[B]) {
private implicit class JavaListOps[B](self: util.List[B]) {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solved at 78581e0


final case class NativeOneOfField(name: String, tpe: NativeOneOf) extends NativeFieldF

sealed trait NativeDescriptor
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following with the conversation in #64 (comment),

What's the use of this ADT? Is it just intermediate representation that will not be exposed to the user? If we want to keep it, can we make it non-recursive at least? We can throw more recursion schemes if so! 😋

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was probably wrong, but I assumed that if I need to define a Coalgebra[ProtobufF, A] to go from A to ProtobufF[A], then A needs to be recursive. At least, I couldn't express Coalgebra[ProtobufF, FileDescriptorSet] <- being FileDescriptorSet the original Java descriptor for the schema of protobuf.
Actually, NativeDescriptor is not a 1-to-1 representation of FileDescriptorSet, but contains a lot of business logic to aggregate data, and to discard unnecessary fields down the road.
Frankly, I'd love to try Coalgebra[ProtobufF, FileDescriptorSet] again with your help.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solved at 7ce48d6

def toTOption(no: NativeOption): ProtobufF.Option =
ProtobufF.Option(no.name, no.value)

def toJson: Algebra[ProtobufF, Json] = Algebra {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are we using this for? Is there a JSON representation for Protobuf?

If we're using this to translate to Avro, I believe that the correct path would be:

Trans[ProtobufF, AvroF] and then Algebra[AvroF, Json]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can tackle this in a different PR, I have removed the Algebra[ProtobufF, Json], since is not being used so far.

| def GetBooks(req: Stream[F, GetBookRequest]): Stream[F, Book]
|}
|
|}""".stripMargin
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏 👏 👏 👏 👏 !!!!!! 🎉 🎉 🎈 🎉!!!

THIS IS AWESOME


final case class NativeOneOfField(name: String, tpe: NativeOneOf) extends NativeFieldF

sealed trait NativeDescriptor
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

file.getServiceList.j2s.map(s => toNativeService(s, files))
)
}
.getOrElse(throw new Exception(s"Could not find descriptors for: $descriptorFileName"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use some of the defined errors here, rather than Exception?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solved at 78581e0

val (hasAlias, noAlias) = valuesAndAliases.groupBy(_.getNumber).values.partition(_.lengthCompare(1) > 0)
val separateValueFromAliases: Iterable[(EnumValueDescriptorProto, List[EnumValueDescriptorProto])] = hasAlias.map {
case h :: t => (h, t)
case _ => throw new Exception(s"Wrong number of aliases")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here: Could we use some of the defined errors here, rather than Exception?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solved at 78581e0

maybeKey <- getMapField(maybeMsg, "key")
maybeValue <- getMapField(maybeMsg, "value")
} yield NativeMap(fromFieldType(maybeKey, files), fromFieldType(maybeValue, files)))
.getOrElse(throw new Exception(s"Could not find map entry for: $name"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here: Could we use some of the defined errors here, rather than Exception?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solved at 78581e0

.filter(t => t.hasOneofIndex && t.getOneofIndex == index)
.map(fromFieldDescriptorProto(_, source, files))
.collect { case b: NativeField => b })
.getOrElse(throw new Exception(s"Empty set of fields in OneOf: ${oneof.getName}"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here: Could we use some of the defined errors here, rather than Exception?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solved at 78581e0

@juanpedromoreno
Copy link
Member

Regarding #66 (comment), I'd prefer ADT for enums too.

rafaparadela and others added 6 commits February 14, 2019 13:51
# Conflicts:
#	src/main/scala/higherkindness/skeuomorph/protobuf/Native.scala
#	src/main/scala/higherkindness/skeuomorph/protobuf/ParseProto.scala
#	src/main/scala/higherkindness/skeuomorph/protobuf/schema.scala
@rafaparadela
Copy link
Member Author

@juanpedromoreno Finally the intermediate NativeDescriptor has been removed with the great help of @pepegar . I think this is ready to be merged (once we got a green from CI)

Copy link
Member

@pepegar pepegar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks awesome to me! 👍

@rafaparadela rafaparadela merged commit 6688e3c into master Feb 18, 2019
@rafaparadela rafaparadela deleted the all-protobuf-support branch February 18, 2019 16:35
@juanpedromoreno juanpedromoreno mentioned this pull request Mar 18, 2019
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants