
Add generalized interface generation architecture proposal #310

Open · wants to merge 7 commits into base: gh-pages
Conversation

hidmic
Contributor

@hidmic hidmic commented Jan 11, 2021

First step towards addressing ros2/rosidl#560 (might be worth going through the discussion there before reviewing). This is a first draft proposal to get some eyes on it. It still lacks the CPS schema extension proposal.

Signed-off-by: Michel Hidalgo <michel@ekumenlabs.com>
@hidmic hidmic changed the title from "Add generalized interface generation architecture draft proposal" to "Add generalized interface generation architecture proposal" on Jan 11, 2021
@hidmic
Contributor Author

hidmic commented Jan 11, 2021

@azeey @sloretz @IanTheEngineer FYI

@hidmic
Contributor Author

hidmic commented Jan 11, 2021

@dirk-thomas if you happen to have a bit of free time (if that even exists 😅), would you take a look and share your thoughts? I'd very much appreciate it.

@koonpeng

Should rosidl_build_configure generate the build system files directly instead of the common build specification (CPS) files?

The rationale is that node-gyp, for example, contains many platform-specific and other conditional configurations. Correctly translating all of those conditionals into a CPS file does not look like a simple job, not to mention the effort needed to maintain it so that it keeps tracking new configurations node-gyp adds in the future.

If we assume that language bindings are strongly tied to the build system (this seems to be the case for most languages: cython, nodejs, and cgo all use specialized build tools AFAIK), it makes sense for rosidl_build_configure to generate files specific to a language + build system. It makes writing plugins much simpler, as you don't have to reverse engineer all the configurations these specialized tools implicitly provide. The downside is that each plugin will only be tailored to a specific language + build system combination.

@ivanpauno
Member

I second @koonpeng, I think that introducing a common build specification (cbs) makes everything more complex with little extra value.
And without introducing that "cbs", I don't see much value in splitting the configuration step from the build step; I would only have one step.

IMO having a non-cmake plugin mechanism would be a huge improvement, but I think it's ok if some plugins continue generating cmake (or files for whatever build system is convenient for that language).


All positional arguments are paths to interface definition files.
If no output directory is specified, the current working directory is used.
If no type representation / type support generators are specified, all available generators available are used.
Member

Suggested change
If no type representation / type support generators are specified, all available generators available are used.
If no type representation / type support generators are specified, all available generators are used.

@sloretz
Contributor

sloretz commented Jan 12, 2021

If we assume that language bindings are strongly tied to the build system (this seems to be the case for most languages: cython, nodejs, and cgo all use specialized build tools AFAIK), it makes sense for rosidl_build_configure to generate files specific to a language + build system

@koonpeng It sounds like that assumption means one language one build system, but I think one of the goals is to enable generation by any build system, like generating C++ bindings with either CMake or with Bazel.

@hidmic
Contributor Author

hidmic commented Jan 12, 2021

The rationale is that node-gyp, for example, contains many platform-specific and other conditional configurations

Would you mind sharing an example? I'd like to understand where the difficulties show up.

The downside is that each plugin will only be tailored for a specific language + build system combination.

That's precisely the problem that this is trying to address. It is definitely simpler for generators to be coupled with a build system, but then you're forced to port or extend them [1]. As of today, we have three (3) type representation generators and eight (8) type support generators in the core alone.

[1] As I see it, if a given generator allows extension to support other build systems, then some form of intermediate representation must necessarily exist. If we can standardize that representation, we get massive port effort reduction.

I don't see much value of splitting the configuration step from the build step, I would only have one step.

I disagree. Building interfaces outside the package build system requires additional logic to ensure proper incremental builds, to use generated artifacts (e.g. shared libraries) within the package build, and to land them along with their metadata (e.g. CMake exported targets) in the package install space.

@sloretz sloretz left a comment

I think I have a few more comments, but I'm not sure I understand the flow correctly. I think the design doc could really use a diagram of how all the data flows. I'm not sure I understand when each tool is invoked either.

Here's what I think the design doc is describing

+----+ +-----+ +-------+ +----+                            +----+ +-----+ +-------+ +----+                                   +-------+  +-------+  +-----------+ +----+ +------------+
|.msg| |.srv | |.action| |.idl|                            |.msg| |.srv | |.action| |.idl|                                   |headers|  |sources|  |directories| |.cps| |build system|
+---++ +--+--+ ++------+ ++---+                            +---++ +--+--+ ++------+ ++---+                                   +--+----+  ++------+  +-----+-----+ +--+-+ +----+-------+
    |     |     |         |                                    |     |     |         |                                          |        |               |          |        |
+-------------------------------------------------------+   +-------------------------------------------------------------+     |        |               |          |        |
|   |     |     |         |                             |   |  |     |     |         |                                    |   +-------------------------------------------------+
|   v     v     v         |                             |   |  v     v     v         |                                    |   | |        |               |          |        |  |
|  ++-----+-----+-+       |                             |   | ++-----+-----+-+       |                                    |   | +--------+---+-----------+----------+        |  |
|  |rosidl_adapter|       |                             |   | |rosidl_adapter|       |                                    |   |              |                               |  |
|  +----------+---+       | rosidl_generate             |   | +----------+---+       | rosidl_build_configure             |   |              |        ament_meta_build       |  |
|             |           | rosidle_typesupport_generate|   |            |           | rosidle_typesupport_build_configure|   |              |                               |  |
|             v           |                             |   |            v           |                                    |   |              v                               |  |
|           +-+--+        |                             |   |          +-+--+        |                                    |   |       +------+---------+                     |  |
|           |.idl|        |                             |   |          |.idl|        |                                    |   |       |select generator+<--------------------+  |
|           +--+-+        |                             |   |          +--+-+        |                                    |   |       +-+--+--------+-++                        |
|              |          |                             |   |             |          |                                    |   |         |  |        | |                         |
|              +--------->+                             |   |             +--------->+                                    |   |         |  |        | |                         |
|                         |                             |   |                        |                                    |   |     +---+  |        | |                         |
|             +-------+---+---+-------+                 |   |            +-------+---+---+-------+                        |   |     |      |        | |                         |
|             |       |       |       |                 |   |            |       |       |       |                        |   |     |      |        | +---------+               |
|             v       v       v       v                 |   |            v       v       v       v                        |   |     |      |        |           |               |
|           +-+--+  +-+--+  +-+-+  +--+-+               |   |          +-+--+  +-+--+  +-+-+  +--+-+                      |   |     v      v        v           v               |
|           |gen1|  |gen2|  |...|  |genN|               |   |          |gen1|  |gen2|  |...|  |genN|                      |   |  +--+--+ +-+---+ +--+-------+ +-+-+             |
|           +-+--+  +-+--+  ++--+  +-+--+               |   |          +-+--+  +-+--+  ++--+  +-+--+                      |   |  |cmake| |bazel| |setuptools| |...|             |
|             |       |      |       |                  |   |            |       |      |       |                         |   |  +--+--+ +--+--+ +----+-----+ +-+-+             |
|             |       v      v       |                  |   |            |       v      v       |                         |   |     |       |         |         |               |
|             +-------+---+--+-------+                  |   |            +-------+---+--+-------+                         |   |     |       |         |         |               |
|                         |                             |   |                        |                                    |   |     |       |         |         |               |
+-------------------------------------------------------+   +-------------------------------------------------------------+   +-------------------------------------------------+
                          |                                                          |                                              |       |         |         |
            +---------+---+----------+                                               v                                              v       v         v         v
            |         |              |                                            +--+-+                                          +-+-+   +-+-+     +-+-+     +-+-+
            v         v              v                                            |.cps|                                          |???|   |???|     |???|     |???|
        +---+---+  +--+----+  +------+----+                                       +----+                                          +---+   +---+     +---+     +---+
        |headers|  |sources|  |directories|
        +-------+  +-------+  +-----------+

It seems like the .cps generation is separated from the code generation. When is each called? In a CMake project, when are the three tools called? Is the .cps file created during CMake configure time, the code generated at build time, and ament_meta_build run at build time? What does ament_meta_build produce? Who uses its output?


### A. Common Build Specification

This specification is based on the [Common Package Specification](https://mwoehlke.github.io/cps/) format, a simple and relatively standard format that describes packages for downstream consumption in a build system and build tool agnostic way.
Contributor

The rationale is that node-gyp, for example, contains many platform-specific and other conditional configurations. Correctly translating all of those conditionals into a CPS file does not look like a simple job

It looks like CPS limits itself to describing only a subset of configurations:

[...] the CPS file is not intended to cover all possible configurations of a package; rather, it is meant to be generated by the build system and to describe the artifacts of one or more extant configurations for a single architecture.

@koonpeng would it be feasible to generate multiple CPS files that can be selected from based on those conditions, or are there too many conditionals?

Contributor Author

would it be feasible to generate multiple CPS files that can be selected from based on those conditions

That's the path forward with this. There will be conditionals and that's fine, it's just that you cannot use the build system to resolve (most of) those conditionals.

ament_meta_build ((-o|--output-file) PATH)? ((-b|--build-system) IDENTIFIER)? PATH?
```

Its only positional argument is a path to a common build specification file.
Contributor

Assuming cps files have relative paths, is there another assumed argument of the working directory from which to resolve those paths?

Contributor Author

Good point. I wrote this under the assumption that the build system or build system generator this tool generates code for already has a default path resolution logic (like CMake with its binary and source trees). But perhaps that's not enough and this tool could be given that information. WDYT?

Contributor

That is, the CMake output plugin to ament_meta_build would assume relative paths in Sources come from CMAKE_CURRENT_SOURCE_DIR, things from Artifacts are relative to CMAKE_CURRENT_BINARY_DIR, and installed things from Distribution are relative to CMAKE_INSTALL_PREFIX?

Contributor Author

Only one thing to correct:

That is, the CMake output plugin to ament_meta_build would assume relative paths in Sources come from CMAKE_CURRENT_SOURCE_DIR or CMAKE_CURRENT_BINARY_DIR, things from Artifacts are relative to CMAKE_CURRENT_BINARY_DIR, and installed things from Distribution are relative to CMAKE_INSTALL_PREFIX.

This is because a component's artifacts may be another component's sources. The lack of a distinction between source and build trees is because not all build systems support this notion. Build specifications must make sure there's no ambiguity upon source lookup.
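
For illustration, a minimal Python sketch of the path resolution convention described above, as a hypothetical CMake adapter plugin for ament_meta_build might implement it (the function and its layout are assumptions, not part of the proposal):

```python
# Hypothetical helper inside a CMake adapter plugin for ament_meta_build.
# Section names (Sources, Artifacts, Distribution) follow the discussion
# above; the CMake variables are standard, the rest is assumed.
def resolve_cmake_path(section: str, relative_path: str) -> str:
    roots = {
        # Sources may live in either tree, since one component's artifacts
        # can be another component's sources; the source tree is the default.
        "Sources": "${CMAKE_CURRENT_SOURCE_DIR}",
        "Artifacts": "${CMAKE_CURRENT_BINARY_DIR}",
        "Distribution": "${CMAKE_INSTALL_PREFIX}",
    }
    return f"{roots[section]}/{relative_path}"

if __name__ == "__main__":
    print(resolve_cmake_path("Artifacts", "libfoo_msgs__rosidl_generator_c.so"))
```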

@koonpeng

koonpeng commented Jan 13, 2021

Would you mind sharing an example? I'd like to understand where the difficulties show up.

I did some research on the build systems for some language bindings.

  1. nodejs: uses node-gyp, lots of conditionals https://github.com/nodejs/node-gyp/blob/master/addon.gypi.

  2. electron: uses node-gyp with special options. https://www.electronjs.org/docs/tutorial/using-native-node-modules#manually-building-for-electron. Arguments like --target and --dist-url might not be possible to represent in CPS.

  3. cpython: https://github.com/python/cpython/tree/master/Lib/distutils/, much platform-specific logic; it may also insert implicit flags.

  4. cgo: no build system files; compiler flags are defined as comments in each source file and the sources are built by the go compiler. The go compiler probably adds implicit flags as well.

I'm sure it can work even if we don't fully follow the logic of each specialized tool but then we won't be doing 1:1 builds.

It sounds like that assumption means one language one build system, but I think one of the goals is to enable generation by any build system, like generating C++ bindings with either CMake or with Bazel.

We can still have multiple build systems for each language by having multiple plugins. I'm not sure there is any worth in the flexibility to generate files for any language bindings on any build system. We rarely want to build nodejs bindings with maven, for example.

would it be feasible to generate multiple CPS files that can be selected from based on those conditions, or are there too many conditionals?

the CPS file is not intended to cover all possible configurations of a package; rather, it is meant to be generated by the build system and to describe the artifacts of one or more extant configurations for a single architecture.

I'm inclined to say that there are too many conditions. From what I see, it is not just platform-specific conditions; there are often conditions on the version of the interpreter, the existence of certain libraries, the location of libraries, libraries specified by name or by relative or absolute path, etc. Given all these conditionals and the note that CPS is intended to be generated by a build system for a particular environment, the best way would be for rosidl_build_configure to generate the build system files, then use the build system to generate the CPS file. But then I would argue that the two steps should be different tools.

@ivanpauno
Member

I disagree. Building interfaces outside the package build system requires additional logic to ensure proper incremental builds, to use generated artifacts (e.g. shared libraries) within the package build, and to land them along with their metadata (e.g. CMake exported targets) in the package install space.

I'm not sure what you mean.

That's precisely the problem that this is trying to address. It is definitely simpler for generators to be coupled with a build system, but then you're forced to port or extend them [1]. As of today, we have three (3) type representation generators and eight (8) type support generators in the core alone.

[1] As I see it, if a given generator allows extension to support other build systems, then some form of intermediate representation must necessarily exist. If we can standardize that representation, we get massive port effort reduction.

By an extension, do you mean, for example, rosidl_generator_py using the output of rosidl_generator_c?
I don't agree that an intermediate representation "must" exist; most build systems have a mechanism to import external dependencies (if not all of them).

@koonpeng It sounds like that assumption means one language one build system, but I think one of the goals is to enable generation by any build system, like generating C++ bindings with either CMake or with Bazel.

But you can still call the tool from bazel (whether it generates cmake behind the scenes or not), right?
I don't understand the value of building the generated code with bazel instead of cmake; what is it?

@hidmic
Contributor Author

hidmic commented Jan 13, 2021

I think the design doc could really use a diagram of how all the data flows.

@sloretz Agreed.

It seems like the .cps generation is separated from the code generation. When is each called? In a CMake project, when are the three tools called? Is the .cps file created during CMake configure time, the code generated at build time, and ament_meta_build run at build time? What does ament_meta_build produce? Who uses its output?

@sloretz Build specifications use source code generators. In a CMake project, one would invoke rosidl_*_build_configure followed by ament_meta_build during the configuration stage to generate CMake code that in turn generates and builds source code:

In a CMakeLists.txt

execute_process(COMMAND rosidl_*_build_configure OUTPUT_FILE build.spec)
execute_process(COMMAND ament_meta_build -b cmake INPUT_FILE build.spec OUTPUT_FILE build.cmake)
include(build.cmake)

In build.cmake

add_custom_command(
    OUTPUT ...
    COMMAND rosidl_*_generate ...
)

@hidmic
Contributor Author

hidmic commented Jan 13, 2021

I did some research on the build systems for some language bindings.

@koonpeng Thank you!

  1. nodejs: uses node-gyp, lots of conditionals https://github.com/nodejs/node-gyp/blob/master/addon.gypi.

It has several conditionals, but they seem rather simple to resolve. The format itself is pretty simple too: the way it lists compiler and linker flags is almost a 1-to-1 mapping to the generic build spec I'm proposing.

  1. electron: uses node-gyp with special options. https://www.electronjs.org/docs/tutorial/using-native-node-modules#manually-building-for-electron. Arguments like --target and --dist-url might not be possible to represent in CPS.

Those arguments seem specific to electron, i.e. the build system generator. I don't think that's a requirement on the build specification, but on the translation of that specification. One build specification could be translated to different .gypi files, each suitable for a specific version of electron. The same would apply to different versions of CMake (a translation that's compatible with CMake >3.5, a translation that requires CMake >3.10).

cpython: https://github.com/python/cpython/tree/master/Lib/distutils/, many platform specific logic, may also insert implicit flags.

Hmm, for Python I'd think the build specification would be translated to a setup.py suitable for distutils invocation.

cgo: no build system files, compiler flags are defined as comments for each source file and built by the go compiler. The go compiler probably adds implicit flags as well.

This one's interesting. In this case, the build specification would only state that a build system has to run the go compiler on the set of generated files. Nothing more.


Just to be super clear, the build spec is not a substitute for build systems. It's a blueprint to generate what each build system needs (CMake code, Starlark code, Groovy code, setup.py files, .gypi files, etc.).
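
For illustration, a toy Python sketch of this "blueprint" idea: the same specification entry rendered for two different build systems (the renderer names and emitted snippets are assumptions, not proposed APIs):

```python
# Toy renderers: one build specification entry, two build system outputs.
SPEC = {"name": "foo_msgs__c", "sources": ["gen/c/msg/foo.c"]}

def render_cmake(spec: dict) -> str:
    # Emits a CMake snippet for the component.
    return f"add_library({spec['name']} SHARED {' '.join(spec['sources'])})"

def render_starlark(spec: dict) -> str:
    # Emits a Bazel/Starlark snippet for the same component.
    return f'cc_library(name = "{spec["name"]}", srcs = {spec["sources"]})'

if __name__ == "__main__":
    print(render_cmake(SPEC))
    print(render_starlark(SPEC))
```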


We can still have multiple build systems for each language by having multiple plugins. I'm not sure there is any worth in the flexibility to generate files for any language bindings on any build system. We rarely want to build nodejs bindings with maven, for example.

@koonpeng That's an option, if it turns out we cannot afford a standard build specification. I'd really try to avoid per-generator templates (and the combinatorial explosion that comes with them).

Given all these conditionals and the note that CPS is intended to be generated by a build system for a particular environment, the best way would be for rosidl_build_configure to generate the build system files, then use the build system to generate the CPS file. But then I would argue that the two steps should be different tools.

I think we're mixing things up a bit. CPS is a format that describes package installs. Here I'm proposing a variation of the CPS format to describe package builds. Build configuration generates the latter, not CPS.

By an extension, do you mean, for example, rosidl_generator_py using the output of rosidl_generator_c?
I don't agree that an intermediate representation "must" exist; most build systems have a mechanism to import external dependencies (if not all of them).

@ivanpauno I mean that if we have build system-specific plugins for rosidl_generator_c, it's quite likely we'll have some intermediate representation (e.g. a Python dict) to resolve templates. The build specification I'm proposing is a (non-trivial) generalization of that.

But you can still call the tool from bazel (whether it generates cmake behind the scenes or not), right?
I don't understand the value of building the generated code with bazel instead of cmake; what is it?

@ivanpauno some build system A can delegate to another build system B, sure. But (a) you need to properly embed B in A (ensuring rebuilds keep working; pushing some of A's output into B, especially if a call to B needs to build on top of a previous one; pulling B's output into A such that it can be used during the build and on install, etc.), (b) you force a dependency on B, and (c) there's a performance penalty in doing (a). For Bazel, properly embedding CMake also means staying within the build sandbox.

@koonpeng

Hmm, for Python I'd think the build specification would be translated to a setup.py suitable for distutils invocation.

What if I run ament_meta_build with --build-system node-gyp? It must know all the conditionals and implicit flags added by distutils to generate the .gypi file, and we probably also need to know all the conditionals and implicit flags of node-gyp to unset those that are not compatible with a Python module. Or it could still generate setup.py and use a .gypi that delegates to distutils, but then that would create the same disadvantages discussed for the cmake/bazel delegation.

@hidmic
Contributor Author

hidmic commented Jan 14, 2021

What if I run ament_meta_build with --build-system node-gyp? It must know all the conditionals and implicit flags added by distutils to generate the .gypi file, and we probably also need to know all the conditionals and implicit flags of node-gyp to unset those that are not compatible with a Python module. Or it could still generate setup.py and use a .gypi that delegates to distutils, but then that would create the same disadvantages discussed for the cmake/bazel delegation.

@koonpeng Yours is a good example. Correct build spec granularity is key. While you could specify a shared library build for a CPython extension, I'd rather treat it as a component type of its own. That way, we push build details to each ament_meta_build plugin, for which there will be infrastructure (or not, in which case the implementation can delegate, as you say, or not support that component type at all).

@koonpeng

That way, we push build details to each ament_meta_build plugin, for which there will be infrastructure (or not, in which case the implementation can delegate, as you say, or not support that component type at all).

@hidmic I think I may have misunderstood the goal of build system plugins. I thought it was to support generating build system files for any language with CBS, but is it correct that the goal is actually just to support many code generators for the languages they are built for? A build system plugin may choose to support multiple languages, but most commonly it would only support one (node-gyp for nodejs, distutils for python). You still get the advantage of being able to generate build system files for multiple code generator plugins of one language, e.g. one python code generator may require c++14, but a new fancy one may require c++20.

But you can still call the tool from bazel (whether it generates cmake behind the scenes or not), right?
I don't understand the value of building the generated code with bazel instead of cmake; what is it?

@ivanpauno An example of the problem I faced is that node-gyp knows how to build a node module, but doesn't know the link libraries. cmake knows the link libraries but doesn't know how to build a node module. There are some ways around this like node-gyp -> cmake -> node-gyp but things start to get ugly.

@hidmic
Contributor Author

hidmic commented Jan 15, 2021

I thought it was to support generating build system files for any language with CBS, but is it correct that the goal is actually just to support many code generators for the languages they are built for? A build system plugin may choose to support multiple languages, but most commonly it would only support one (node-gyp for nodejs, distutils for python). You still get the advantage of being able to generate build system files for multiple code generator plugins of one language, e.g. one python code generator may require c++14, but a new fancy one may require c++20.

I think we're close. Clearly, the design document needs better, more thorough explanations (and examples, many examples).

As I see it, build system adapter plugins (i.e. ament_meta_build plugins that read CBS and output build systems) deal with components. Some components may be language-agnostic, like a binary shared library, an executable, or an arbitrary collection of files [1], and some components may be specific to a given language, like a Python package or a Java jar file.

It is up to each plugin whether and how it supports a given component type. For example:

  • A node-gyp plugin could choose to deal with Node.js packages and bindings natively, and not support any other component type.
  • A cmake plugin could choose to deal with binary static/shared libraries, executables (where the supported languages are those that the underlying compiler can handle e.g. C, C++, Fortran, Objective-C to name a few gcc supports), and CPython bindings natively.
  • A distutils plugin could delegate everything that's not a Python package nor binding to cmake (i.e. literally calling cmake before/during/after the build).

There's no restriction here, only a path for proper integration. Some plugins will be better at it than others.

[1] I have not fleshed out the schema yet, but it is my intention to allow source code generators to force a build system via arbitrary command support in CBS. It completely defeats the purpose of ament_meta_build, but it'll make the transition smoother. It also lowers the bar for folks that just need to get things done (e.g. someone with an unsupported, snowflake build tool and tight schedules).
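
For illustration, a rough Python sketch of an adapter plugin dispatching on component types as described above, with the arbitrary-command fallback mentioned in the footnote (the component type names and the emitter API are assumptions, not part of the proposal):

```python
# Hypothetical adapter plugin: supports a couple of component types natively
# and falls back to the arbitrary-command escape hatch for "generic" ones.
class CMakeAdapterPlugin:
    def emit_component(self, name: str, component: dict) -> str:
        component_type = component.get("Type", "generic")
        if component_type == "shared_library":
            sources = " ".join(component.get("Sources", []))
            return f"add_library({name} SHARED {sources})"
        if component_type == "generic":
            # Arbitrary command support: defeats most of the purpose of a
            # common build specification, but eases transitions.
            command = component["Assembly"]["Command"]
            return f"add_custom_command(OUTPUT {name}.stamp COMMAND {command})"
        raise NotImplementedError(
            f"component type '{component_type}' is not supported by this plugin")

if __name__ == "__main__":
    plugin = CMakeAdapterPlugin()
    print(plugin.emit_component("foo_msgs__c", {
        "Type": "shared_library",
        "Sources": ["gen/c/msg/foo.c", "gen/c/msg/foo.h"],
    }))
```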

@hidmic
Contributor Author

hidmic commented Jan 15, 2021

An example of the problem I faced is that node-gyp knows how to build a node module, but doesn't know the link libraries. cmake knows the link libraries but doesn't know how to build a node module.

I will say that this design does not fully address that problem. node-gyp will still have to figure out how to get that information on its own. I have another design in mind, to get ament to export CPS files for cross build-system package consumption, but that's orthogonal to the interface generation pipeline.

@peterpolidoro

Could it make sense to use something like a GNU Guix Package as a generalized interface, using lisp s-expressions rather than JSON to describe the package?

@hidmic
Contributor Author

hidmic commented Jan 15, 2021

@peterpolidoro Interesting, I didn't know about Guix. I would have to look into it in more depth. I will say that revamping ROS 2's package management system is massively out of scope here, but we can entertain the idea of using code for tool-agnostic package build (and install) descriptions. IMO a good description should be general yet easy to consume (low implementation effort, soft dependencies if any, human readable [1]). Would a build system be able to consume it? Would a tool be able to generate a build system out of it? Can a scraper consume it?

[1] Which makes me a bit wary about Lisp. I don't have anything against it, but most people see parentheses and run 😅.

@peterpolidoro

peterpolidoro commented Jan 16, 2021

Take a good look at Guix before running away screaming from the parentheses.
Here is an interesting video about it for example: Guix in the age of containers
It may not be appropriate for this particular topic and I would never presume to suggest revamping the entire ROS package management system. Although you may be tempted when you see what it can do. The benefits of functional package management with the ability to rollback versions and create minimal containers are kind of mind blowing once you have used it for a while. It generalizes packages written in any language. Writing package declarations in lisp seems crazy at first, but it is actually super elegant and powerful.

@koonpeng

I will say that this design does not fully address that problem. node-gyp will still have to figure out how to get that information on its own. I have another design in mind, to get ament to export CPS files for cross build-system package consumption, but that's orthogonal to the interface generation pipeline.

@hidmic how about providing the plugins a function to resolve a message/package's required build flags? This function could use the CPS files or any other methods under the hood. As it is now, only cmake knows the build flags, so the only way for any other build system to build rosidl-related targets is to delegate to cmake, which wouldn't make this tool very helpful.

Signed-off-by: Michel Hidalgo <michel@ekumenlabs.com>
@hidmic
Contributor Author

hidmic commented Jan 18, 2021

I've addressed some comments and questions by clarifying the proposal. It's still lacking a nice diagram and examples. I'll add those next.

@hidmic
Contributor Author

hidmic commented Jan 18, 2021

how about providing the plugins a function to resolve a message/package's required build flags?

@koonpeng at the package-level, that information can be propagated by the build system. I understand your use case, but as I said in ros2/rosidl#560 (comment) what you describe is not a generator if we go by its current definition (see architecture review in this proposal), and I intend to scope this work to that definition.

Perhaps you can get ahead of me and kickstart a design to better support multi build-system ROS workspaces? E.g. by exporting package information in a tool-agnostic format like CPS.

The benefits of functional package management with the ability to rollback versions and create minimal containers are kind of mind blowing once you have used it for a while. It generalizes packages written in any language. Writing package declarations in lisp seems crazy at first, but it is actually super elegant and powerful.

@peterpolidoro It is quite cool. But I think it doesn't solve the problem we're trying to address here. This generalization is meant to support users that are already committed to (or constrained by) some build system and now want to use ROS 2 interfaces. Doing that right now means porting a lot of CMake logic, or just letting CMake do the job (which is not nearly as easy or convenient as it sounds).

Guix (and please correct me if I'm wrong) looks like a very interesting alternative to the typical ROS workspace-based workflow. And yes, it could be used as a vehicle to export package information across build-systems. But that's a solution for a different problem.

That is not to say that it wouldn't be cool if you were willing to propose a design to build ROS packages using Guix. Not sure how the @ros2/team feels about this though.

Signed-off-by: Michel Hidalgo <michel@ekumenlabs.com>
@hidmic
Contributor Author

hidmic commented Jan 19, 2021

FYI I added pipeline diagrams in 78cd763. Would a data-flow diagram that shows how the current architecture can be cast into the proposed one help?

@jacobperron jacobperron left a comment

Nits:

  • Use one sentence per line
  • For posterity, use permalinks for GitHub references


From a structural point-of-view, the `rosidl` interface generation pipeline is comprised of:

- **Generator packages**. These packages provide tooling to generate language-specific, in-memory representations of ROS interfaces. Common runtime (i.e. interface agnostic) functionality is conventionally but not necessarily split into a separate package (e.g. [`rosidl_runtime_c`](https://github.com/ros2/rosidl/tree/master/rosidl_runtime_c), [`rosidl_runtime_cpp`](https://github.com/ros2/rosidl/tree/master/rosidl_runtime_cpp)). It is not unusual for a generator to use another generator's output via language-specific binding machinery (e.g. [`rosidl_generator_py`](https://github.com/ros2/rosidl_python/blob/master/rosidl_generator_py/), which uses Python C bindings to wrap [`rosidl_generator_c`](https://github.com/ros2/rosidl/tree/master/rosidl_generator_c) representations), although this dependency is not explicit: generator packages do depend on each other but it is assumed that their generated code will all be built within the same package so that build system dependencies can be established.
Member

Suggested change
- **Generator packages**. These packages provide tooling to generate language-specific, in-memory representations of ROS interfaces. Common runtime (i.e. interface agnostic) functionality is conventionally but not necessarily split into a separate package (e.g. [`rosidl_runtime_c`](https://github.com/ros2/rosidl/tree/master/rosidl_runtime_c), [`rosidl_runtime_cpp`](https://github.com/ros2/rosidl/tree/master/rosidl_runtime_cpp)). It is not unusual for a generator to use another generator's output via language-specific binding machinery (e.g. [`rosidl_generator_py`](https://github.com/ros2/rosidl_python/blob/master/rosidl_generator_py/), which uses Python C bindings to wrap [`rosidl_generator_c`](https://github.com/ros2/rosidl/tree/master/rosidl_generator_c) representations), although this dependency is not explicit: generator packages do depend on each other but it is assumed that their generated code will all be built within the same package so that build system dependencies can be established.
- **Generator packages**. These packages provide tooling to generate language-specific, in-memory representations of ROS interfaces. Common runtime (i.e. interface agnostic) functionality is conventionally, but not necessarily, split into a separate package (e.g. [`rosidl_runtime_c`](https://github.com/ros2/rosidl/tree/master/rosidl_runtime_c), [`rosidl_runtime_cpp`](https://github.com/ros2/rosidl/tree/master/rosidl_runtime_cpp)). It is not unusual for a generator to use another generator's output via language-specific binding machinery (e.g. [`rosidl_generator_py`](https://github.com/ros2/rosidl_python/blob/master/rosidl_generator_py/), which uses Python C bindings to wrap [`rosidl_generator_c`](https://github.com/ros2/rosidl/tree/master/rosidl_generator_c) representations), although this dependency is not explicit: generator packages do depend on each other but it is assumed that their generated code will all be built within the same package so that build system dependencies can be established.

@jacobperron
Member

jacobperron commented Jan 19, 2021

Would a data-flow diagram that shows how the current architecture can be cast into the proposed one help?

+1
Also, matching the proposed tools to the stages in the pipeline would make it clearer, I think.

I'm still a little fuzzy on the proposal. In practice, it sounds like we would be replacing all of the CMake logic with a set of Python tools that others can extend via plugins. Here's a list of the proposed tools with inputs and outputs as I understand them:

  • rosidl_generate + rosidl_typesupport_generate
    • Input: IDL files
    • Output: generated language-specific code
  • rosidl_build_configure + rosidl_typesupport_build_configure
    • Input: IDL files
    • Output: generated CBS files
  • ament_meta_build
    • Input: CBS files
    • Output: generated build system files

Does that look correct?

As an exercise, I tried to map these to the plugins necessary for Java support:

  • rosidl_generate_java
    • Generates Java and native C code (*.java, *.c, *.h)
  • rosidl_build_configure_java
    • Lists expected source files (*.java, *.c, *.h)
    • Describe dependencies of the generated code (e.g. rcljava_common, rcl_interfaces, rosidl_generate_c)
      • This seems odd since we're listing dependencies of rosidl_generate_java in this separate package
      • I don't know what else would happen here?
  • ament_meta_build_java-gradle
    • Generates Gradle package files for building the generated source files.
  • Optionally, we could also have ament_meta_build_java-cmake
    • Generates CMake package files for building the generated source files.

Am I somewhat close to what you are imagining?

@hidmic
Contributor Author

hidmic commented Jan 20, 2021

Also, matching the proposed tools to the stages in the pipeline would make it clearer, I think.

+1

Am I somewhat close to what you are imagining?

You're spot on.

This seems odd since we're listing dependencies of rosidl_generate_java in this separate package

Code generation and build specification/configuration are tightly coupled. Today both live in the same packages (as python generators and ament_cmake extensions). I expect plugins to follow the same pattern.

I don't know what else would happen here?

Describe what the build system has to do to generate, build, and install generated artifacts e.g.: generate source files calling the code generation tooling, then build them into a shared library, then install that shared library somewhere.

@jacobperron
Member

Describe what the build system has to do to generate, build, and install generated artifacts e.g.: generate source files calling the code generation tooling, then build them into a shared library, then install that shared library somewhere.

Okay, thanks. I guess in my example I wouldn't supply an exact build command, since it would be dependent on what system is chosen (CMake or Gradle), but we could supply things like Java version and compile flags. Thanks for clarifying.

Signed-off-by: Michel Hidalgo <michel@ekumenlabs.com>
Signed-off-by: Michel Hidalgo <michel@ekumenlabs.com>
@hidmic
Contributor Author

hidmic commented Jan 25, 2021

@sloretz @jacobperron I've updated the proposal, added a draft CBS schema, added examples. PTAL!

@hidmic
Contributor Author

hidmic commented Jan 25, 2021

I guess in my example I wouldn't supply an exact build command, since it would be dependent on what system is chosen (CMake or Gradle), but we could supply things like Java version and compile flags.

That's the idea. In most cases, one shouldn't specify a command but an appropriate component type. It's the meta builder plugin's responsibility to figure out how to build that component type later on.

@hidmic hidmic mentioned this pull request Jan 26, 2021
Signed-off-by: Michel Hidalgo <michel@ekumenlabs.com>
@niosus

niosus commented Feb 4, 2021

@hidmic @jacobperron guys, I hate to hijack the discussion here but I think we at Apex.AI did something very relevant to this and I wanted to convey how important I find this proposal to be. We did an experimental bazelization of our internal ROS2 fork with some pretty neat results. Bazel is not important here though. The main part is that I had to dig through the whole message generation pipeline and compose it in such a way that bazel could build all of the messages natively. So I just want to support the statement that:

In most cases, one shouldn't specify a command but an appropriate component type. It's the meta builder plugin's responsibility to figure out how to build that component type later on.

It would have made my last couple of weeks much easier if there were a single configurable binary (or script, for that matter) that would spit out all the headers, sources, and additional files that I would then just need to pipe into my build system of choice.

If you guys need any specific help (I know I'm late to the party) please do not hesitate to tag me.

@hidmic hidmic marked this pull request as ready for review February 9, 2021 13:25
"Configurations": [ "Optimized", "Debug" ],
"Default-Components": [ "c" ],
"Components": {
"idls": {
Contributor

Are the component names arbitrary? It seems like the component names are tied to the build-configure plugins. That is, c_code below knows there is a build-configure plugin that will create a component called idls, and it will depend on that.

What if there are no .msg/.srv/.action files to convert? Will .idls still exist and be a noop, or does that mean there will be no idls component? If the latter, how does the build-configure plugin generating c_code know not to include a dependency on .idls (assuming the order that build-configure plugins are invoked is random)?

@hidmic hidmic Feb 24, 2021

Are the component names arbitrary?

They are, though consistency is required to establish dependencies between them (e.g. to build Python generated code, C generated code may be necessary).

There's also something to be said about components shared by multiple build specifications. It is likely that components like idls will be required by many source code generators and since each build specification is carried out independently, replication will ensue. By keeping names consistent, deduplication becomes feasible when multiple generators are configured at once (e.g. when targeting all available generators by not specifying any on rosidl build-configure).

What if there are no .msg/.srv/.action files to convert? Will .idls still exist and be a noop, or does that mean there will be no idls component?

Either would work. During build specification, the tool has access to sources, so it can adapt the resulting .cbs accordingly (e.g. adding or removing c_code dependencies).

@hidmic hidmic Feb 24, 2021

assuming the order that build-configure plugins are invoked is random

One note about this. Some topological ordering will necessarily happen somewhere: in rosidl build-configure by ordering plugins or when merging build specifications, in the build system after build specifications have been transformed, in the build tool if the user splits build specifications across multiple packages. I'm inclined to do so upon merging build specifications, simply to make things easier for the build system.
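
For illustration, a small Python sketch of such a topological ordering over components, using the list-form Requires shown in this proposal's examples (purely illustrative, not a proposed implementation):

```python
# Orders components so every local dependency (":name" in Requires) comes
# before its dependents; external "pkg:component" requirements are ignored.
from graphlib import TopologicalSorter  # Python 3.9+

def order_components(components: dict) -> list:
    graph = {
        name: {req.lstrip(":") for req in spec.get("Requires", [])
               if req.startswith(":")}
        for name, spec in components.items()
    }
    return list(TopologicalSorter(graph).static_order())

if __name__ == "__main__":
    print(order_components({
        "idls": {},
        "c_code": {"Requires": ["action_msgs:idls", ":idls"]},
        "c": {"Requires": [":c_code"]},
    }))  # ['idls', 'c_code', 'c']
```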

Contributor

What if there are no .msg/.srv/.action files to convert? Will .idls still exist and be a noop, or does that mean there will be no idls component?

Either would work. During build specification, the tool has access to sources, so it can adapt the resulting .cbs accordingly (e.g. adding or removing c_code dependencies).

By keeping names consistent, deduplication becomes feasible when multiple generators are configured at once (e.g. when targeting all available generators by not specifying any on rosidl build-configure).

So if all the build-configure plugins know their respective generator plugins require .idl files but only .msg files are available, then the build-configure plugins all depend on :idls and add an :idls component to translate any .msg/.srv/.action files to .idl files? Then rosidl build-configure de-duplicates by keeping only one :idls component? If so, it seems like that requires all build-configure plugins to have exactly the same idea about how to convert files. If something changes and a plugin in the wild doesn't get updated, then all would be affected if rosidl build-configure deduplicates and keeps the out-of-date :idls component.

Contributor Author

So if all the build-configure plugins know their respective generator plugins require .idl files but only .msg files are available, then the build-configure plugins all depend on :idls and add an :idls component to translate any .msg/.srv/.action files to .idl files?

Correct.

Then rosidl build-configure de-duplicates by keeping only one :idls component?

Correct, assuming these components are all equivalent.

it seems like that requires all build-configure plugins to have exactly the same idea about how to convert files.

That's right, if de-duplication is naive and only looks at component names.

IIRC this design document does not describe a build specification merge procedure. I didn't get that far. Thinking out loud, to safely de-duplicate two or more components, these have to be equal. Equality here means that name and description are equal, as well as those of their local dependencies, recursively.

For instance, say we have two :foo components:

":foo": {
     "Requires":  {":bar", "fizz:buzz"},
     ...
}

These can be deemed equal if their descriptions match. But that's not enough, since each could be using a different :bar (note that's not a problem for fizz:buzz, as it is a fully qualified component in an external package[1]), so those have to be equal too. And the same goes for :bar dependencies, and the dependencies of its dependencies, and so on. In practice, I highly doubt we'll have seemingly equivalent components like these.

[1] One could argue that two build specifications could depend on different versions of the same external package. The simplest solution is to ban merging them. If there's a need, the procedure could try to pick a version that satisfies both constraints, but it seems overkill to me.
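
For illustration, a Python sketch of the recursive equality check described above (a hypothetical helper; it assumes the list-form Requires of component entries and no dependency cycles):

```python
# Two same-named components are merged only if their descriptions are equal
# and their local dependencies are (recursively) equal too.
def equivalent(name: str, spec_a: dict, spec_b: dict) -> bool:
    comp_a = spec_a["Components"][name]
    comp_b = spec_b["Components"][name]
    if comp_a != comp_b:  # descriptions (including Requires) must match
        return False
    local_deps = [req.lstrip(":") for req in comp_a.get("Requires", [])
                  if req.startswith(":")]  # "pkg:comp" entries are external
    return all(equivalent(dep, spec_a, spec_b) for dep in local_deps)

if __name__ == "__main__":
    spec = {"Components": {
        "foo": {"Type": "generic", "Requires": [":bar", "fizz:buzz"]},
        "bar": {"Type": "generic"},
    }}
    print(equivalent("foo", spec, spec))  # True
```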

"idls": {
"Type": "generic",
"Assembly": {
"Command": "rosidl_adapt -p foo_msgs $(sources)",
Contributor

Do I understand correctly that there would be a build-configure plugin that describes the conversion from .msg/.srv/.action to .idl, but there's no rosidl generate plugin that does the conversion?

@hidmic hidmic Feb 24, 2021

No, not really. Converting from .msg/.srv/.action to .idl is more of a translation than a generation. A translation that may or may not be required depending on the generator, and thus one that is left to each generator to perform at will. This is an aspect I'm refining though, as rosidl_typesupport_connext_cpp forced me to. Right now, the plan of record is to:

  • Add an extensible rosidl translate verb to expose this functionality.
  • Have each rosidl generate plugin perform an internal rosidl translate if and only if it cannot process a given interface file (see the sketch after this list).
  • Have each rosidl build-configure plugin separate the translation and generation steps, to have access to the translated files (e.g. to install them) and to use the build system as a cache (i.e. re-translate only when sources change).
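
For illustration, a minimal Python sketch of the fallback in the second bullet (all names here are stand-ins, not the proposed API):

```python
# Toy stand-in for a rosidl generate plugin that only understands .idl files
# and translates other formats internally before generating sources.
class IdlOnlyGenerator:
    def can_process(self, interface_file: str) -> bool:
        return interface_file.endswith(".idl")

    def translate(self, interface_file: str) -> str:
        # Stand-in for an internal `rosidl translate` invocation.
        return interface_file.rsplit(".", 1)[0] + ".idl"

    def generate_sources(self, interface_file: str) -> list:
        stem = interface_file.rsplit("/", 1)[-1].rsplit(".", 1)[0]
        return [f"gen/c/{stem}.c", f"gen/c/{stem}.h"]

def generate(plugin, interface_file: str) -> list:
    if not plugin.can_process(interface_file):
        interface_file = plugin.translate(interface_file)  # fallback translation
    return plugin.generate_sources(interface_file)

if __name__ == "__main__":
    print(generate(IdlOnlyGenerator(), "msg/Foo.msg"))  # ['gen/c/Foo.c', 'gen/c/Foo.h']
```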

"Requires": [ "action_msgs:idls", ":idls" ],
"Command": "rosidl_generate --language c -p foo_msgs -I $(location action_msgs:idls) $(sources)",
"Sources": [
"msg/foo.idl",
Contributor

Since :c_code requires msg/foo.idl and :idls provides that, is "Requires": [":idls"] redundant?

@hidmic hidmic Feb 24, 2021

It is, but I figured it'd be easier on ament_meta_build if it doesn't have to deduce that when the underlying build system needs the information.

},
"Requires": {
// Base common dependencies
"rmw": {},
Contributor

How would ament_meta_build output for CMake translate this? find_package(rmw REQUIRED), but then which CMake target gets used? Do all components making shared libraries get linked against whatever CMake target rmw provides?

Contributor Author

Whoops. Some component should be requiring some other component within rmw. Good catch!

},
"Debug": {
// Use debug Python interpreter to resolve numpy include path.
"System-Includes": [ "$(shell pythond -c \"import numpy; print(numpy.get_include())\")" ]
Contributor

On windows the command is python_d, but AFAIK ROS 2 only uses the debug interpreter on Windows. How would that platform specific quirk be specified? Is the generated .cbs supposed to be the same on all platforms, or is it only applicable to the platform it's run on?

Contributor Author

This bit is not fully fleshed out, I must say, but I'd leave this to ament_meta_build. That is, use pythond to refer to Python's debug interpreter and let each ament_meta_build plugin figure out what to put in there. Arguably, performing substitutions in a shell command line may be brittle, and perhaps introducing a $(pythond ...) substitution is a better, simpler solution. I'm open to ideas :)
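
For illustration, a Python sketch of such a $(pythond ...) substitution resolved per platform by an ament_meta_build plugin (python_d on Windows, as noted above; the non-Windows interpreter name is a placeholder assumption):

```python
# Resolves a whole-value $(pythond ...) substitution to the platform's debug
# interpreter, leaving other values untouched.
import sys

def resolve_pythond(value: str) -> str:
    prefix, suffix = "$(pythond ", ")"
    if not (value.startswith(prefix) and value.endswith(suffix)):
        return value  # nothing to substitute
    interpreter = ("python_d" if sys.platform == "win32" else
                   f"python{sys.version_info.major}.{sys.version_info.minor}d")
    return f"{interpreter} {value[len(prefix):-len(suffix)]}"

if __name__ == "__main__":
    print(resolve_pythond('$(pythond -c "import numpy; print(numpy.get_include())")'))
```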

A few details worth noting:

- The explicit dependency on other `foo_msgs_*` generated packages, which the tooling should resolve if all are to be present on the same specification.
- The requirement on the `python-dev` package, which is a hint for the build system to build against CPython development libraries.
Contributor

Is python-dev in this context a rosdep key? Can any rosdep key be used here, or is this a keyword that gets special treatment?

@hidmic hidmic Feb 24, 2021

TL;DR: not a rosdep key, just a package this build specification depends on.

Generally speaking, a package requirement abstracts away a dependency. A package may exist for the target build system, or it may not i.e. it may be replaced by ament_meta_build with some target build system-specific code. In this case, python-dev is a package that abstracts away a dependency on CPython development libraries and that will likely be replaced with build system code by each ament_meta_build plugin.

It's worth mentioning that this is an exceptional case. Having each ament_meta_build plugin handle a "package" in a custom way is expensive. I think this makes sense for CPython development libraries because there's a large variability across build systems when using them and CPython extensions are ubiquitous. For most cases, it'll be up to the user to make sure a package requirement makes sense for the target build system (e.g. a requirement on fastrtps in a .cbs file will translate to find_package(fastrtps REQUIRED) in CMake -- it is up to the user to make sure that such macro call succeeds and provides the necessary information).
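
For illustration, a Python sketch of how a hypothetical CMake plugin might special-case the python-dev requirement while passing other requirements through as find_package() calls, per the fastrtps example above:

```python
# Hypothetical mapping from abstract package requirements to CMake code:
# python-dev gets special handling, everything else becomes find_package().
def emit_requirement(package: str) -> str:
    if package == "python-dev":
        # CPython development libraries vary too much across build systems
        # to pass through untouched.
        return "find_package(Python3 REQUIRED COMPONENTS Development)"
    return f"find_package({package} REQUIRED)"

if __name__ == "__main__":
    for requirement in ("python-dev", "fastrtps", "rmw"):
        print(emit_requirement(requirement))
```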

"Configurations": {
"Optimized": {
// Use Python interpreter to resolve numpy include path.
"System-Includes": [ "$(shell python -c \"import numpy; print(numpy.get_include())\")" ]
Contributor

python might be Python 2 or Python 3, depending on the platform. Is this an actual command, or is something else translating this to the actual command?

@hidmic hidmic Feb 24, 2021

Same as in #310 (comment) i.e. to be substituted. In that sense, custom Python substitutions like $(python2.7 ...) and $(python3.6 ...) may be a simpler approach to handle multiple Python versions. WDYT?


- Add section on interface definition translators and rosidl translate
- Update CLI specifications for all verbs
- Update CLI examples

Signed-off-by: Michel Hidalgo <michel@ekumenlabs.com>