
proposal: cmd/cue: rename 'cue cmd' to 'cue run' or 'cuerun' #1325

Open
myitcv opened this issue Oct 25, 2021 · 19 comments
Labels
NeedsDesign (Functionality seems desirable, but not sure how it should look like.), Proposal, roadmap/cli (Specific tag for roadmap issue #337)

Comments

@myitcv
Member

myitcv commented Oct 25, 2021

In #337 and various other places we have mooted the idea of moving the logic from cue cmd to a different command, either cue run or an entirely separate entry point, cuerun (originally an idea from @shykes).

Creating this issue as a WIP placeholder for a full proposal for such a move, with rough notes as follows:

  • Currently cue cmd does not have a means by which the user can specify the package (path) from which the command is derived: it is currently .. [Edit: See this comment] We might want to support this so that running a command from an alternative location within the main module is supported (for example, the command could even be defined in a dependency of the main module). This is especially true in the world of modules, where one might want to do the equivalent of go run $pkg. Indeed, we could go one step further and consider something equivalent to go run $pkg@$version, in which case the dependency does not even need to exist as a dependency in the main module
  • Per discussions with @mpvl, we also need to better define what it means for such a command to take package pattern arguments, and how the command operates on such arguments
  • Potentially rethink the use of _tool.cue to designate the command specification. We kicked around the idea of *_tool package space, much like Go's *_test external tests
  • Given the previous point, clear designation on how the package defining the commands interacts with any other packages (at the moment there is a rather complicated "merge")
  • Whether to move away from the model whereby $id-style fields are used to identify tasks (the current shape is sketched after this list)
  • As noted by @eonpatapon, we should ideally be able to handle injection "arguments" placed after the command name itself (as well as before). Consider this as part of argument handling generally
  • Per @mpvl, one advantage of keeping cue run as opposed to cuerun is that flags can precede run in cue run. Whether we require this or not remains to be seen; just noting FYI
  • Better defined support for composing commands/tasks. i.e. importing a command/task from a package to allow for reuse. This might extend to include implicit command/task referencing. See bug in cmd/cue: cmd loses track of task dependencies through builtin calls #1834.
  • Ensure that we support command declaration even in the presence of embedded disjunctions. This does not work with cue cmd
  • A means by which commands/tasks can control levels of parallelism. This could be achieved by instances of mutexes/semaphores. A command/task having a reference to a mutex/semaphore would then control parallelism. cmd/cue: support for controlling parallelism in command tasks #709
  • Fixing the case of dependency calculation in the presence of (implicit) default values. See cmd/cue cmd: unable to depend on output of another task #1568
  • Better define when tasks are "ready" to run. See tools/flow: if comprehension bug #1593
  • Revisit and extend the notion of a task being considered done/complete (which we currently achieve via $after), so that other tasks can depend on the "exit code" for a task. Control flow could be introduced to handle failure cases, with a default behaviour of failing the entire command if nothing depends on the failure case. See cmd/cue cmd: unable to depend on output of another task #1568.
  • Related to a number of previous points: better model what the inputs to a command are.
    • The package which defines the command
    • An optional specification of a set of packages/directories where that task should be run (default to $CWD of command)
    • Some indication of how that list of packages/directories should be handled: iterate, ...
    • Other command line arguments []string that could be consumed by a task
  • tool/file.Glob currently passes the argument through to https://pkg.go.dev/path/filepath#Glob, which unfortunately allows ** as an equivalent to *, which can be confusing to users and makes it hard to later decide to start supporting "extended globbing" to recurse into subdirectories.
  • The embed proposal introduces the @embed attribute. The arguments to @embed are designed to follow closely the behaviour of cmd/cue with respect to its inputs. This consistency is intentional. However there is now a large gap in capability and convenience when it comes to loading files via workflows. We should therefore introduce a tool-based function (in an existing/new package) that provides the same functionality as @embed.
  • In Unclear error messages when tag is not provided in _tool.cue commands #3357, @Kryan90 raises that it should be possible to be more explicit about the parameters for a workflow, more specifically which are required, so that better error messages can be reported when running a workflow (as opposed to more generic CUE error messages such as "non-concrete X").
  • ...
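For context, here is a minimal sketch of what a command looks like under today's cue cmd model, which several of the points above refer to. The package, field, and file names are purely illustrative; the task schemas (tool/cli.Print, tool/exec.Run) are the existing ones, and their $id fields are what currently identifies tasks:

// hello_tool.cue, invoked as "cue cmd hello"
package hello

import (
	"tool/cli"
	"tool/exec"
)

command: hello: {
	// A task: its type is identified via the $id carried by the exec.Run schema.
	date: exec.Run & {
		cmd: ["date"]
		stdout: string
	}
	// Depends on the task above implicitly, by referencing date.stdout.
	print: cli.Print & {
		text: "hello, it is " + date.stdout
	}
}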
@myitcv added the Proposal, Triage (Requires triage/attention) and roadmap/cli (Specific tag for roadmap issue #337) labels Oct 25, 2021
@seh
Contributor

seh commented Oct 25, 2021

Should cue cmd allow importing and unifying non-CUE files like other commands that adhere to the interface described in cue flags?

@myitcv
Member Author

myitcv commented Oct 25, 2021

@seh such non-CUE files cannot, by definition, apply to the command specification, so presumably you are referring to data that is consumed as part of the command execution? If so, yes, that is in scope. At the moment, the "merge" of command and package is rather confusing both to implement and to understand. I've added a note above (forgot to do so earlier) about clearer designation of the command specification vs the package on which the command operates. But if you have specific examples to help motivate the design, that would be very useful, thanks.

@seh
Contributor

seh commented Oct 25, 2021

But if you have specific examples to help motivate the design that would be very useful, thanks.

I had tried to use cue cmd to write a filter, where I wanted to supply several JSON file paths, have CUE import them and unify them against a definition (using the --path and --with-context command-line flags), and then run a few steps defined in CUE against that definition's value. I couldn't accomplish this easily using just cue export, as I wanted to tie in the output of calling on some other tools.

Perhaps I could have written the "import JSON and unify it with a definition" part in CUE myself, for lack of being able to use the --path and --with-context flags, but I was disappointed enough that I wound up using a different tool for the task instead.
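For illustration, the kind of thing I mean by writing that part in CUE myself: a rough sketch only, with hypothetical file and schema names, reading one JSON file in a task rather than relying on --path/--with-context.

package filter

import (
	"tool/file"
	"tool/cli"
	"encoding/json"
)

// #Input stands in for the definition I wanted to unify against.
#Input: {...}

command: filter: {
	read: file.Read & {
		filename: "input.json" // hypothetical input path
		contents: string
	}
	print: cli.Print & {
		// Decode the JSON, unify it with the schema, and print the result.
		text: json.Marshal(#Input & json.Unmarshal(read.contents))
	}
}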

@mpvl
Member

mpvl commented Oct 26, 2021

Should cue cmd allow importing and unifying non-CUE files like other commands that adhere to the interface described in cue flags?

Note that this will not be a simple rename, but rather, we would use this opportunity to clean things up. One issue currently is that the files that define the command and the files on which the command is run are the same. This is not always desirable. I could envision having these separated, allowing the aforementioned flags for some of them.

Your specific use case seems like a good one for working out the specifics here, so it would be helpful to get some more detail.

@myitcv added the NeedsDesign (Functionality seems desirable, but not sure how it should look like.) and pre v1.0 labels and removed the Triage (Requires triage/attention) label Nov 19, 2021
@StevenACoffman

Not sure if this belongs in this same issue, but I would love to be able to visualize a cue workflow. Even a transformation from the (tool/flow) DAG into graphviz dot or something would be handy.

There is this example that dumps the current state into a mermaid graph:

cue/tools/flow/flow_test.go (lines 230 to 240 at fa141c2):

func mermaidGraph(c *flow.Controller) string {
	w := &strings.Builder{}
	fmt.Fprintln(w, "graph TD")
	for i, t := range c.Tasks() {
		fmt.Fprintf(w, " t%d(\"%s [%s]\")\n", i, t.Path(), t.State())
		for _, t := range t.Dependencies() {
			fmt.Fprintf(w, " t%d-->t%d\n", i, t.Index())
		}
	}
	return w.String()
}

But that's not easily reusable and you'd probably have to run it after the flow is done since tasks can be dynamically generated based on the output of another.

@StevenACoffman

I had some further thoughts on workflows/pipelines. Since make supports -Bnd, there's a program, make2graph, that will parse that output and render it in the graphviz dot file format, such that even dynamically generated targets are accounted for. We could even use that approach to translate a gnarly existing Makefile into a skeleton CUE flow.

make supports these options:

  • -B, --always-make Unconditionally make all targets.
  • -n, --just-print, --dry-run, --recon Print the commands that would be executed, but do not execute them.
  • -d Print debugging information in addition to normal processing. The debugging information says which files are being considered for remaking, which file-times are being compared and with what results, which files actually need to be remade, which implicit rules are considered and which are applied---everything interesting about how make decides what to do.

This is how I use make2graph on my local (mac) machine:

brew install openssl graphviz
export LDFLAGS="-L/usr/local/opt/openssl@3/lib"
export CPPFLAGS="-I/usr/local/opt/openssl@3/include"
git clone git@github.com:lindenb/makefile2graph.git
cd makefile2graph
make
mv make2graph /usr/local/bin/make2graph
cd ~/WHEREVERYOURMAKEFILEIS/
make -Bnd YOURTARGET | make2graph | dot -Tsvg -o deps.svg

@cedricgc
Contributor

My thoughts based on what I want to see:

I would like command definitions to be simply data. Unless cue run reads in the command definitions, the data itself (and tool/* defs) should not hold any semantic meaning. Using conventions like _tool.cue or separate packages for commands inserts implicit context that ties CUE code to the file structure with special behavior.

Looking forward, we can imagine that a single CUE package may hold definitions that multiple separate tools would want to read, not only cue run. In that scenario I think it is especially important that cue run behaves in the same consistent way that third-party tools would: CUE definitions are just data; it is the programs that give semantic meaning in their context.

There are nuances in how that would work but the general goal would be for cue run commands to simply be definitions that are treated the same as other definitions. They can be defined with other CUE code and imported from packages.

A semi-related note: _tool.cue is the mechanism that signals that a struct is a CUE command. Instead, should we consider declaring that the data is a command in the definition itself? Here is an example based on Kubernetes:

{
	apiVersion: "resources.cuelang.org/v1"
	kind:       "Command"
	metadata: name: "foo"
	spec: {
		// Command schema here
		...
	}
}

We can see from the above that we can avoid name collisions with a domain namespace, which opens up a decentralized way for developers to define resources. This allows definitions to coexist based on a standard. A resource model is a discussion in itself, but I thought this would be a good way to demonstrate how cue run configurations could live alongside other CUE code without special behavior.
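As a sketch of how that could look (field names entirely hypothetical, and no special file naming), a package could mix command resources with other data, and cue run, or any third-party tool, would select what it cares about by kind:

resources: [
	{
		apiVersion: "resources.cuelang.org/v1"
		kind:       "Command"
		metadata: name: "build"
	},
	{
		apiVersion: "example.com/v1"
		kind:       "Deployment"
		metadata: name: "web"
	},
]

// What a tool such as cue run would read; everything else it ignores.
commands: [for r in resources if r.kind == "Command" {r}]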

@mvdan
Member

mvdan commented Mar 26, 2023

Related to a redesign of cue cmd, I would argue that we should no longer make cue foo an alias for cue cmd foo. For example, I momentarily forgot that we don't have an analogous command to Go's go list, so I was stumped by these errors:

$ cue list -h
cannot find package "-h"

The top-level cue command also has to do quite a bit of heavy lifting to figure out whether the first argument might be a custom command or not. Sometimes this causes surprising slowdowns, like #2161. And in general, it makes the implementation of cue's top-level command quite complex, as it needs intricate logic to treat "unknown command" as a fallback to loading custom commands.

I've also been bitten by this when typoing standard top-level commands. For example:

$ cue exprot
[spends seconds loading the package]
command "exprot" is not defined
Ensure commands are defined in a "_tool.cue" file.
Run 'cue help' to show available commands.

I think this command should fail immediately with a better error message, and not even try to run a custom command. Otherwise, beyond the confusing error messages and the time spent loading the current CUE package, whenever I run a top-level command there's the slight worry that maybe I'm running a custom command without realizing it.

I would personally not support shortcuts or aliases for running commands or tasks defined in CUE. Typing cue cmd foo, cue run foo, or cuerun foo is clear and explicit, and it's not that many keystrokes.

myitcv added a commit that referenced this issue Mar 28, 2023
Running cue export on the github package is a good way of tracking down
issues with workflow declarations that are otherwise impossible to find
when running cue cmd to regenerate the yml outputs. These bugs are known
about and effectively captured by #1325.

Right now, because we declare the field named base to be the package
value of the imported base package, such an export fails because there
are many fields of that package value that cannot be exported because
they are not concrete.

Fix that by making the field hidden, which allows the cue export trick
to work again.

Signed-off-by: Paul Jolly <paul@myitcv.io>
Change-Id: Icd9df2beab51ff2ae34ac3eb641d6c739b01e2ce
cueckoo pushed a commit that referenced this issue Mar 28, 2023
Reviewed-on: https://review.gerrithub.io/c/cue-lang/cue/+/551826
Unity-Result: CUEcueckoo <cueckoo@cuelang.org>
TryBot-Result: CUEcueckoo <cueckoo@cuelang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
@myitcv
Member Author

myitcv commented Apr 3, 2023

I just raised #2324 as an aide memoire so that I don't forget about it. It's related to this issue, but only insofar as cuerun would almost certainly be built on top of tools/flow or tools/flow/v2.

@myitcv
Member Author

myitcv commented Aug 9, 2023

Adding details of a report from Slack describing the confusion that arises when calling a command with a pattern argument.

What did you do?

# Export with itemsList_tool.cue in place
exec cue export ./...
cmp stdout stdout.export-pre.golden

# Verify that cue cmd ls ./... works
exec cue cmd ls ./...
cmp stdout stdout.ls-pre.golden

# Note that a single ls command is run, that combines
# the results of all the packages that result from the
# expansion of ./...

# Now rename itemsList_tool.cue -> itemsList.cue
mv itemsList_tool.cue itemsList.cue

# Export with itemsList.cue in place
exec cue export ./...
cmp stdout stdout.export-post.golden

# Attempt to ls in this state
exec cue cmd ls ./...

-- cue.mod/module.cue --
module: "mod.com"
-- x.cue --
package x

items: {}
-- ls_tool.cue --
package x

import (
	"tool/cli"
	"strings"
)

command: ls: cli.Print & {
	text: "ls: " + strings.Join(itemsList, " ") + "\n"
}
-- itemsList_tool.cue --
package x

itemsList: [ for _, v in items {v}]
-- a/x.cue --
package x

items: {
	a: "a"
}
-- b/x.cue --
package x

items: {
	b: "b"
}
-- stdout.export-pre.golden --
{
    "items": {}
}
{
    "items": {
        "a": "a"
    }
}
{
    "items": {
        "b": "b"
    }
}
-- stdout.ls-pre.golden --
ls: a b

-- stdout.export-post.golden --
{
    "itemsList": [],
    "items": {}
}
{
    "itemsList": [
        "a"
    ],
    "items": {
        "a": "a"
    }
}
{
    "itemsList": [
        "b"
    ],
    "items": {
        "b": "b"
    }
}

What did you expect to see?

There are any number of potential expectations (hence the plan to address this with cuerun):

  1. cue cmd ls runs once per instance, for each instance that results from the ./... expansion, with a working directory of the directory of the instance.
  2. cue cmd ls runs once per instance, for each instance that results from the ./... expansion, with an unchanged working directory.
  3. cue cmd ls runs a single command with all packages that result from the ./... expansion unified together.
  4. cue cmd ls runs a single command with all packages that belong to the same package that result from the ./... expansion unified together.
  5. ...

Part of the challenge is that it's not clear which of these cue cmd actually does, nor is it possible (or at least not clear how) to select one of the other interpretations.

Noting #1138 in the context of the expansion of ./....

What did you see instead?

# Export with itemsList_tool.cue in place (0.069s)
# Verify that cue cmd ls ./... works (0.077s)
# Note that a single ls command is run, that combines
# the results of all the packages that result from the
# expansion of ./...
# Now rename itemsList_tool.cue -> itemsList.cue (0.000s)
# Export with itemsList.cue in place (0.070s)
# Attempt to ls in this state (0.077s)
> exec cue cmd ls ./...
[stderr]
itemsList: incompatible list lengths (0 and 1)
[exit status 1]
FAIL: /tmp/testscript1691469317/repro.txtar/script.txtar:21: unexpected command failure

Given the expectations above, it's hard to know whether this is "right" or "wrong".

@from-nibly

Not sure if this is helpful (and this might be a bit of an abuse), but I run a command cue snap ./... or cue cmd snap ./... which rolls up all the files in a repo into a single mega package/object/whatever and dumps junk out into various yaml files.

So my expectation would be that if I run with ./..., not only will my _tool.cue file be somewhere in ./..., but it will run one command with everything in ./... rolled into one.

Now all files in my repo are part of one mega package (again, probably an abuse), but my expectation would be to run against the one package it finds (if it finds only one), OR fail and expect you to provide a package that you would like the command to run against.

I think running a command implicitly multiple times, once per package would not be intuitive.

I also think setting the working directory to the package root would only make sense if you were to run multiple instances of the command once per package.

Generally speaking, when you run commands on the command line, the perspective of what takes place is the cwd where the command was run, not the files the command is executing against.

You can always add extra explicit arguments on top of the default to get things like parallel per-package instances of a command, or, you know, write a bash command to do the same thing (Unix style).

Those are my opinions, and expectations if that is helpful.

@verdverm

verdverm commented Aug 29, 2023

This is helpful; there is a broader conversation to be had about the UX/DX of cue/flow and cue run.

You can actually build your own cue/flow implementations, and we have a hof flow that moves the flow/task definitions to regular CUE files. We ignore _tool.cue files; those could go away in the default cue run implementation, and based on our experience I would support that. We also added bulk processing to our flow, so you can run a workflow on a series of inputs independently, and we added parallelism control and building blocks. Some of this is experimentation for the sake of it, but some of it is also "we want certain task types that will never make it into the default implementation". We anticipate some of these will have a version in the default implementation, like running a server and using flow for the handlers.

I'd be very curious to know if hof/flow addresses some of your problems: https://github.com/hofstadter-io/hof

@from-nibly

So I don't have any problems with cue cmd snap ./...; currently it behaves how I need it to. I can definitely understand why a revamp is needed, though. There are some things that work but feel implicit and vague. I've got it working how I want, but that doesn't mean it would work that way for everyone. I just wanted to lay out what I would expect from any kind of system like cue run/cmd.

@myitcv removed the zGarden label Dec 13, 2023
cueckoo pushed a commit that referenced this issue Apr 23, 2024
Thanks to Paul Jolly for writing the testscript in a comment on #1325.
It is added here with some minor tweaks: adding more comments,
and reflecting the current behavior with the last command failing.

Without such a test, it's hard to notice when we might break users
who rely on the existing behavior, intentionally or not.

Signed-off-by: Daniel Martí <mvdan@mvdan.cc>
Change-Id: I82d0229d085e8d63fc3e170e2c92f66434159117
cueckoo pushed a commit that referenced this issue Apr 24, 2024
Reviewed-on: https://review.gerrithub.io/c/cue-lang/cue/+/1193717
Unity-Result: CUE porcuepine <cue.porcuepine@gmail.com>
TryBot-Result: CUEcueckoo <cueckoo@cuelang.org>
Reviewed-by: Paul Jolly <paul@myitcv.io>
@mvdan
Member

mvdan commented Jun 20, 2024

Another sharp edge: tool/file.Glob currently passes the argument through to https://pkg.go.dev/path/filepath#Glob, which unfortunately allows ** as an equivalent to *, which can be confusing to users and makes it hard to later decide to start supporting "extended globbing" to recurse into subdirectories.
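To make that concrete, a small sketch (file layout assumed) in which, given today's pass-through to filepath.Glob, both tasks match exactly the same files, because ** does not recurse:

package x

import (
	"tool/cli"
	"tool/file"
)

command: globdemo: {
	flat: file.Glob & {glob: "*.yaml"}
	deep: file.Glob & {glob: "**.yaml"}
	print: cli.Print & {
		// Today both lists are identical: "**" behaves the same as "*".
		text: "flat=\(len(flat.files)) deep=\(len(deep.files))"
	}
}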

@infogulch

One thing that might be nice is if there were a way to control flow for a whole group of tasks at once. For example, setting the $done key on a struct that is not a task itself but contains tasks, where its value is materialized when all child tasks are completed. And symmetrically, add a way to make all of the tasks in a group depend on something outside it. This would allow users to build groups of tasks into units that act like a single task; aka a very basic form of abstraction.

@myitcv
Member Author

myitcv commented Jun 26, 2024

setting the $done key on a struct that is not a task itself but contains tasks, where its value is materialized when all child tasks are completed.

Whilst this wouldn't necessarily be done via a $done field, this is absolutely something we want to support more natively. It's entirely possible to build up a list of dependencies using a comprehension today, but we also want to be able to refer to sets of tasks, for example, by their containing struct and have that be equivalent.
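For example, building up such a list by comprehension might look like this today (the group/done field names are illustrative, and this assumes $after accepts a list of task references):

package x

import (
	"tool/cli"
)

command: x: {
	group: a: cli.Print & {text: "a"}
	group: b: cli.Print & {text: "b"}
	done: cli.Print & {
		text: "group done"
		// Build the dependency list with a comprehension over the group.
		$after: [for _, t in group {t}]
	}
}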

add a way to make all of the tasks in a group depend on something outside it

This is possible today using pattern constraints:

package x

import (
	"tool/cli"
)

command: x: {
	print: cli.Print & {
		text: "hello"
	}
	after: [string]: {
		$after: print
	}
	after: print1: cli.Print & {text: "after 1"}
	after: print2: cli.Print & {text: "after 2"}
}

I'm not clear we need/want another mechanism for expressing this.

@infogulch

infogulch commented Jun 26, 2024

Using $after with a reference to the dependency copies its entire definition into the dependent which absolutely decimates readability if you cue export it. (I assume it's memoized so it doesn't blow up memory.) This issue can be mitigated by using .$done, but then this pattern doesn't work if print itself contains both tasks and sets of tasks. We need a .$done that works recursively, IMO. Maybe this is still possible to write into a pattern.

@infogulch

infogulch commented Jun 26, 2024

Some more ideas (edited):

  1. The mermaid export is nice for debugging execution flow, but there's not much to do if you need to debug your cue definition. It would be nice to have some way to export/print a cue expression as of a particular step. Maybe a new encoding/cue package + cli.Print?
  2. With many commands running at once it can be quite some effort to determine which lines are from which command, so it would be nice to have a way to prefix the stdout lines of a command with its task id.
  3. It would be nice to pause execution until after a command outputs some pattern to stdout, but before it exits. E.g. a flow like start http server -> wait for server to print "server started" -> run http tests. This can be done by outputting to a logfile and grepping it, but this is not very nice.
  4. It would be nice to have some "cleanup" commands that always run, even if prior commands fail. Something like a finally after a try block, or a go defer.

@rogpeppe
Member

The initial text of this issue says:

  • Currently cue cmd does not have a means by which the user can specify the package (path) from which the command is derived: it is currently ..

This is not quite right. The command is derived from the *_tool.cue files of the packages mentioned explicitly on the command line, unified together, but not from their dependencies.

Here's a demonstration of the behaviour of packages mentioned on the command line. Note that there is no package in the current directory:

exec cue cmd print ./a ./b
cmp stdout want-stdout

-- want-stdout --
printed by package b
printed by package a 2
-- cue.mod/module.cue --
module: "test.example"
language: version: "v0.9.0"

-- a/a_tool.cue --
package a

import (
	"tool/cli"
	a "test.example/b:a"
)

command: print: {
	print: cli.Print & {
		text: "printed by package a \(a.x)"
	}
}

-- b/b.cue --
package a

x: *int | 1

-- b/b_tool.cue --
package a

import (
	"tool/cli"
)
x: 2

command: print: {
	other: cli.Print & {
		text: "printed by package b"
	}
}
