Conditional statement #148

remaininlight · 2018-11-09T00:42:21Z

So this isn't actually a pull request yet, it's a rough sketch of what a conditional statement might look like (I thought Maxim could use one?). However, this is my first time writing Rust and I've only dabbled lightly in compilers before.. so apologies for the sharp edges and lack of it compiling atm.

I thought I'd ask your opinions @cpdt before I go too much further/ make any structural changes to what already exists. Any pointers are useful, I suspect some change might be necessary to codegen but I don't want to be too brutal - I imagine you probably have some reasons for the current architecture!

Also, how are you debugging the parser at the moment? I thought some tests (like https://github.com/TheDan64/inkwell/tree/master/tests/all) might be useful but haven't set any up yet.

Parsing

Maxim code is matched by regexes in parser/token_stream.rs. Added (broken I suspect) matchers for if, then and else.
Added tokens If, Then, Else to parser/token.rs
Added parse_conditional_expr to parser/mod.rs (loosely from https://github.com/TheDan64/inkwell/blob/master/examples/kaleidoscope/main.rs#L663)

AST

Added ConditionalExpression and new_conditional. Should the fields on this be SubExpressions or something else?

Mir

ast::Blocks with Expressions are converted into CompileResultmir::Blocks with Statements
Added Conditional Statement to mir/block/statement.rs
Added lower_conditional_expression and matcher for it to pass/lower_ast.rs. What is the idea behind lowering the ast? Haven't come across it before.

Codegen

Added matcher for Statement::Conditional to codegen/block/mod.rs
Created codegen/block/gen_conditional.rs (from https://github.com/TheDan64/inkwell/blob/master/examples/kaleidoscope/main.rs#L999). This is the part I'm probs most unsure about, in the inkwell example everything is converted to floats using compile_expr. I'm not sure how to do the conditional comparison here or how best to return a PointerValue?

cpdt

This is awesome! I've gone through and written a few comments on some places, but I'll also answer your questions/mention my thoughts on the overall goal here. Be prepared for several walls of text :D

Yeah, there aren't any tests for the compiler yet. Probably would've been a good idea to write those while I did the rewrite several months ago, but my free time was relatively scarce at the time :D I'll get around to writing proper tests for everything there so far eventually, if you want to set some up for the parts you're touching feel free (but there's no obligation to).

So, there's actually a reason why there's no conditional support in Maxim at the moment. It's completely open to discussion (and I would welcome your thoughts!) but basically my thinking is I'd rather have the ability to conditionally execute things on the node level (i.e turning on and off groups of nodes) than in code. There are a couple reasons:

Having it on the node level is much more powerful, since it would allow you to disable/enable more than one node based on the same condition (whereas having it in code means you need to duplicate the condition) and would allow you to turn on and off things like group nodes (which don't have code, they contain another surface inside of them). Extractors already kinda do something similar. If you're not familiar, they basically clone a group of nodes 32 times and then individually enable/disable each group - they're useful for things like polyphony, and can even be used as a primitive form of conditional execution (by just enabling/disabling one of the instances) although that's quite wasteful.
I'm worried that having branching inside a node could lead to some odd, counter-intuitive behavior around mutable things. For example, the compiler uses the code to determine which direction a control's value goes - if it's read by the code it's treated as an input, if it's written to it's treated as an output. This is used to determine some things like node evaluation order, if a control shares it's value among all extracted instances, etc. But let's say a control is only written to in one of the branches - while that branch is active the control acts as an output (overwriting whatever's put into it), but when it's not active it can suddenly be changed. From the compiler's perspective I don't think this would be a problem, but it could be hard to notice in the editor leading to unexpected things happening. Conditionals on the node level would still have this behaviour but it would be more visible as the disabled nodes would be greyed out (I'd imagine).
There's also the question of controls that do some processes each sample (like graph controls), and functions which store some state (like the delay function). Controls are updated before running the code in a node, which would mean they'd always be updated even if they're only used in a branch that isn't active. Functions, however, are run at the point where they're called, which means they wouldn't update if the branch wasn't active. This might be counterintuitive, but I'm not sure...

Let me know if that makes sense. I'd like to hear your thoughts.

Regarding structural changes, you can see my individual comments for more details. There is one thing I think you missed though:

In compiler/src/mir/var_type.rs, you'll need to add a branch to of_statement to determine the type that the conditional expression returns. Assuming you've checked both the consequence and alternative expressions return the same type, I guess you could just recurse to get the type of one of those.

compiler/src/ast/expression.rs

cpdt · 2018-11-09T10:19:50Z

compiler/src/codegen/block/gen_conditional.rs

+    index: usize,
+    condition: &usize,
+    consequence: &usize,
+    alternative: &usize,


Any reason why you're passing references to usizes here? Just usize should be fine.

compiler/src/codegen/block/gen_conditional.rs

cpdt · 2018-11-09T10:20:16Z

compiler/src/codegen/block/gen_conditional.rs

+    // TODO Why is float created here?
+    // TODO Find a way to invoke a compile_expr function
+    let condition = self.compile_expr(condition)?;
+    let condition = builder.build_float_compare(FloatPredicate::ONE, condition, zero_const, "ifcond");


Numbers in Maxim aren't actually just floats, rather they're a structure containing a u8 (the form) and an f32 2-vector (the left and right channels). There's a utility called NumValue that wraps a pointer to one of these structs and provides methods to get the components, e.g.

let condition_num = NumValue::new(condition); let condition_vec = condition_num.get_vec(builder);

This does bring up the question of how the vector is handled for the conditional branch. Normally when a function needs only a single float it takes the left channel value, but since here we just need a condition instead of an actual value I'd be open to making it AND or OR between the channels.

For the first case (taking the left channel), you can use build_extract_value to get one of the values out of the vector. Here's an example.

cpdt · 2018-11-09T10:20:49Z

compiler/src/mir/block/statement.rs

@@ -74,6 +81,7 @@ impl Statement {
            | Statement::Extract { .. }
            | Statement::Combine { .. }
            | Statement::LoadControl { .. }
+            | Statement::Conditional { .. }
            | Statement::CallFunc { .. } => false,
            Statement::StoreControl { .. } => true,


This is fine for the current structure (since the conditional doesn't 'contain' the consequence/alternative statements), but it would be incorrect if the statements where inside the conditional as described above since the statements inside could have side effects.

It's worth mentioning that this is used to remove dead code (if a statement doesn't have side effects and isn't used by another statement, it's removed), so it's pretty important to be correct!

Cool, I'll leave this comment open so I don't forget..

compiler/src/parser/mod.rs

cpdt · 2018-11-09T10:21:02Z

compiler/src/parser/mod.rs

+
+        let end_pos = Parser::expect_token(TokenType::CloseBracket, stream.next())?
+            .pos
+            .1;


Is there a reason it needs to end with a close bracket? Unless I'm reading this incorrectly, it looks like the syntax right now would be:

if cond then something else something_else)

Which I'm guessing isn't what you're intending.

cpdt · 2018-11-09T10:21:07Z

compiler/src/parser/token_stream.rs

+        // TODO Make conditional matchers match conditionals
+        get_matcher(r"if", TokenType::If),
+        get_matcher(r"then", TokenType::Then),
+        get_matcher(r"else", TokenType::Else),


The regex here is fine, although I think this needs to be earlier in the array. The token goes from top to bottom, which means as it stands it'll tokenize "if", "then" or "else" as an identifier instead of one of these tokens. Putting it after the multi-char tokens should be fine, probably under the header "indentifiers".

cpdt · 2018-11-09T10:21:11Z

compiler/src/pass/lower_ast.rs

+                consequence,
+                alternative,
+            })
+        )


A few things here:

You can do &expr.condition instead of expr.condition.as_ref() to get a reference to the value.

As I've mentioned previously, this is actually going to emit code that means both the consequence and alternative expressions are executed before the conditional statement, which probably isn't what you want. But the solution is complicated :)

You'll need to typecheck the condition to make sure it's a num - self.check_statement_type will do that.

You'll need to make sure both expressions have the same type.

remaininlight · 2018-11-09T19:11:52Z

Amazing, thanks for the detailed feedback! Interesting learning how it's all put together.

I completely agree with you on disabling/enabling functionality on the node level being a good way to go most of the time and look forward to checking out Extractors. My desire for a conditional is motivated by wanting to be able to create midi events for generative and interactive music ie.

if sensor_value == 1:
    // Send midi note

There shouldn't be anything too heavy on their branches and tbh I don't think I'd even use the else that much, so for now I think changing the MIR structure shouldn't be necessary atm - given that evaluating both then and else should be cheap.

I'd wondered how inputs and outputs were determined, that sounds neat - I'll check it out when I have time. I love the flexibility of being able to modulate all the controls with each other but I think it might be a bit confusing for your average musician/ casual user. If they form part of your intended audience I wonder if, at some point in the future, some way of hiding internal state from the control surface and overriding each control's status as an input or output could be a good idea - perhaps in the right-click menu for that control? Otherwise, if people are making nodes themselves imo they should know better than to plug signals into outputs etc.

Re: the difference in updates for controls and functions, I may be missing some subtlety here but it sounds like the functions would behave correctly and there would be some extra processing for a control which isn't active on that branch? I think that a constant cost for each control is not too unexpected, painting them on the screen probably gives you a comparable overhead. I admire your eye for detail in elegance/efficiency (I guess this is from demoscening?) but if Axiom is being used on a PC I don't think anyone would ever notice ;) + If they do then they could do their switching at the node level.

I look forward to progressing through your comments over the next few days.

cpdt · 2018-11-14T10:35:03Z

@remaininlight Hey, sorry for taking a while to get back to you... had a busy weekend :)

That makes sense with the branching. However, I am concerned that using that kind of "traditional" if/else structure will mean people will assume only one branch is evaluated instead of both... That's fine in a lot of cases, but it becomes a bit of a problem when the branches have side-effects (e.g. setting a control's value or calling a function like delay that has some internal state). It would probably also make the result of something like this quite confusing:

if some_cond:
    my_var = 2
else:
    my_var = 3

Since both branches are executed, my_var would end up always being 3 which doesn't make much sense!

Something that would be a bit easier to implement (wouldn't require changes in the parser or MIR) is to define a new if function, that looks like this:

my_var = if(some_cond, 2, 3)

I think that makes the semantics a bit more clear. What do you think?

Regarding the MIDI use-case specifically: yeah, that's been something I've been wanting to implement for a while, but I'm not 100% sure on the best way to do it yet. I wonder if something like:

out:midi = midi_note(gate, note, velocity)

would work better than having an explicit if? The way I'm imagining that is that internally the function would emit a note on event when gate rises, then a note off when it falls. Might be a bit more error-proof than providing functions to just emit MIDI events :)

As for inputs/outputs, yeah I have some plans around that to make it easier to understand. I think what you're talking about with hiding internal state is something that group nodes already handle nicely (if I'm understanding your point correctly) - they basically allow you to have a whole node area inside of them, and you can 'expose' the controls you want from inside that to show on the group node's control surface. I do think we can do a lot more around showing which controls are inputs and outputs, and which ways data is flowing through connections. Currently input number controls default to knobs while outputs default to "plugs" (which look kinda like read-only knobs), although I have considered adding arrows to knobs and connections to make that more clear. I think that added clarity will also allow us to add more behaviour around how connections interact, too. For example, if you currently connect two outputs into one input, one of the outputs will override the other (and it's not defined which). But a more useful behaviour here would probably be to add together the values of those two inputs, which could get quite confusing without some way to easily see the direction of connections.

As for control/function updates, it's less about performance (which is still very important when you're running the functions 44100 times per second :D) and more about side-effects. For example, the graph control has a bar moving across that shows the current time in the graph. The position of this bar (and the state of the graph) is updated in the control's update function. I was imagining a case where someone was only using the control inside a branch, and was possibly expecting the graph to become 'disabled' (i.e stop moving) when that branch isn't active, which wouldn't happen because control updates are always run. But I'm not too sure if that actually would be counter-intuitive, and it doesn't really matter with the "branchless" design anyway.

cpdt · 2018-11-14T10:38:17Z

compiler/src/codegen/block/gen_conditional.rs

+        (&else_val, &else_bb)
+    ]);
+
+    phi.as_basic_value().into_pointer_value()


It's worth mentioning, if we're going for a "branchless" approach (i.e both sides are evaluated), LLVM actually has an instruction called select for this that means we don't need to do any branching/basic blocks/phi nodes in the IR. You give it a boolean and two values of the same type, and it returns either the left or right of the values depending on the boolean.

Great, thanks - I've attempted to implement this in an if function on this pull request but it's not quite working yet..

remaininlight · 2018-11-15T00:14:07Z

No worries, I can relate! :) I think I'm slowly starting to get my head more into Maxim, now I see that in the case where both branches are executed it would not be possible to conditionally execute only one of them (by definition..).

I had actually started making an if function when I first came to this but abandoned it in favour of chasing the if statement. I agree the function is clearer for the case when both branches are executed, it looks like it might be used in the place some languages use a ternary statement:

my_var = some_cond ? 2 : 3

Though I guess that's still a little ambiguous as to what is executed. I'm not sure if Maxim can return different types from the same function (ie. can a function return a midi message or something which casts to no midi message?) - would it be possible to conditionally send a midi note with this? Perhaps something like:

out:midi = if(some_cond, midi_note(channel, note, velocity), 0)

A midi_note function could be a good way to go for many people's use cases, though if a note on/off is triggered each time the gate changes I guess you could end up with 44100 notes per second from a continuously changing signal? Maybe having gate non-zero being a note and zero being no output could work.

Note ons with zero velocity vs note offs are pragmatically the same thing in my experience (in fact I think it says something along those lines in the MIDI standard somewhere) but some people might also want to be able to send note offs with velocities. Most libraries I've seen have two functions, meaning the note_on function (which is used most of the time) doesn't have to have too many parameters and people can still use a note_off function when they want it.

Yes, now you mention it, groups do that well. Arrows sound good, as does adding the values of the connections together on one input. Re: controls and function updates I am guided by your knowledge of how these pieces fit together :)

For now I have implemented an if function which is on this pull request. It doesn't quite work.. it doesn't switch when the conditional input changes but I thought I'd send it over anyway - I have to be up early and can't quite finish it tonight.

cpdt

Hey, looks good! Yeah, it is roughly equivalent to a ternary expression, although those normally also conditionally execute only one side or the other.

MIDI values in Maxim aren't actually just a single MIDI event, instead they're an array of up to 16 events... so a single MIDI connection in the editor, or variable in Maxim, can carry up to 16 things happening at the same instant. This means that "no MIDI message" would just be an empty array. It also means you could theoretically combine two event values by concatenating the arrays if you wanted to, say, add a new event into a 'stream'.

Yeah, that was what I was meaning by rising/falling edge - when the gate goes from 0 to something that's not zero it generates note on, and vice versa with note off. The point about note off velocities is interesting, I don't think Axiom currently does anything with those since I wasn't too sure how it would actually be used. Maybe you can provide some insight to how that would be useful here?

I've left a few more comments on the new code, let me know your thoughts :)

compiler/src/codegen/functions/if_function.rs

cpdt · 2018-11-15T12:12:02Z

compiler/src/mir/block/function.rs

@@ -106,7 +106,8 @@ define_functions! {
    Note = "note" func![(Midi) -> Tuple(vec![Num, Num, Num, Num])],
    Voices = "voices" func![(Midi, VarType::new_array(Num)) -> VarType::new_array(Midi)],
    Channel = "channel" func![(Midi, Num) -> Midi],
-    Indexed = "indexed" func![(Num) -> VarType::new_array(Num)]
+    Indexed = "indexed" func![(Num) -> VarType::new_array(Num)],
+    If = "if" func![(Num, Num, Num) -> Num]


So, this is declaring that the function takes in numbers and output numbers, which means the compiler will only accept calling it with numeric values and not MIDI ones (or arrays, or tuples) whereas we probably want it to accept/return any type. There's no way to define generic functions at the moment, but I don't think a basic implementation of that should be hard. If you want to give that a shot you can, otherwise I'm happy to look at it.

I've had a swing at an Any type, which so far has involved disabling type safety for it and it only allocates space for numbers atm from AnyValue::get_type.. which doesn't seem ideal. Let me know if I'm headed in the right direction!

remaininlight · 2018-11-18T18:33:42Z

compiler/src/pass/lower_ast.rs

@@ -769,7 +771,9 @@ impl<'a> AstLower<'a> {
        expected_type: mir::VarType,
        actual_type: mir::VarType,
    ) -> Option<CompileError> {
-        if expected_type != actual_type {
+        if (expected_type != actual_type)


Type safety checking is disabled for Any type here. It would be good if there was some way of linking the output type of a function etc to its input types? ie. an if function has midi messages as its arguments, so the output takes the type of a midi message, haven't worked out the best way to do this yet.

remaininlight · 2018-11-18T18:35:30Z

compiler/src/codegen/values/any_value.rs

+
+impl AnyValue {
+    pub fn get_type(context: &Context) -> StructType {
+        // TODO Just returning Number structure for now. Can this infer what its correct type should be from context?


Not sure what should be returned here, think some restructuring might be necessary in order to infer the actual type the AnyValue is representing and return space for that?

remaininlight · 2018-11-18T18:40:28Z

compiler/src/mir/block/function.rs

@@ -106,7 +106,8 @@ define_functions! {
    Note = "note" func![(Midi) -> Tuple(vec![Num, Num, Num, Num])],
    Voices = "voices" func![(Midi, VarType::new_array(Num)) -> VarType::new_array(Midi)],
    Channel = "channel" func![(Midi, Num) -> Midi],
-    Indexed = "indexed" func![(Num) -> VarType::new_array(Num)]
+    Indexed = "indexed" func![(Num) -> VarType::new_array(Num)],
+    If = "if" func![(Num, Num, Num) -> Num]


I've had a swing at an Any type, which so far has involved disabling type safety for it and it only allocates space for numbers atm from AnyValue::get_type.. which doesn't seem ideal. Let me know if I'm headed in the right direction!

cpdt · 2018-11-20T13:07:06Z

Hey! Just wanted to say, thanks for the continued interest/work on this :)

Unfortunately I don't think we'll be able to get away with just an Any type. Since we're compiling code to bare-metal it's pretty important that we do actually know what the types are - bad stuff would happen if a user tried to do this for example:

# 'something' is typed as Any but contains a Midi value
something = if(cond, note_on(), note_off())
# out:num expects a Num, since 'something' is Any it passes the type checker, but it'll crash when emitting the LLVM IR
out:num = something

What we need is a way for a function declaration to define generic parameters, and then the AST -> MIR lowering pass can check this and change the call to be to a concrete function (i.e one with actual types defined). Then we just need a way to instantiate the concrete function, which I don't think should be complicated. I can have a look at getting this infrastructure in place sometime soon if you want, since it'll probably get quite far down into the nitty-gritty of how all the pieces work together :)

This does raise a question though, of if we want to allow the function to have different behaviour for different types. For example, numeric values contain two floats for left and right... if you have an expression like:

if(cond, a, b)

and cond is equal to {0, 1}, should the result be {b.left, a.right}? Or should we just take the left channel value like we'd do for other types (since they don't have two channels) and return b? I kinda prefer the first one since it better mirrors how other numeric functions work (i.e they operate independently for each channel), but it might be counterintuitive that it doesn't work that way for other types?

cpdt reviewed Nov 9, 2018

View reviewed changes

cpdt reviewed Nov 14, 2018

View reviewed changes

If function, not switching value on condition change

1910cf3

remaininlight force-pushed the dev branch from 2beff09 to 1910cf3 Compare November 14, 2018 23:56

cpdt reviewed Nov 15, 2018

View reviewed changes

remaininlight and others added 2 commits November 18, 2018 18:05

If function working with Any type. Type safety bypassed for Any type..

692178f

Merge branch 'master' into dev

05780ce

remaininlight commented Nov 18, 2018

View reviewed changes

Updated number of functions in mir block after merge from master

49bc440

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Conditional statement #148

Conditional statement #148

remaininlight commented Nov 9, 2018

cpdt left a comment •

edited

Loading

cpdt Nov 9, 2018

cpdt Nov 9, 2018

cpdt Nov 9, 2018

remaininlight Nov 9, 2018

cpdt Nov 9, 2018

cpdt Nov 9, 2018

cpdt Nov 9, 2018

remaininlight commented Nov 9, 2018

cpdt commented Nov 14, 2018

cpdt Nov 14, 2018 •

edited

Loading

remaininlight Nov 15, 2018

remaininlight commented Nov 15, 2018

cpdt left a comment

cpdt Nov 15, 2018

remaininlight Nov 18, 2018

remaininlight Nov 18, 2018

remaininlight Nov 18, 2018

remaininlight Nov 18, 2018

cpdt commented Nov 20, 2018

Conditional statement #148

Are you sure you want to change the base?

Conditional statement #148

Conversation

remaininlight commented Nov 9, 2018

cpdt left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

remaininlight commented Nov 9, 2018

cpdt commented Nov 14, 2018

cpdt Nov 14, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

remaininlight commented Nov 15, 2018

cpdt left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cpdt commented Nov 20, 2018

cpdt left a comment •

edited

Loading

cpdt Nov 14, 2018 •

edited

Loading