# 5 - The Language

### 5.1 - Lexical Conventions

CSpydr is free-form, meaning that whitespace and comments are ignored, except in strings, in characters and as delimiters between tokens. Whitespace consists of spaces, carriage returns and tabs. Only ASCII characters are accepted.

Identifiers consist of a series of alphanumeric characters or underscores, not starting with a digit. CSpydr is also case-sensitive, meaning that `myid` and `myID` are two different identifiers. When compiled with the C backend, identifiers may not start with `__csp_`.

CSpydr has a series of keywords, which cannot be used as identifiers.

**Keywords:**
```
true false nil let fn loop while for if else ret match
type struct union enum import const extern macro namespace
sizeof typeof alignof break continue noop len asm using with do unless
```

**Symbols:**
```
++ += + -- -= - *= * %= % /= /
&= && & ^ ^= <<= >>= << >> || |= |> |
== => = != ! >= <= <- < >
( ) { } [ ] ~ , ; _ :: ... ² ³ $ `
```

**Strings & Characters:**
String literals are delimited by double quotes (`"`) and can contain any number of characters, except unescaped double quotes. Character literals are delimited by single quotes (`'`) and can contain only a single character or an escape sequence. The following escape sequences are available:

| Literal characters | Interpreted character |
|--------------------|-----------------------|
| `\a` | Bell (alert) |
| `\b` | Backspace |
| `\f` | Form Feed |
| `\n` | Newline |
| `\r` | Carriage return |
| `\t` | Horizontal Tab |
| `\v` | Vertical Tab |
| `\'` | Single Quote |
| `\"` | Double Quote |
| `\\` | Backslash |
| `\0` | Null character |

**Numbers:**
An integer constant can be written as decimal, hexadecimal, binary or octal. Floating-point constants can only be written in decimal and must include a radix point. Any number literal may contain underscores `_` as visual separators.
Examples of valid numerical constants are the following:
```
99 1 1_000_000
0xff 0xabcdef 0x3d
0o77 0o7234 0o0
0b11 0b10010 0b11011
1.2 3.1415 1_234.567
```

**Comments:**
Comments can be either single-line or blocks. Single-line comments start with a hash sign `#` and extend to the end of the line, while block comments start with `#[` and run until `]#`.

### 5.2 - Variables, Members and Arguments

**Variables:**
Like most programming languages, CSpydr supports variables, which are named storage areas that can be manipulated. Each variable has its own type and can be assigned or referenced anywhere in the same or a deeper scope after its declaration. The name of the variable must be an identifier that is not also a type (see [5.1 - Lexical Conventions](https://github.com/Spydr06/CSpydr/wiki/5---The-Language#51---lexical-conventions)).

Variables can be declared with or without an initial value and type using the `let` or `const` keyword:
```python
let foo = 0;   # initialization with value and implicit type
let bar: i64;  # initialization without value and explicit type
const baz = 3; # initialization using `const`, making the variable immutable
```

**Arguments:**
Arguments are local variables defined in a function or lambda declaration. They are used for passing values into the function or lambda:
```rust
fn foo(bar: i32, baz: &char) {}
```

**Members:**
Members are named memory offsets in structs and unions:
```python
struct {
    foo: i32,  # offset 0 bytes
    bar: f64,  # offset 4 bytes
    baz: &char # offset 12 bytes
}
```

### 5.3 - Statements

Like almost any procedural language, CSpydr has statements. Statements are constructs used for defining control flow and behaviour. These are all statements in CSpydr:

#### 5.3.1 - Blocks

Blocks group multiple statements together while also creating a new scope for variables.
Like in C, they start with `{` and end with `}`:
```python
{
    # statements here
}
```

Blocks can contain `using` statements, which pull functions and variables from a given namespace into the current scope, so that their names can be written more concisely.
```py
using std::io;
puts("..."); # we can now use puts() defined in std::io, rather than std::io::puts()
```

#### 5.3.2 - Loops

CSpydr has three loop types: `loop` for an endless loop, `while` for loops with a single exit condition and `for`-loops for more complex loop behaviours.

**loop**
```py
loop {
    # statement gets repeated endlessly
}
```

**while**
```py
while condition {
    # statement gets repeated until `condition` is false (0)
}
```

**for**
For-loops consist of three parts, separated by semicolons `;`: the initializer statement, the condition expression and the counter expression:
```
for init; condition; counter; {
}
```
The `init` statement runs once at the beginning, whereas `counter` runs after every iteration. The loop iterates as long as `condition` is true (not 0).

**do-while:**
The do-while loop is semantically equivalent to C's `do {} while();`. It executes the `do`-block first and then repeats it as long as `condition` is true.
```rust
do {
} while condition;
```

#### 5.3.3 - Control Flow

CSpydr offers the following control statements:
```
if
match
ret
break
continue
with
asm
defer
do-unless
do-while
```
These statements can change the flow of programs, create new branches or merge them. Statements that don't end in `}` require a semicolon `;` after them.

**if:**
If statements execute a statement if a given condition is true (not 0). If the condition is false (0), the statement after `else` is executed instead. Defining `else` is optional.
```py
if condition {
    # if condition is not 0, execute this
}
else {
    # if condition is 0, execute this (not necessary)
}
```

**do-unless:**
The do-unless statement provides a reversed-order notation compared to the if-statement.
It is generally used for returning early from functions and follows this scheme:
```ruby
do {
    # do this unless "condition" is true
} unless condition;
```

**match:**
Match statements work similarly to C's `switch` statement. They are used when the program branches in many directions depending on a value. In the match statement, you define different "cases", each with its own value to compare against. `_` is the default case and gets executed if no other case matched the condition.
```py
match condition {
    0 => # if condition is 0, execute this
    1 => # if condition is 1, execute this
    _ => # if none of these conditions match, execute this
}
```

**ret:**
The return-statement (`ret`) exits the current function or lambda and returns a value from it. It can be written in two different ways, using `ret` or `<-`:
```py
ret 0; # return 0
<- 1;  # return 1
```

**break:**
The break-statement breaks out of the current loop.
```
break;
```

**continue:**
The continue-statement starts the next loop iteration immediately.
```
continue;
```

**with:**
The with-statement can be used in conjunction with IO types for easy error handling and memory management. It assigns a given value to a variable and checks that the value is not 0. After the body has run, it executes a predefined function for freeing memory or similar cleanup. The exit-function for a given type can be set using the `exit_fn` compiler directive: `[exit_fn("your_exit_function": your_type)]`.
Just like if-statements, the with-statement can be followed by `else`, which is executed if the variable assignment was not successful.
```py
with ptr = alloc(10) {
    # do something with ptr, it will be automatically freed.
}
else {
    # if memory allocation failed, execute this (not necessary)
}
```

**asm:**
The asm-statement can inject inline x86_64 GNU assembly. This statement is mostly used for low-level functions like system calls and can be *highly unsafe*.
The body of the assembly statement consists of strings, but can also contain integer literals or variable identifiers, whose values or addresses are substituted automatically.
```rust
asm "mov %rax, " foo;
```

**defer:**
The defer-statement moves the code written in its body to the end of the current block.
> Currently, this is only done at compile-time, so it won't respect loops or any other control flow.

```
{
    defer {
        # move statements in here to the end of the surrounding block statement.
        std::io::printf("A");
    }
    std::io::printf("B");
}
```

#### 5.3.4 - Expression Statements

Some expressions can be used like statements. These expressions include:
- function calls
- assignments
- `++` and `--`
- pipes

#### 5.3.5 - Variable Declarations

Local variables can be declared in any block statement using the `let` or `const` keyword. Variables defined with `const` are immutable.
```rust
let x = 0;
let y: i32 = 0;
let z: &char;
```

### 5.4 - Expressions

CSpydr features many expressions. Expressions are used to fetch, evaluate, modify and store data. They mostly behave like mathematical operators, variables and constants.

#### 5.4.1 - Literals

Literals form the most basic expressions of CSpydr. They represent constant values of their type.

**boolean literals:**
Boolean literals are the simplest kind of CSpydr's literals: they represent a value of type `bool` (1-byte integer) and can be either
```rb
true # 1
```
or
```rb
false # 0
```

**integer and float literals:**
Integer and float literals represent constant (non-changing) number values in code. Here are some valid examples:
```py
# integer literals
99 1 1_000_000
0xff 0xabcdef 0x3d
0o77 0o7234 0o0
0b11 0b10010 0b11011

# float literals:
1.2 3.1415 1_234.567
```
Integer literals have the default type `i32`, float literals the default type `f32`.
To use these literals with other types, use type casting:
```rust
(0xff: u8)
(3.1415: f64)
```

**character literals:**
Character literals are a special form of integer literals, representing a single `char` (1-byte integer) value in a more readable, ASCII-decoded form. Character literals are delimited by `'` and can contain [escape codes](#51---lexical-conventions).
```py
'a'  # ASCII "a" value (decimal value 97)
'A'  # "A" (decimal 65)
'0'  # "0" (decimal 48, NOT 0)
'\n' # new-line character using an escape code
```

**string literals:**
String literals represent an *immutable* array of characters in ASCII-decoded form. They are delimited by `"` and can contain [escape codes](#51---lexical-conventions) as well. The last character, `'\0'` (null character), is appended implicitly. The length of string values can be calculated using the `len` operator.
```py
"hello"                   # simple string literal with length 6 (5 letters + 1 null character)
"first line\nsecond line" # longer string with a line break using escape-codes
```

**nil literal:**
The `nil` literal represents a pointer to address 0, indicating a non-existent or missing value. It has the type `&void`.

**array literals:**
Array literals provide an easy way of initializing arrays. The array type is inferred from its first element. All following elements are cast to this type.
> ***Note:** Array literals represent CSpydr-style arrays, not C-like arrays!*

```py
[0, 1, 2, 3, 5.5] # represents an array of type i32[5]. The last value `5.5` will be cast to an integer.
```

**struct literals:**
Struct literals provide an easy way of initializing structs. They come in two flavours:

1. Explicitly-typed:
Explicitly-typed struct literals are used to create values of a known type. This allows better type checking and is generally preferred over the second flavour.
```
type X: struct {
    a: i32,
    b: i32,
    c: i32
}

X::{2, 3.3, 4} # represents a value of struct `X`, whose members are three integers.
               # The second argument will be cast accordingly.
```

2. Implicitly-typed:
Implicitly-typed struct literals are used to create structs which don't have a typedef'ed alias. They are generally discouraged because they are harder to read and less type-safe.
```
{2, 3.3, 4} # represents a value of an anonymous struct with two integer and one float-typed members.
            # The second argument will not be cast to `i32`!
```

#### 5.4.2 - Operator Precedence

Operator precedence describes how strongly an operator binds its operands, and thereby the order in which operations are executed: an operator with higher precedence is executed before operators with lower precedence. CSpydr has the following precedence table (the higher the precedence, the earlier the operator is executed):

| Precedence | Operator | Name |
|------------|----------|------------|
| 1 | `=` | Assignment |
| 2 | `\|>` | Pipe |
| 3 | `\|\|` | Logical OR |
| 4 | `&&` | Logical AND |
| 5 | `` `ident` `` | Infix function call |
| 6 | `:` | Type cast |
| 7 | `\|` | Bitwise OR |
| 8 | `^` | Bitwise XOR |
| 9 | `&` | Bitwise AND |
| 10 | `==` | Equals |
| 11 | `<`, `<=` | Less than (or equals) |
| 12 | `>`, `>=` | Greater than (or equals) |
| 13 | `<<`, `>>` | Left & Right Bitshift |
| 14 | `+`, `-` | Plus, Minus |
| 15 | `*`, `/` | Multiplication, Division |
| 16 | `%` | Modulo |
| 17 | `-` (unary), `~` (unary), `&` (unary), `*` (unary) | (unary) Minus, Bitwise Negation, Reference, Dereference |
| 18 | `²` | Power |
| 19 | `--`, `++` | Decrement, Increment |
| 20 | `alignof`, `sizeof`, `typeof` | Align of, Size of, Type of |
| 21 | `foo()` | Function call |
| 22 | `foo[x]` | Index |
| 23 | `foo.bar` | Member |

#### 5.4.3 - Infix Operators

A binary expression contains two operands separated by one operator. All binary operators have left-to-right associativity, but not all binary operators have the same precedence. See the precedence table above for more information.
```py
a + b          # addition
a - b          # subtraction
a * b          # multiplication
a / b          # division
a % b          # modulo

a == b         # equality
a != b         # inequality
a > b, a >= b  # greater than (or equal)
a < b, a <= b  # less than (or equal)
a && b         # logical AND
a || b         # logical OR

a << b, a >> b # bit shift (left/right)
a & b          # bitwise AND
a ^ b          # bitwise XOR
a | b          # bitwise OR

a = b          # assignment
```

CSpydr's binary operators can be divided into four groups:
1. The **mathematical** operators `+`, `-`, `*`, `/`, `%` work on numbers of any type (except `%`, which works on integers only). These operators are used in mathematical equations and calculations.
2. The **logical** operators `==`, `!=`, `>`, `>=`, `<`, `<=`, `&&` and `||` are used to compare two values with each other: they return `true` if the relation expressed by the operator holds and `false` otherwise.
3. The **bitwise** operators `<<`, `>>`, `&`, `|` and `^` allow modifying individual bits of integer values. See [Bitwise operation (Wikipedia)](https://en.wikipedia.org/wiki/Bitwise_operation) for more information.
4. The assignment (`=`) operator sets the value of the left operand to the value of the right operand.

#### 5.4.4 - Unary Operators

Unary operators take one argument **after** the operator.
```py
-a # negation
!a # logical NOT
~a # bitwise NOT
&a # reference operator
*a # dereference operator
```
1. The negation (`-`) operator is meant to be used on integers and floats and makes positive values negative and negative values positive.
2. The logical not (`!`) operator returns `false` if the argument is truthy and `true` if the argument is falsy.
3. The bitwise not (`~`) operator inverts every bit of the argument. An integer like `0b01101011` gets converted to `0b10010100`.
4. The memory operators `&` and `*` are used to get the memory address of a value or to get the value stored at an address (pointer).

#### 5.4.5 - Postfix Operators

Postfix operators take one argument **before** the operator.
```py
a++ # increment
a-- # decrement
```
The only two postfix operators are increment and decrement, which increase or decrease a given value by `1` while returning the value it had before the operation. These operators are commonly used in loops and other repeating code structures.
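
The "returns the value before" behaviour is easiest to see when the result of a postfix operator is stored. The following is a minimal sketch that uses only syntax described in this section; `demo` is a hypothetical function name, the snippet has not been compiled, and the values in the comments follow from the description above rather than from a test run.

```py
fn demo() {
    let a = 5;
    let b = a++; # b receives the old value 5, a is now 6
    let c = a--; # c receives the old value 6, a is back to 5

    # `i++` used as the counter expression of a for-loop
    for let i = 0; i < 3; i++; {
        # the body runs with i = 0, 1 and 2
    }
}
```

In the loop, the value returned by `i++` is discarded, so only its side effect on `i` matters there.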