MrAndMrsK edited this page Oct 25, 2014 · 2 revisions

Project Motivation

This project is related to the meta-programming spirit of FunSharp, WebSharper, the official F# SQL Type Provider, and LINQ-to-SQL. These projects use code quotations or expression trees so that calculations and commands can be described with the type safety of one language (F#/C#) and then translated and executed in a remote process (a JavaScript engine/SQL Server) in a different language (JavaScript/SQL) with weaker safety mechanisms or clunkier syntax and maintainability.

The motivations for this project are related to Object-Relational-Mappers and Domain Driven Design. I've tried various ORMs including Entity Framework through the versions, now 6 and peeks at 7. The querying and auto-mapper features of ORMs, I admit, can be rather convenient. But I've consistently felt an aversion to the change-tracking and updating mechanisms. I hate (hate hate hate!) that in order to do an UPDATE, you generally need to SELECT first - incurring too many reads, transmitting too much data in both directions, overhead of the object mapping, and holding locks for the duration.

Similarly, DDD seems great when just dreaming about it, but I have yet to see a convincing/clean implementation for a highly inter-connected domain with multiple bounded contexts and centralized, maintainable, and consistently-enforced business rules. As I understand it, DDD is all about the consistent enforcement of business rules, "domain invariants", or "domain assertions". In order to enforce these invariants, all possibly relevant data must be available to the process involved in evaluating them. DDD's bounded contexts are a work-around against having to load all of the data into the application layer for such evaluations. The rules have to be written from the perspective of a bounded context (not the whole domain), must not explore paths beyond the bounds of the context, and must be maintained symmetrically with the rules for each other bounded context in which the represented domain rule might be relevant. There are even certain rules (uniqueness constraints, among others) which would be impractical to attempt to enforce in the application.

If all of the relevant data were available locally to the application process, then the bounded contexts would no longer be necessary and each domain rule could be stated only once, as a quantifier over the entire domain.

Let us avoid ORMs for many operations. Let us support compiler-enforced type safety AND domain invariant enforcement through automatic analysis and translation to native SQL that is executed remotely. User/domain-specific code will be written in F#, registering arbitrary domain assertions and, separately, describing arbitrary operations on the domain. Both assertions and operations would be type-checked by the F# compiler against models mirroring the RDBMS table records. At runtime, the quotations of the registered assertions would be compared against the quotation of any particular operation for relevance. Wrapped in a transaction, the operation and any relevant assertions would be translated to T-SQL and transmitted for remote execution. The operation would be executed, failing first perhaps on any native constraints and then perhaps on any appended assertions. All of the emitted code could be sent as ADO command text, or it could be optimized (in whole or in part) as one or more generated stored procedures.

Assertions which are discovered through runtime analysis to be unrelated to a particular operation could be omitted. For example, an invariant referring only to tables B and C is unrelated to a current operation’s UPDATE of a in A. As a second example, an invariant regarding a foreign relationship from C to B is not relevant to an operation's INSERT of b into B, since no c in C could yet refer to b anyway.
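
A minimal sketch of the coarsest form of this relevance check, assuming that tables appear as properties of a domain type: walk each quotation, collect the table properties it reads, and treat an assertion as relevant only if it shares a table with the operation. The `Row` and `Db` types and the `tablesIn` and `isRelevant` helpers are hypothetical names made up for this illustration; they are not part of the project.

```fsharp
open Microsoft.FSharp.Quotations
open Microsoft.FSharp.Quotations.Patterns
open Microsoft.FSharp.Quotations.ExprShape

// Hypothetical schema stand-ins (not from the project); never instantiated.
type Row = { ID : int64 }

[<AbstractClass>]
type Db =
    abstract member A : seq<Row>
    abstract member B : seq<Row>
    abstract member C : seq<Row>

// Names of all Db properties (tables) read anywhere inside a quotation.
let rec tablesIn (e : Expr) : Set<string> =
    match e with
    | PropertyGet (Some target, prop, args) when target.Type = typeof<Db> ->
        (target :: args)
        |> List.map tablesIn
        |> Set.unionMany
        |> Set.add prop.Name
    | ShapeVar _ -> Set.empty
    | ShapeLambda (_, body) -> tablesIn body
    | ShapeCombination (_, args) -> args |> List.map tablesIn |> Set.unionMany

// Coarse relevance test: the assertion and the operation share a table.
let isRelevant (assertion : Expr) (operation : Expr) =
    not (Set.isEmpty (Set.intersect (tablesIn assertion) (tablesIn operation)))

// An invariant over B and C, and two operations touching A and B respectively.
let invariantBC =
    <@@ fun (d : Db) ->
            d.B |> Seq.exists (fun b -> d.C |> Seq.exists (fun c -> c.ID = b.ID)) @@>
let touchesA = <@@ fun (d : Db) -> d.A |> Seq.filter (fun a -> a.ID = 1L) @@>
let touchesB = <@@ fun (d : Db) -> d.B |> Seq.length @@>

printfn "%b" (isRelevant invariantBC touchesA)   // false
printfn "%b" (isRelevant invariantBC touchesB)   // true
```

This only covers the first example above. The second example (an INSERT into B cannot break a foreign relationship from C to B) needs knowledge of the kind of operation and of the direction of the relationship, which a table-overlap test does not capture.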

The relevant assertions can also be optimized: Those assertions, specified in the generalized form over set relations (“for all …, …” including joins, grouping, aggregation, sub-queries, etc.) would be automatically narrowed to check only elements of the sets that might be affected by the operation. That is, for an operation that modifies a single element (keyed as) k within T, assertions which include T could be narrowed with a WHERE clause (or JOIN condition) to check that subset instead of wastefully rechecking all elements in T. Such narrowing should also work even when multiple elements are involved, for example through an UPDATE that is restricted but not to a primary key value.
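
The effect of narrowing can be illustrated in memory, as a sketch rather than as the intended T-SQL translation. The `PersonRow` type and the `violations`, `checkAll`, and `checkNarrowed` functions below are hypothetical; the `scope` predicate plays the role of the added WHERE clause. The narrowed check agrees with the full check only under the assumption that the invariant held before the operation ran.

```fsharp
open System

// Hypothetical in-memory stand-in for a People table row.
type PersonRow = { ID : int64; ParentID : int64 option; Birthdate : DateTime }

// (parent, child) pairs violating "a parent is older than its child",
// restricted to the pairs accepted by `scope` (the analogue of an added WHERE clause).
let violations (scope : PersonRow -> PersonRow -> bool) (people : PersonRow list) =
    [ for parent in people do
        for child in people do
            let related = child.ParentID = Some parent.ID
            if related && scope parent child && child.Birthdate <= parent.Birthdate then
                yield (parent.ID, child.ID) ]

// General form: every parent/child pair is rechecked.
let checkAll people = violations (fun _ _ -> true) people

// Narrowed form: only pairs in which the modified row k takes part.
let checkNarrowed (k : int64) people =
    violations (fun parent child -> parent.ID = k || child.ID = k) people

let day (y : int) (m : int) (d : int) = DateTime(y, m, d)

// State before the operation: the invariant holds for every pair.
let before =
    [ { ID = 1L; ParentID = None;    Birthdate = day 1950 3 7 }
      { ID = 2L; ParentID = Some 1L; Birthdate = day 1980 3 11 }
      { ID = 3L; ParentID = Some 1L; Birthdate = day 1985 12 14 }
      { ID = 4L; ParentID = Some 2L; Birthdate = day 2010 6 1 } ]

// The operation: change the birthdate of the single row keyed 2.
let after =
    before
    |> List.map (fun p -> if p.ID = 2L then { p with Birthdate = day 2015 1 1 } else p)

printfn "%A" (checkAll after)          // [(2L, 4L)]
printfn "%A" (checkNarrowed 2L after)  // [(2L, 4L)]
```

Both checks report the same violating pair, but the narrowed one only inspects pairs that involve the modified row, both as a parent and as a child.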

The F# types which model the data need not ever be actually instantiated – they exist for type-checking the code only, but the operations happen remotely; though perhaps the models could work double-duty for LINQ and auto-mapping when data actually has to be delivered to the UI in a more familiar query operation.

This setup could put much less strain on SQL Server in terms of reducing unnecessary reads AND transaction durations. The application server’s code could be vastly simplified, with more correctness and higher performance, and could still scale horizontally.

The Erasing SQL Type Provider seemed to be on the trajectory of this vision, but based on some recent blog entries, I’m led to believe that its recent implementation of UPDATEs still occurs in the application layer.

Here is a mockup of what I hope to use, modeling some asexual race's family lineage. The model is small but has a foreign key onto itself, which helps illustrate many complications at once in a fairly small example.

open System.Linq
open Microsoft.FSharp.Quotations
open Microsoft.FSharp.Quotations.DerivedPatterns
open Microsoft.FSharp.Quotations.Patterns
open Microsoft.FSharp.Quotations.ExprShape
 
// This type is a stand-in for the schema to be encountered on the server
// I don't know if an existing Set-like type should be used or some new type should represent tables
type Table<'T> private () =
    member this.Add : 'T -> 'T = failwith "never really executed" 
    interface IQueryable<'T> with
        member this.ElementType: System.Type = failwith "never really executed" 
        member this.Expression: Expressions.Expression = failwith "never really executed"    
        member this.GetEnumerator(): System.Collections.IEnumerator = failwith "never really executed"      
        member this.GetEnumerator(): 'T System.Collections.Generic.IEnumerator = failwith "never really executed"    
        member this.Provider: IQueryProvider = failwith "never really executed" 
 
// Extracts the F# expression tree from the quotation and [<ReflectedDefinition>]
let BodyOf op =
    match op with
    | Patterns.Call(None, DerivedPatterns.MethodWithReflectedDefinition body, args) -> body
    | Patterns.Lambda (a, Patterns.Call(None, DerivedPatterns.MethodWithReflectedDefinition body, args)) -> body
    | Patterns.Lambda (a, Lambda (b, Patterns.Call(None, DerivedPatterns.MethodWithReflectedDefinition body, args))) -> body
    | e -> e     
 
// Models below - consists right now of a single Person model with a 
// self-reference - presumably mapped to some SQL schema.
// Then an ethereal Domain model to provide an idea of the database itself
// These types are never instantiated; they exist only for type-checking.
 
type PersonKey = 
    | Auto                  // Unassigned
    | PersonKey of int64    // Can be composite keys
and Person = {
    ID : PersonKey
    mutable Name : string
    mutable ParentID : PersonKey option
    mutable Birthdate : System.DateTime }
 
[<AbstractClass>] 
type Domain =
    abstract member People : Person Table
 
// These are the invariants on the domain
// A result of `true` indicates a violation of the constraint
[<ReflectedDefinition>]
module businessRules =
    // uniqueness constraint - hopefully redundant with a PRIMARY KEY constraint
    // (grouped by ID, because a self-join on ID would match every row with itself)
    let People_Unique (d : Domain) = query { 
        for person in d.People do
        groupBy person.ID into g
        exists (g.Count() > 1) }
 
    // referential integrity - hopefully redundant with a FOREIGN KEY constraint
    let People_ParentExists (d : Domain) = query { 
        for child in d.People do
        where (child.ParentID.IsSome)
        leftOuterJoin parent in d.People on (child.ParentID = Some parent.ID) into parents
        for parent in parents do
        where (box parent = null)
        exists (true) }
 
    let People_ValidName (d : Domain) = query { 
        for person in d.People do
        exists (person.Name.Trim() = "") }
 
    let People_OlderParent (d : Domain) = query { 
        for parent in d.People do
        join child in d.People on (Some parent.ID = child.ParentID)
        exists (child.Birthdate <= parent.Birthdate) }
 
// various operations ...
 
[<ReflectedDefinition>]
let myDad (d : Domain) = // : PersonKey
    d.People.Add({ ID = Auto; Name = "Alex"; ParentID = None; Birthdate = new System.DateTime(1950, 03, 07)}).ID
 
[<ReflectedDefinition>]
let me (d : Domain, parent : PersonKey) = // : PersonKey
    d.People.Add({ ID = Auto; Name = "Pat"; ParentID = Some parent; Birthdate = new System.DateTime(1980, 03, 11)}).ID
 
[<ReflectedDefinition>]
let sibling (d : Domain, parent: PersonKey) = // : PersonKey
    d.People.Add({ ID = Auto; Name = "Sam"; ParentID = Some parent; Birthdate = new System.DateTime(1985, 12, 14)}).ID
 
[<ReflectedDefinition>]
let operation3 (d : Domain) pid = // : unit
    let person = d.People.Where(fun p -> p.ID = pid).Single() // or the query syntax
    person.Name <- "Terry"
    person.Birthdate <- new System.DateTime(1948, 04, 25)
 
// this just prints out the body's expression tree, but we'd instead send them for remote execution
printfn "%A" <| BodyOf <@@ myDad @@>
printfn "%A" <| BodyOf <@@ me @@>
printfn "%A" <| BodyOf <@@ operation3 @@>
 
(*
-- At runtime, the operations and the invariants are converted to t-sql.  
-- They are also analyzed for relevance of invariants to an operation.
-- `myDad` becomes t-sql with assertions appended:
 
    BEGIN TRAN
 
    DECLARE @__ScpID BIGINT
 
    INSERT INTO dbo.People (Name, Birthdate) VALUES (@p1, @p2)
 
    SELECT @__ScpID = SCOPE_IDENTITY()
 
    -- Relevant Assertions
    -- People_Unique - (natively checked by dbo.People.pk_People)
    -- People_ParentExists - (natively checked by dbo.People.fk_People_ParentPersonID)
    -- People_ValidName
    IF (EXISTS (SELECT 1 FROM dbo.People e1 WHERE LTRIM(RTRIM(e1.Name)) = '' AND e1.ID = @__ScpID)) BEGIN ROLLBACK; THROW ...; END
 
    -- Irrelevant Assertions
    -- People_OlderParent (with inserted item as e1)
    -- People_OlderParent (with inserted item as e2)
 
    COMMIT
*)

Here's the largely uninteresting output, so you don't feel compelled to run it:

Lambda (d,
        PropertyGet (Some (Application (PropertyGet (Some (PropertyGet (Some (d),
                                                                        People,
                                                                        [])),
                                                     Add, []),
                                        NewRecord (Person, NewUnionCase (Auto),
                                                   Value ("Alex"),
                                                   NewUnionCase (None),
                                                   NewObject (DateTime,
                                                              Value (1950),
                                                              Value (3),
                                                              Value (7))))), ID,
                     []))
Lambda (tupledArg,
        Let (d, TupleGet (tupledArg, 0),
             Let (parent, TupleGet (tupledArg, 1), Call (None, me, [d, parent]))))
Lambda (d,
        Lambda (pid,
                Let (person,
                     Call (None, Single,
                           [Call (None, Where,
                                  [PropertyGet (Some (d), People, []),
                                   Call (None, QuotationToLambdaExpression,
                                         [Quote (NewDelegate (Func`2, p,
                                                              Call (None,
                                                                    op_Equality,
                                                                    [PropertyGet (Some (p),
                                                                                  ID,
                                                                                  []),
                                                                     pid])))])])]),
                     Sequential (PropertySet (Some (person), Name,
                                              [Value ("Terry")]),
                                 PropertySet (Some (person), Birthdate,
                                              [NewObject (DateTime, Value (1948),
                                                          Value (4), Value (25))])))))