Skip to content

Latest commit

 

History

History
326 lines (303 loc) · 13.9 KB

TSLang.md

File metadata and controls

326 lines (303 loc) · 13.9 KB

TSLang

The goal of this file is to document how to begin writing Sifter code. It is specifically directed towards understanding the high-level 'TSLang' interface, which is used to write the analogy-making rules (mapper.py) as well as the application demos (examples/...).

High-Level View

At the highest level, TSLang programs are programs that define and operate on a particular type of data structure, a triplet structure. Most TSLang programs are separated into two stages:

  1. An initial triplet structure is defined with the data of interest, then
  2. The initial structure is then iteratively modified by appling pre-defined update rules.

TSLang provides syntactic sugar to help:

  1. Define triplet structures (ts_lib),
  2. Define rules operating on triplet structures (ts_lib, ts_utils), and
  3. Write tactics determining which rules to apply in what order (runtime and tactic_utils).

We will address each of these in turn.

(0) What are Triplet Structures?

A triplet structure is a particular type of data structure suited to representing knowledge about the world.

At its core, a triplet structure consists of exactly two things:

  1. A list of nodes and
  2. A list of triplet facts, which are 3-tuples of nodes.

To give some meaning to the structure, we adopt the convention that a triplet fact (A, B, C) should be interpreted like so:

  1. A is represents a particular logical fact,
  2. B represents an instance of something,
  3. C represents the type, or role B plays in the fact.

For example, we might want to assert that Homer is the Father and Marge is the Mother of Lisa with the facts:

(FamilyFact1, Homer, Father),
(FamilyFact1, Marge, Mother),
(FamilyFact1, Lisa, Daughter)

We think of FamilyFact1 as the unified 'thought' or 'fact node' representing the fact that Homer, Marge, and Lisa together make an instance of the "Family" type, with the given roles.

For more discussion of triplet structures and how to represent logical information with them, please see our Onward paper.

(1) Representing Triplet Structures with TSLang

A triplet structure is represented by an instance of the TripletStructure class. Most programs will only have one instance of TripletStructure, which we will conventionally name ts:

from ts_lib import TripletStructure
ts = TripletStructure()

Nodes in TSLang each have a unique, string name. With few exceptions, all names should start with /:. Nodes are referenced by indexing notation, and are automatically created upon reference if they do not yet exist:

# Creates nodes '/:Homer' and '/:Marge'
ts["/:Homer"], ts["/:Marge"]

To add a fact (A, B, C) we use the notation ts[A].map({ts[B]: ts[C]}), like so:

# Adds fact ('/:FamilyFact1', '/:Homer', '/:Father')
ts["/:FamilyFact1"].map({ts["/:Homer"]: ts["/:Father"]})

Multiple facts with the same fact node can be expressed naturally as well:

# Adds 3 facts.
ts["/:FamilyFact1"].map({
    ts["/:Homer"]: ts["/:Father"],
    ts["/:Marge"]: ts["/:Mother"],
    ts["/:Lisa"]: ts["/:Daughter"],
})

To prevent name collisions and to better enable the usage of macros to automatically manipulate the structure, we have support for scopes. A scope is simply a prefix of a node name, up to (but not including) a :. Intuitively we can think of node names as paths, with : delimiting directory boundaries, and scopes playing the role of directories. For example, we might want to place Father, Mother, and Daughter all in the /:Family scope, and Homer, Marge, and Lisa in the Simpsons scope:

ts["/:FamilyFact1"].map({
    ts["/:Simpsons:Homer"]: ts["/:Family:Father"],
    ts["/:Simpsons:Marge"]: ts["/:Family:Mother"],
    ts["/:Simpsons:Lisa"]: ts["/:Family:Daughter"],
})

In fact, TSLang has first-class support for scopes. By default, ts[...] always indexes relative to its current scope, which can be changed using with ts.scope(...):. The above is equivalent, for example, to: equivalent to:

# family[":..."] is equivalent to ts["/:Family:..."]
family = ts.scope("/:Family")
with ts.scope("/:Simpsons"):
    # In this block, ts[":..."] is automatically prepended with "/:Simpsons"
    # while ts["/:..."] is taken as an absolute path.
    ts["/:FamilyFact1"].map({
        ts[":Homer"]: family[":Father"],
        ts[":Marge"]: family[":Mother"],
        ts[":Lisa"]: family[":Daughter"],
    })
# Outside the with block, ts[":..."] again refers to ts["/:..."].

Scopes are generally used to group logically-related nodes together, especially nodes which should be treated in a particular way by some macro (as we will see later in this document).

(2) Writing Update Rules to Operate on Triplet Structures

An update rule is a program that searches for a pattern in the triplet structure and then makes some modification to the structure according to that pattern. With TSLang, we define update rules within the structure itself. We accomplish this by adding the pattern to search for as part of the structure, and adding extra facts which annotate which parts of the pattern need to be searched for or inserted/removed once an assignment is found.

The below example demonstrates this with a rule expressing transitivity of the 'greater than' relation:

with ts.scope(":TransitivityRule"):
    # First we describe the pattern we want to search for:
    ts[":AGreaterThanB"].map({
        ts[":A"]: ts["/:GreaterPair:Greater"],
        ts[":B"]: ts["/:GreaterPair:Lesser"],
    })
    ts[":BGreaterThanC"].map({
        ts[":B"]: ts["/:GreaterPair:Greater"],
        ts[":C"]: ts["/:GreaterPair:Lesser"],
    })
    # Then what we want to insert if that pattern is found:
    ts[":AGreaterThanC"].map({
        ts[":A"]: ts["/:GreaterPair:Greater"],
        ts[":C"]: ts["/:GreaterPair:Lesser"],
    })
    # Finally, we encode the details of the rule:
    ts[":RuleFact"].map({
        # /:TransitivityRule:_ will represent the rule as a whole.
        ts[":_"]: ts["/RULE"],
        # The facts implying A>B, B>C should already exist in the structure
        # before we apply this rule, hence we 'must map' them.
        ts[":AGreaterThanB"]: ts["/MUST_MAP"],
        ts[":BGreaterThanC"]: ts["/MUST_MAP"],
        ts[":A"]: ts["/MUST_MAP"],
        ts[":B"]: ts["/MUST_MAP"],
        ts[":C"]: ts["/MUST_MAP"],
        # If they are found, we then 'insert' the A>C node and associated
        # facts.
        ts[":AGreaterThanC"]: ts["/INSERT"],
    })

Note that the nodes /RULE, /MUST_MAP, and /INSERT do not start with /:. This indicates they are special nodes, which will be explicitly interpreted by the runtime. Also note that the nodes /:GreaterPair:Greater and /:GreaterPair:Lesser are not mentioned in the :RuleFact, which means they will be treated as constants in the corresponding pattern.

Note also that, in addition to /MUST_MAP, other quantifications are possible: NO_MAP(#) and TRY_MAP. In general, potential rule applications are discovered in three passes:

  1. First, we search for assignments to the constraints involving only the /MUST_MAP nodes.
  2. For each of those assignments, we check to ensure that they canNOT be extended to a satisfying assignment to the NO_MAP1, NO_MAP2, ... constraints. We throw away any assignment which can be extended to also satisfy the NO_MAP constraints.
  3. For any remaining assignments, we attempt to extend the assignment to also satisfy constraints involving the TRY_MAP nodes, if possible (otherwise the original assignment is used). Similarly, in addition to /INSERT there are also other actions possible:
  4. /REMOVE removes the node and all associated facts.
  5. /SUBTRACT removes only those facts explicitly mentioned by the rule (the node is removed if there are then no remaining facts).

Finally, note that by default we assume all of the /MUST_MAP nodes must have unique assignments. This can be explicitly weakened if desired to allow pattern matches to assign the same node to two different /MUST_MAP variables.

(3) Macros for Writing Update Rules

The above rule-declaration syntax can become tedious and error-prone if done by hand. To assist in this, two macros are provided in ts_utils.py which significantly improve the experience.

RegisterRule

The first, RegisterRule, is the most flexible. It works by putting nodes with the same role in the rule (e.g., /INSERT or /MUST_MAP) in the same scope. The rule from the previous example could be rewritten:

with ts.scope(":TransitivityRule"):
    with ts.scope(":MustMap") as existing:
        ts[":AGreaterThanB"].map({
            ts[":A"]: ts["/:GreaterPair:Greater"],
            ts[":B"]: ts["/:GreaterPair:Lesser"],
        })
        ts[":BGreaterThanC"].map({
            ts[":B"]: ts["/:GreaterPair:Greater"],
            ts[":C"]: ts["/:GreaterPair:Lesser"],
        })
    with ts.scope(":Insert"):
        ts[":AGreaterThanC"].map({
            existing[":A"]: ts["/:GreaterPair:Greater"],
            existing[":C"]: ts["/:GreaterPair:Lesser"],
        })
    RegisterRule(ts)

Notice that we need to explicitly refer to existing[":A"] in the second scope, as they are no longer all in the same scope.

RegisterPrototype

The second useful macro, RegisterPrototype, is useful when both:

  1. You only need /MUST_MAP and /INSERT nodes, and
  2. You want to define multiple rules which involve the same patterns.

An equivalent rule to the above is shown below:

with ts.scope(":TransitivityRule"):
    ts[":AGreaterThanB"].map({
        ts[":A"]: ts["/:GreaterPair:Greater"],
        ts[":B"]: ts["/:GreaterPair:Lesser"],
    })
    ts[":BGreaterThanC"].map({
        ts[":B"]: ts["/:GreaterPair:Greater"],
        ts[":C"]: ts["/:GreaterPair:Lesser"],
    })
    ts[":AGreaterThanC"].map({
        existing[":A"]: ts["/:GreaterPair:Greater"],
        existing[":C"]: ts["/:GreaterPair:Lesser"],
    })
    RegisterPrototype(ts, dict({
        ":_": {ts["/INSERT"]: [ts[":AGreaterThanC"]]},
    }))

In general, the second argument can define arbitrarily many rules using the exact same set of nodes. Each rule lists some subset of nodes which should be inserted (alternatively, mapped). The remaining nodes in the scope are assumed to be mapped (alternatively, inserted). This feature is used extensively in mapper.py, as all of the mapper rules are essentially just different 'shadings' of the same core pattern (see our Onward paper for more details).

(2,3b) Update Rule Gotchas: Inserting Only a Triplet Fact

While our rule-declaration syntax is usually quite natural, there is one particular case where care needs to be taken. Namely, when the rule needs to operate on individual facts about existing nodes. For example, suppose you want to search for nodes A, B with a fact (A, A, B), and insert a new fact (B, B, A). you might try to do the following:

with ts.scope(":BadRule"):
    with ts.scope(":MustMap") as exist:
        ts[":A"].map({ts[":A"]: ts[":B"]})
    with ts.scope(":Insert"):
        exist[":B"].map({exist[":B"]: exist[":A"]})
    RegisterRule(ts)

The problem is that the /:BadRule:Insert scope is actually empty --- the line there only refers to nodes in the /:BadRule:MustMap scope. Thus the underlying rule created will only have /MUST_MAP nodes, not /INSERT, and so the rule is effectively a no-op. To get around this, we need to ensure that at least one of the nodes of each fact we want to insert actually belongs to the :Insert scope:

with ts.scope(":BadRule"):
    with ts.scope(":MustMap") as exist:
        ts[":A"].map({ts[":A"]: ts[":B"]})
    with ts.scope(":Insert"):
        ts[":B"].map({ts[":B"]: exist[":A"]})
    RegisterRule(ts)

But now it will create an entirely new node, effectively B2, and fact (B2, B, A)! To resolve this, we need to explicitly tell the system that :BadRule:Insert:B refers to the same node as :BadRule:MustMap:B. This can be done using the AssertNodesEqual macro in ts_utils.py:

with ts.scope(":BadRule"):
    with ts.scope(":MustMap") as exist:
        ts[":A"].map({ts[":A"]: ts[":B"]})
    with ts.scope(":Insert") as insert:
        ts[":B"].map({ts[":B"]: exist[":A"]})
    RegisterRule(ts)
    AssertNodesEqual(ts, [exist[":B"], insert[":B"]], "/:BadRule")

In fact, if the nodes in question end in the same name (after ignoring the :MustMap or :Insert scopes), as :MustMap:B and :Insert:B do, then RegisterRule can automatically assert their equality using the auto_assert_equal option:

with ts.scope(":BadRule"):
    with ts.scope(":MustMap") as exist:
        ts[":A"].map({ts[":A"]: ts[":B"]})
    with ts.scope(":Insert"):
        ts[":B"].map({ts[":B"]: exist[":A"]})
    RegisterRule(ts, auto_assert_equal=True)

(4) Applying Update Rules

After declaring the TripletStructure with some initial facts and update rules, we will initialize a new TSRuntime instance. The runtime will extract the rules we declared earlier and provide an interface to actually modify the structure using those rules. We initialize a TSRuntime like so:

from runtime.runtime import TSRuntime

# ... initializing tc ...
rt = TSRuntime(ts)

rt exposes a somewhat lower-level API for applying rules: rt.get_rule(rule_name) returns a representation of the desired update rule. Given an update rule, rt.propose(rule) will yield possible assignments to the rule of the form (assignment, delta). assignment describes the nodes satisfying the rule's pattern while delta describes the modification to the structure which should occur according to the rule.

(5) Tactics for Applying Update Rules

tactic_utils.py defines a number of helpful functions, particularly Fixedpoint, for repeatedly applying a rule to the structure.

(6) Recording Changes to the Structure

To assist with backtracking search, there are a number of tools available to checkpoint and rollback the state of a structure, as well as record changes. These are described in docstrings in ts_lib.py.