Capabilities

Berkus Decker edited this page Jul 9, 2017 · 2 revisions

Each userspace thread has an associated capability space (CapSpace) that contains the capabilities that the thread possesses, thereby governing which resources the thread can access.

Capabilities reside within kernel-managed objects known as CapNodes. A CapNode is a table of slots, each of which may contain a capability. A slot may hold a capability to a further CapNode, so CapNodes form a directed graph. Conceptually, a thread’s CapSpace is the portion of this directed graph that is reachable from the CapNode capability that serves as its CapSpace root.

A CapSpace address refers to an individual slot (in some CapNode in the CapSpace), which may or may not contain a capability. Threads refer to capabilities in their CapSpaces (e.g. when making system calls) using the address of the slot that holds the capability in question. An address in a CapSpace is the concatenation of the indices of the CapNode capabilities forming the path to the destination slot.
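The concatenation of indices can be illustrated with a small sketch. The helper name `make_address` and the `(index, radix)` path representation are hypothetical and exist only for illustration; the sketch also assumes each CapNode on the path contributes exactly as many bits as it needs to index its slots, and it ignores the guard values introduced later on this page.

```python
def make_address(path):
    """Concatenate slot indices into one address (illustrative only).

    `path` is a list of (index, radix) pairs, outermost CapNode first,
    where a CapNode of the given radix has 2**radix slots.
    Returns (address, number_of_address_bits).
    """
    addr = 0
    bits = 0
    for index, radix in path:
        if not 0 <= index < (1 << radix):
            raise ValueError(f"index {index} does not fit in {radix} bits")
        # Shift earlier indices up and append this index below them.
        addr = (addr << radix) | index
        bits += radix
    return addr, bits


# Slot 5 of a 16-slot root CapNode holds a capability to a second 16-slot
# CapNode; slot 3 of that second CapNode is the destination.
addr, bits = make_address([(5, 4), (3, 4)])
print(hex(addr), bits)  # 0x53 8
```

Read from the most significant end, the resulting address spells out the path: first index 5, then index 3.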

Capabilities can be copied and moved within CapSpaces, and also sent in messages. Furthermore, new capabilities can be minted from old ones with a subset of their rights. Vesper maintains a capability derivation tree (CDT) in which it tracks the relationship between these copied capabilities and the originals. The revoke method removes all capabilities (in all CapSpaces) that were derived from a selected capability. This mechanism can be used by servers to restore sole authority to an object they have made available to clients, or by managers of untyped memory to destroy the objects in that memory so it can be retyped.
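As a rough illustration of minting and revocation, the sketch below models the derivation tree in Python. The `Cap` class, its fields, and the `mint` and `revoke` functions are hypothetical names chosen for this example and are not Vesper's actual kernel interface; the only behaviour taken from the text is that minted capabilities carry a subset of the original's rights and that revoking a capability removes everything derived from it.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Cap:
    """Hypothetical capability: target object, rights, and CDT links."""
    obj: str
    rights: frozenset
    parent: "Cap | None" = None
    children: list = field(default_factory=list)
    live: bool = True


def mint(src, rights):
    """Derive a capability whose rights are a subset of the source's."""
    if not src.live:
        raise ValueError("source capability has been revoked")
    if not set(rights) <= src.rights:
        raise ValueError("minted rights must be a subset of the source rights")
    child = Cap(src.obj, frozenset(rights), parent=src)
    src.children.append(child)  # record the derivation in the tree
    return child


def revoke(cap):
    """Remove every capability derived from `cap`; `cap` itself survives."""
    removed = 0
    stack = list(cap.children)
    cap.children.clear()
    while stack:
        c = stack.pop()
        stack.extend(c.children)  # descendants of descendants go too
        c.children.clear()
        c.parent = None
        c.live = False
        removed += 1
    return removed


# A server hands a reduced-rights capability to a client, which delegates
# a further reduced one; revoking the original removes both.
server_cap = Cap("endpoint", frozenset({"read", "write", "grant"}))
client_cap = mint(server_cap, {"read", "write"})
delegated = mint(client_cap, {"read"})
removed = revoke(server_cap)  # removed == 2
```

After the final call only `server_cap` remains live, which corresponds to the server restoring sole authority over the object.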

Vesper requires the programmer to manage all in-kernel data structures, including CapSpaces, from userspace. This means that the userspace programmer is responsible for constructing CapSpaces as well as addressing capabilities within them. This chapter first discusses capability and CapSpace management, before discussing how capabilities are addressed within CapSpaces, i.e. how applications can refer to individual capabilities within their CapSpaces when invoking methods.

CapSpace Creation

CapSpaces are created by creating and manipulating CapNode objects. When creating a CapNode the user must specify the number of slots that it will have, and this determines the amount of memory that it will use. Each slot requires 16 bytes of physical memory and has the capacity to hold exactly one capability. Like any other object, a CapNode must be created by calling Untyped_Retype() on an appropriate amount of untyped memory. The caller must therefore have a capability to enough untyped memory as well as enough free capability slots available in existing CapNodes for the Untyped_Retype() invocation to succeed.
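The memory cost of a CapNode follows directly from the 16 bytes per slot stated above. The sketch below works through the arithmetic, assuming the power-of-two slot counts described under CapSpace Addressing. The function names are hypothetical, and any extra alignment or rounding that Untyped_Retype() might apply is not modelled.

```python
SLOT_BYTES = 16  # per-slot cost stated in the text


def capnode_bytes(radix):
    """Bytes needed for the slots of a CapNode with 2**radix slots."""
    if radix < 0:
        raise ValueError("radix must be non-negative")
    return (1 << radix) * SLOT_BYTES


def radix_for_slots(slots):
    """Smallest radix whose CapNode has at least `slots` slots."""
    if slots < 1:
        raise ValueError("a CapNode needs at least one slot")
    return (slots - 1).bit_length()


# A CapNode with 256 slots (radix 8) occupies 4096 bytes of slot storage.
print(capnode_bytes(8))         # 4096
# Needing 300 slots means rounding up to radix 9, i.e. 512 slots.
print(radix_for_slots(300))     # 9
print(capnode_bytes(radix_for_slots(300)))  # 8192
```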

CapSpace Addressing

When performing a system call, a thread specifies to the kernel the capability to be invoked by giving an address in its CapSpace. This address refers to the specific slot in the caller’s CapSpace that contains the capability to be invoked.

CapSpaces are designed to permit sparsity, and looking up a capability address must be efficient. Therefore, CapSpaces are implemented as guarded page tables.

A CapSpace is a directed graph of CapNode objects, and each CapNode is a table of slots, where each slot can either be empty or contain a capability, which may refer to another CapNode. The number of slots in a CapNode must be a power of two. A CapNode is said to have a radix, which is the power to which two is raised to give its size. That is, if a CapNode has 2**k slots, its radix is k. The kernel stores a capability to the root CapNode of each thread’s CapSpace in the thread’s TCB. Conceptually, a CapNode capability not only stores a reference to the CapNode to which it refers, but also carries a guard value.

Capability Address Lookup

Like a virtual memory address, a capability address is simply an integer. Rather than referring to a location in physical memory (as a virtual memory address does), a capability address refers to a capability slot. When looking up a capability address presented by a userspace thread, the kernel first consults the CapNode capability in the thread’s TCB that defines the root of the thread’s CapSpace. It then compares that capability’s guard value against the most significant bits of the capability address. If the two values differ, the lookup fails. Otherwise, the kernel uses the next most significant radix bits of the capability address as an index into the CapNode to which the CapNode capability refers. The slot s identified by these radix bits might contain another CapNode capability or something else (including nothing). If s contains a CapNode capability c and there are remaining bits (following the radix bits) in the capability address that have yet to be translated, the lookup process repeats, starting from c and using the remaining bits of the capability address. Otherwise, the lookup process terminates successfully; the capability address in question refers to the capability slot s.
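The lookup steps above can be sketched as a short loop. This is a model for illustration, not the kernel's code: the class names, the `LookupFault` exception, the 32-bit address width, and the explicit `guard_bits` field (the text does not say how the width of a guard is recorded) are all assumptions.

```python
from dataclasses import dataclass, field

WORD_BITS = 32  # assumed width of a capability address


class LookupFault(Exception):
    """Raised when an address cannot be resolved (hypothetical name)."""


@dataclass
class CapNode:
    radix: int
    slots: list = field(default_factory=list)

    def __post_init__(self):
        if not self.slots:
            self.slots = [None] * (1 << self.radix)  # 2**radix empty slots


@dataclass
class CapNodeCap:
    node: CapNode
    guard: int = 0
    guard_bits: int = 0  # assumed: number of address bits the guard covers


def lookup_slot(root, addr, bits=WORD_BITS):
    """Return (node, index) for the slot that `addr` refers to."""
    cap, remaining = root, bits
    while True:
        node = cap.node
        step = cap.guard_bits + node.radix
        if step == 0 or step > remaining:
            raise LookupFault("not enough address bits for this level")
        # Compare the guard with the most significant untranslated bits.
        remaining -= cap.guard_bits
        if (addr >> remaining) & ((1 << cap.guard_bits) - 1) != cap.guard:
            raise LookupFault("guard mismatch")
        # The next `radix` bits index into the CapNode.
        remaining -= node.radix
        index = (addr >> remaining) & ((1 << node.radix) - 1)
        target = node.slots[index]
        if remaining > 0 and isinstance(target, CapNodeCap):
            cap = target  # repeat from the CapNode capability in slot s
            continue
        return node, index


# Two levels: 4 guard bits + 12 index bits, then 8 guard bits + 8 index bits.
leaf = CapNode(radix=8)
leaf.slots[0x60] = "endpoint"
root_node = CapNode(radix=12)
root_node.slots[0x00F] = CapNodeCap(leaf, guard=0xAB, guard_bits=8)
root = CapNodeCap(root_node, guard=0x0, guard_bits=4)

node, index = lookup_slot(root, 0x000FAB60)
print(node.slots[index])  # endpoint
```

In the example address 0x000FAB60, the top 4 bits match the root guard, the next 12 bits (0x00F) select the slot holding the second-level CapNode capability, the next 8 bits (0xAB) match its guard, and the final 8 bits (0x60) select the destination slot.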

Addressing Capabilities

A capability address is stored in a Capability Pointer (abbreviated CapPtr), which is an unsigned integer variable. Capabilities are addressed in accordance with the translation algorithm described above. Two special cases involve addressing CapNode capabilities themselves and addressing a range of capability slots.

The translation algorithm described above will traverse CapNode capabilities while there are address bits remaining to be translated. Therefore, in order to address a CapNode capability, the user must supply not only a capability address but also specify the maximum number of bits of the capability address that are to be translated, called the depth limit.
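The effect of a depth limit can be sketched with a compact variant of the lookup loop. All names here are hypothetical, and the model folds each CapNode's slot table into the capability that refers to it to keep the example short. It also assumes that an address supplied with a depth limit of d bits occupies the d least significant bits of the integer, which the text does not specify.

```python
from dataclasses import dataclass


@dataclass
class CapNodeCap:
    slots: list          # slot table of the CapNode; len(slots) == 2**radix
    radix: int
    guard: int = 0
    guard_bits: int = 0  # assumed: number of address bits the guard covers


def resolve(root, addr, depth):
    """Translate at most `depth` bits of `addr`; return (slots, index)."""
    cap, remaining = root, depth
    while True:
        step = cap.guard_bits + cap.radix
        if step == 0 or step > remaining:
            raise ValueError("depth limit does not end on a slot boundary")
        remaining -= cap.guard_bits
        if (addr >> remaining) & ((1 << cap.guard_bits) - 1) != cap.guard:
            raise ValueError("guard mismatch")
        remaining -= cap.radix
        index = (addr >> remaining) & ((1 << cap.radix) - 1)
        target = cap.slots[index]
        # Stop once the depth limit is used up, even if the slot holds a
        # CapNode capability that could otherwise be traversed.
        if remaining == 0 or not isinstance(target, CapNodeCap):
            return cap.slots, index
        cap = target


leaf = CapNodeCap(slots=[None] * 16, radix=4)
leaf.slots[3] = "endpoint"
root = CapNodeCap(slots=[None] * 16, radix=4)
root.slots[5] = leaf

# Depth 8 translates both levels and reaches the endpoint's slot.
slots, index = resolve(root, 0x53, depth=8)
print(slots[index])  # endpoint

# Depth 4 stops after one level, addressing the CapNode capability itself.
slots, index = resolve(root, 0x5, depth=4)
print(slots[index] is leaf)  # True
```

Without the depth limit, the second call could not name slot 5 of the root CapNode, because the lookup would continue through the CapNode capability stored there.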

[snip graphical explanation]

In summary, to refer to any capability (or slot) in a CapSpace, the user must supply its address. When the capability might be a CapNode capability, the user must also supply a depth limit. To specify a range of capability slots, the user supplies a starting address and a window size.


Fluke.md

All references between low-level objects in Fluke are represented as kernel-mediated capabilities. Each primitive object type contains a number of “capability slots.” For example, each thread object contains an address space slot, which refers to the address space in which the thread is to run. Processes can store and manipulate individual capabilities using reference objects, which are kernel objects that hold a single capability of any type. System calls are provided to produce capabilities pointing to existing kernel objects, copy capabilities between reference objects and the special-purpose slots in other kernel objects, pass capabilities through IPC, and compare them against each other. A process only refers to a capability by the virtual address of the kernel object in which it is contained; it never has direct access to the capability itself.

Capabilities in our API provide the relativity property for cross-address-space references, such as references within a child process to objects implemented by its parent. Since only the kernel can access the actual contents of a capability, capabilities can be passed freely between arbitrary processes in our system, allowing communication to short-circuit the process hierarchy when appropriate.

A parent process can locate all of the capabilities contained in a child process, discover to which objects they point, and transparently make substitutions of these capabilities, analogous to “pointer swizzling”.

Capabilities also provide border control. Since a parent process determines what capabilities it passes to its child, it can interpose on any of these capabilities, as well as on capabilities subsequently passed into or out of the child through communication channels on which the parent has interposed. This way, the parent can completely monitor and control all communication into or out of a process by intercepting IPC messages. However, the capability model also allows the parent to interpose selectively on only a subset of the capabilities provided to a child.
