IBM i and Capability Addressing
A paper entitled "The CHERI capability model: Revisiting RISC in an age of risk" was recently published and covers capability addressing and an architecture and implementation behind it. It included a comparison with other systems' forms of capability addressing, but tended to overlook that of a particularly successful business system using an OS now called IBM i. What follows is a discussion of IBM i's capability addressing and some observations drawn from it that relate to CHERI.
IBM i's Architecture
IBM i OS’ basic architecture – although having been enhanced over the years – is based on that of the System 38, AS/400 (a.k.a. i5/OS, iSeries), and has been around for roughly 25 years. Today it can be thought of as supporting three addressing modes, but it started out using just one, object-based capability addressing. This document is intended to share the basics of IBM i’s addressing for comparison to other forms of capability addressing.
IBM i today (2014) runs on a Power-based processor architecture, one enhanced over the typical Power architecture largely in the way that it supports addressing. Along with the process-local addressing used in the base Power architecture, this extended architecture supports Tagged Pointers as well as what IBM i calls Single-Level Store (SLS). Because of the Power architecture's process-local addressing support, IBM i can execute applications compiled for the AIX (a UNIX derivative) operating system; to do so, IBM i uses the Power architecture's "Tags Inactive" mode. Addressing and the instruction set in this mode are just as used by AIX and other Power-based OSes. IBM i, though, was initially based on what is called “Tags Active” mode in the enhanced Power architecture. It is this “Tags Active” mode which supports a form of capability-based addressing and which will largely be the subject of this document.
The Tag of “Tags Active” stems from the use of a Tag ON/OFF state associated with each 16-byte aligned 16 bytes of physical memory. The Tag is directly associated with this 16 bytes in that a load of this 16 bytes also loads this tag. A Tagged Pointer is considered valid only if this tag’s state is ON. Attempts to store anything aside from a valid pointer to any 16-byte aligned 16 bytes sets the tag OFF. This document will get into this in considerably more detail shortly.
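The tag rule just described can be pictured with a small model. This is only a sketch in Python, not IBM i's implementation; the class and method names are invented for illustration, and the only behavior taken from the text is that a load returns the tag along with the 16 bytes and that any store other than a valid pointer store sets the tag OFF.

```python
QUAD = 16  # bytes covered by one tag

class TaggedMemory:
    """Toy model: one ON/OFF tag per 16-byte aligned quadword (hypothetical names)."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.tags = [False] * (size // QUAD)

    def store_bytes(self, addr, payload):
        # An ordinary store sets OFF the tag of every quadword it touches.
        self.data[addr:addr + len(payload)] = payload
        for q in range(addr // QUAD, (addr + len(payload) - 1) // QUAD + 1):
            self.tags[q] = False

    def store_pointer(self, addr, payload):
        # Only the pointer store (trusted code in the real system) sets the tag ON.
        if addr % QUAD or len(payload) != QUAD:
            raise ValueError("pointer stores must be 16-byte aligned quadwords")
        self.data[addr:addr + QUAD] = payload
        self.tags[addr // QUAD] = True

    def load_quad(self, addr):
        # Loading the 16 bytes also loads the tag.
        return bytes(self.data[addr:addr + QUAD]), self.tags[addr // QUAD]

mem = TaggedMemory(64)
mem.store_pointer(16, bytes(range(16)))
_, tag_before = mem.load_quad(16)
mem.store_bytes(20, b"\xff")   # overwrite a single byte inside the pointer
_, tag_after = mem.load_quad(16)
```

In this model the single-byte overwrite is enough to turn a valid pointer into ordinary data, which is the property the rest of the discussion relies on.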
In IBM i’s Single-Level Store (SLS), most of the 64-bit virtual address space is common to all processes of the system. Said differently, the same 64-bit SLS value represents the same byte in virtual address space, no matter which process is using the address. Further, much of this 64-bit address space also represents persistent data; even if no process is currently using an SLS address, even if the system is powered down at the moment, each such SLS virtual address continues to represent a single byte of data in persistent storage (e.g., HDDs, SSDs). As a result, main storage (i.e., system DRAM) acts as a cache for much of the contents of persistent storage; all that a process needs to do to access data, even if not in main storage, is reference an SLS-addressed byte and that byte will be made available in main storage by the OS; once there it is also available to all processes allowed to use that address. With SLS, there is no notion of mapping a file into a process’ address space; the file, at the time of its creation, is assigned some part of the SLS virtual address space and maintains that address even when the IBM i OS is not active.
Although SLS is clearly advantageous for data sharing across processes, OSes must also ensure data protection across processes. IBM i’s Tagged Pointers are, of course, part of this, but we’ll get to that part shortly. First, though, it is important to note that IBM i segments SLS. There are a number of segment sizes supported, but every segment is aligned within SLS on a boundary equal to its own size. For example, a typical 16-Mbyte segment is 16 Mbytes of virtual address space in size and aligned on a 16-Mbyte boundary; a 64-Kbyte segment is 64-Kbyte aligned. Once the segment size is known, its alignment is as well.
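Because a segment is aligned to its own size, the segment containing any address can be found by masking off the low-order bits. A minimal sketch in Python; the function name and the sample address are made up for illustration.

```python
def segment_base(addr, seg_size):
    """Base of the size-aligned segment containing addr; seg_size must be a power of two."""
    assert seg_size > 0 and seg_size & (seg_size - 1) == 0
    return addr & ~(seg_size - 1)

SEG_16M = 16 * 1024 * 1024
SEG_64K = 64 * 1024

addr = 0x1234_5678_9ABC               # arbitrary example address
base_16m = segment_base(addr, SEG_16M)  # 0x1234_5600_0000
base_64k = segment_base(addr, SEG_64K)  # 0x1234_5678_0000
```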
IBM i calls itself “Object-based”. Every object type supported is made up of one or more of these various-sized segments. As a result, the creation of any object includes the allocation of one or more segments from its SLS virtual address space. It is object-based largely in the same way that some languages are object-oriented; for each object type, there is only a well-defined set of functions that can be executed against these objects. Many of these functions are supported only by the privileged kernel of IBM i. In order to execute these functions, the programmer provides a Tagged Pointer – often called a System Pointer – referencing an object’s segment and requests execution of that function. If the program does not have an object’s System Pointer, the program also does not have access to that object. We’ll be discussing shortly how these System Pointers cannot be corrupted.
Some objects allow programs to directly - as opposed to implicitly by the OS - access the data within the object. A program Stack is such an object. This is possible through another type of Tagged Pointer called a Space Pointer. To access within the object, Space Pointers are allowed to be modified via Effective Address calculations. For reasons of address protection, both the source and result addresses must address within the same segment. Doing otherwise is an addressing violation, resulting in an interrupt in processing called an Effective Address Overflow (EAO). This has nothing to do with the Tag proper but is very much part of IBM i’s address protection.
There are various forms of address calculations, both implicit to the instruction and explicit, using compiler-generated address calculations. For example, when the hardware executes instructions accessing memory, the instructions themselves implicitly execute Effective Address calculations. The Tags Active mode of the Power architecture ensures that the resulting address used still remains in the same segment as the base address; if not, the hardware throws an interrupt and aborts the storage access. Similarly, programs can explicitly generate effective address calculations. The IBM i compilers, with hardware assistance, ensure that the resulting address still remains within the same segment, throwing an exception if it does not. A basic indexed array access is an example, ensuring that the addressed array entry resides in the same segment as the beginning of the array. (More on how this is assured shortly.)
EAO checking, of course, is merely bounds checking on segments.
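As a sketch of that bounds check, the following Python compares the segment of the base address with the segment of the result, using the 16-Mbyte size the hardware assumes. The exception class and function name are hypothetical stand-ins for the EAO interrupt.

```python
SEG_SIZE = 1 << 24  # 16 Mbytes

class EffectiveAddressOverflow(Exception):
    """Stands in for the EAO interrupt in this sketch."""

def add_checked(base, offset, seg_size=SEG_SIZE):
    """Return base + offset, or raise if the result leaves base's segment."""
    result = base + offset
    if base // seg_size != result // seg_size:
        raise EffectiveAddressOverflow(hex(result))
    return result

# An array starting 0x100 bytes before the end of a segment.
array_base = 5 * SEG_SIZE + 0xFFFF00
last_ok = add_checked(array_base, 0xF8)   # still inside the segment
```

Indexing one element further (an offset of 0x100 here) would cross into the next segment and raise the exception.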
The point here is that, although all SLS could be available to all processes, access is first allowed via Tagged Pointers only to segments of objects to which the process has the right, and then - again via Tagged Pointers - only within those segments into which the object itself allows access. (We’ll also see shortly how it is that arbitrary valued Tagged Pointers cannot be generated.)
Before going on, the Power ISA in Tags Active mode (and so processor hardware) does indeed know about the notion of a segment boundary and so does detect segment violations. However, the current Power ISA does this only for 16-Mbyte segments. Although IBM i does support a number of different segment sizes, the hardware is unaware; it treats all segments as 16-Mbyte segments and helps out with EAO interrupts as though this were always the case. Perhaps a preferred architecture would have provided a means for the hardware to know of the segment size and do such bounds checking per this additional information. (More on this later.)
Trusted Code Generator and the Machine Interface
There is no notion of a traditional assembler for use in generating code in Tags Active-based programs (called Program Objects in IBM i). Program Objects and their contents are only generated by the operating system, and more specifically by the privileged kernel of IBM i at that. A program really does still run “on the hardware”, but the only instructions that the program can execute are those generated by IBM i’s Trusted Code Generator residing in the privileged kernel. You can think of this code generator as a common back end stage of any compiler supported on IBM i (but not quite).
IBM i’s architecture includes a notion of a high-level Machine Interface (a.k.a. “The MI”). It does not correspond to any processor’s ISA. (As a result, IBM i can – and has – moved from ISA to ISA as long as that processor architecture also supports its needed addressing support.) Much of the MI is in support of the object-based functions mentioned earlier. (See MI Specification.) Some of it, though, includes higher level functions used by program languages, some appropriate as a compiler’s intermediate text.
The point here is that this architecture, this control of instruction generation, helps ensure that only virtual addresses valid to the needed operation are used and generated. If an explicit effective address needs to be calculated, the code generator ensures that that address remains in the same segment and does so quickly. If a Tagged Pointer is to be referenced, the code generator ensures that the instruction(s) needed to validly load that pointer for use are generated. If a Tagged Pointer needs to be created by a user-level program – say, after calculating an address – the code generator ensures that the correct instructions for rapidly doing so are generated in line within the Program Object.
More on the Tagged Pointer
So what is IBM i's Tagged Pointer and what does it take to work with it?
A Tagged Pointer is 16 bytes (128 bits) in size and includes a separate Tag state. (I am not calling it a bit here because it is not.) The Tag state exists as a separate part of this 16 bytes in main storage and in the processor’s cache. (A later discussion will show how it is maintained in persistent storage.) IBM i’s Tagged Pointer is generally made up of a 64-bit SLS virtual address and a few bits representing the type of the Tagged Pointer.
The Power ISA includes a small set of specialized fast instructions for working with the Tagged Pointers. As mentioned above, these are generated by the Trusted Code Generator only.
The Power architecture's register set consists of 32 64-bit GPRs (General Purpose Registers). In this architecture, the GPRs are only these 64 bits and do not include a Tag indicator. A single Tag indicator is carried in a separate register called the XER. The XER contains many other program status bits as well. It follows that in this architecture, only the Tagged Pointer’s 64-bit virtual address can be copied between registers, not the pointer’s tag state. (Tagged GPRs may have been a better solution.)
The process of loading a Tagged Pointer into a GPR, then, is a process of loading a target GPR with the Tagged Pointer’s 64-bit virtual address and loading the XER[TAG] with the state of the pointer’s tag; the high order 64 bits are also loaded into a paired register, but this is less important. More importantly to address protection, if the tag’s state is OFF (i.e., invalid), either the virtual address loaded into the GPR is the NULL address or an interrupt is generated. In the NULL generation case, a single specialized single-cycle Load instruction is generated. (The difference comes from the program’s exception model. In the NULL address case, the exception on NULL occurs if the address gets used. Otherwise, the exception is generated at the time that the address is loaded. The code generator, of course, generates the right code for the purpose.) I should add that the code generated also includes the process of validating the contents of the type field of the Tagged Pointer. This is similarly done rapidly and simply validates that the pointer type is of the expected type; if a Space Pointer is expected, for example, the Tagged Pointer loaded needs to be a Space Pointer.
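The load sequence can be sketched as follows. This is Python pseudo-hardware, not the actual Power instructions: the type codes, the Quadword layout, and the function name are all assumptions made for illustration. Only the behavior comes from the text: an OFF tag yields either NULL or an immediate exception depending on the exception model, and a valid pointer must also be of the expected type.

```python
from dataclasses import dataclass

SPACE_POINTER, SYSTEM_POINTER = 1, 2   # hypothetical type codes
NULL = 0

class PointerException(Exception):
    pass

@dataclass
class Quadword:
    type_code: int   # from the high-order 64 bits
    address: int     # the 64-bit virtual address
    tag: bool        # the separate tag state

def load_pointer(q, expected_type, null_on_invalid=True):
    """Return the virtual address to place in a GPR."""
    if not q.tag:
        if null_on_invalid:
            return NULL          # the exception is deferred until NULL is used
        raise PointerException("tag is OFF")
    if q.type_code != expected_type:
        raise PointerException("unexpected pointer type")
    return q.address

valid = Quadword(SPACE_POINTER, 0x1234_5600_0040, True)
forged = Quadword(SPACE_POINTER, 0x1234_5600_0040, False)
addr = load_pointer(valid, SPACE_POINTER)
null_addr = load_pointer(forged, SPACE_POINTER)
```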
The process of storing a Tagged Pointer is slightly more complex but is similarly executed in line in user code as part of the Program object. As with loading from the contents of a Tagged Pointer, storing – creating one – is done with a rapidly-executing specialized instruction. This instruction stores the contents of a contiguous pair of 64-bit GPRs as well as the contents of the XER[TAG] bit. So the process of storing includes the creation of a GPR with the needed pointer type bits in the first of these GPRs, the setting of the XER[TAG], and the storing of this data. In this architecture, this requires multiple instructions.
Only this process of storing – a process supported only by the Trusted Code Generator when needed - ensures that the resulting Tagged Pointer is valid in memory and in the cache. Any other store into these 16 bytes results in the tag’s state being set OFF. Similarly, an I/O DMA into memory sets OFF the tag of all accessed 16-byte locations.
A program’s copying of data from one buffer to another similarly requires the use of specialized instructions if the tag’s state(s) is to be maintained. If tag maintenance is not required, the target buffer’s tag states are all set off. If tag maintenance is required AND if the source and target buffers are similarly aligned relative to 16-bytes, the instructions to maintain the tag (via the XER[TAG]) are used.
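A rough model of the two copy flavors is shown below, again in Python with invented names. It assumes non-overlapping buffers in one toy memory with one boolean tag per quadword, and it carries a tag across only for quadwords that are copied whole between identically aligned buffers.

```python
QUAD = 16

def copy_buffer(data, tags, src, dst, length, keep_tags=False):
    """Copy length bytes from src to dst; tags holds one bool per 16-byte quadword."""
    moved = bytes(data[src:src + length])
    src_tags = list(tags)                       # snapshot before clearing anything
    data[dst:dst + length] = moved
    # Every quadword touched in the target starts out with its tag OFF.
    for q in range(dst // QUAD, (dst + length - 1) // QUAD + 1):
        tags[q] = False
    if keep_tags and src % QUAD == dst % QUAD:
        first = (src + QUAD - 1) // QUAD * QUAD  # first aligned source quadword
        for s in range(first, src + length - QUAD + 1, QUAD):
            tags[(s - src + dst) // QUAD] = src_tags[s // QUAD]

mem = bytearray(128)
tags = [False] * 8
tags[1] = True                                  # a valid pointer at address 16
copy_buffer(mem, tags, 16, 48, 32, keep_tags=True)   # aligned: tag travels
aligned_result = tags[3]
copy_buffer(mem, tags, 16, 72, 32, keep_tags=True)   # misaligned: tags dropped
misaligned_result = tags[4:7]
```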
Tags and Persistent Storage
Main storage is a cache of the SLS-based data in persistent storage. Main storage is packaged as various size pages (typically 4K bytes along with every 16-byte’s tags). Persistent storage could be packaged the same way, but let’s assume not. Let’s instead assume 512-byte sectors, eight of them capable of holding the data contents of a 4096-byte page. But this does not include this page’s 256 tags. In this architecture, the associated I/O DMAs out of and into main storage don’t directly access these tags either. As a result, the process of writing a page out of main storage and to the I/O device requires first the extraction of the tags. It follows that the process of reading a page into main storage from an I/O device requires the post processing of the page if some tag bits need to be set ON. It also follows that the tag information would need to be DMAed separately and packaged separately. The packaging and so this process is a function of the persistent storage device; for example, 520-byte sectors could – and in IBM i do - include the tags for the associated 512 bytes of data.
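The arithmetic here is simple: a 4096-byte page has 4096 / 16 = 256 tags, which pack into 32 bytes, and each 512-byte sector covers 32 tags, which need 4 of the 8 extra bytes in a 520-byte sector. The Python sketch below packs and restores a page's tags one bit per quadword; the bit layout and function names are assumptions for illustration, not the on-disk format IBM i actually uses.

```python
PAGE, SECTOR, QUAD = 4096, 512, 16

def extract_tags(tags):
    """Pack tag states into bytes, one bit per quadword (layout is hypothetical)."""
    packed = bytearray(len(tags) // 8)
    for i, on in enumerate(tags):
        if on:
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)

def restore_tags(packed, count):
    """Inverse of extract_tags: the post-processing step after a page is read in."""
    return [bool((packed[i // 8] >> (i % 8)) & 1) for i in range(count)]

tags_per_page = PAGE // QUAD        # 256
tags_per_sector = SECTOR // QUAD    # 32 tags, i.e. 4 bytes per sector

page_tags = [False] * tags_per_page
page_tags[0] = page_tags[37] = page_tags[255] = True
packed = extract_tags(page_tags)
```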
… so ends the physical management of the tags.
Object Protection and Security
As perhaps expected for any capability, in order to access an object, a program needs a valid pointer to that object. And in order to get a valid pointer to an object, the executing program needs to show that it has the right to that object. Although the process of simply loading, storing, and copying of pointers can be done very quickly via inlined program code, the process of proving the rights to an object and from there getting a newly created Tagged Pointer is a function of the privileged part of IBM i. In IBM i, we are referring to objects as defined by the MI, not those defined by object-oriented languages. As a case in point, IBM i has an object called a User Space. A User Space is essentially 16 Mbytes of SLS-based, potentially persistent, virtual address space. A Process must prove its right to access that User Space, but IBM i does not enforce much about what is done with the contents of this object. Once a program has proven that it has the rights to use a System Pointer to this object, it can also easily generate a Space Pointer into this object, alter that Space Pointer as it likes, and construct anything it wants anywhere within this User Space.
Of Segments and Allocated Storage
IBM i supports a number of different segment sizes within SLS. Most tend to be 16 Mbytes in size. When an MI object gets created, such as most database file objects, it is typically built from these 16-Mbyte segments. As the database grows, these objects are allocated an additional 16-Mbyte portion of SLS' virtual address space. Sixteen Mbytes of virtual address space, yes, but they are not ordinarily allocated 16 Mbytes of persistent storage at the time. So it is entirely possible to generate a virtual address still within a segment, but outside its allocated storage. Using that address produces an addressing violation.
The point is that this too is a bounds check. Yes, we don’t necessarily want to excessively consume SLS virtual address space, but only that portion of the segment actually used by the object is treated as valid virtual memory. An MI object can, and often does, consist only of some single segment of SLS, but it uses something far smaller – like a single page – of that segment.
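The two levels of checking can be put side by side in a short Python sketch. The Segment class, its fields, and the exception name are hypothetical; the sketch only shows that an address can pass the segment check and still fail because it lies beyond the storage actually allocated.

```python
SEG_SIZE = 1 << 24   # 16 Mbytes of virtual address space
PAGE = 4096

class AddressingViolation(Exception):
    pass

class Segment:
    def __init__(self, base, allocated_bytes):
        assert base % SEG_SIZE == 0
        self.base = base
        self.allocated = allocated_bytes   # backed storage, often far less than SEG_SIZE

    def check(self, addr):
        if not (self.base <= addr < self.base + SEG_SIZE):
            raise AddressingViolation("outside the segment")
        if addr - self.base >= self.allocated:
            raise AddressingViolation("inside the segment, outside allocated storage")
        return addr

# An object occupying one segment but using only a single page of it.
seg = Segment(base=7 * SEG_SIZE, allocated_bytes=PAGE)
ok = seg.check(7 * SEG_SIZE + 100)
```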
Of IBM i’s Teraspace
Every object consumes some part of SLS’ virtual address space. In IBM i, determining this amount is a function of an encode of the virtual address itself. As a result, extremely large segments would also have consumed and kept a disproportionate amount of this address space. But, there was/is a very real need for at least non-persistent and extremely large regions of contiguous virtual address space. (Heap objects, for example, were often made up of many multiple 16-Mbyte SLS segments. The programmer often wants this to appear as a contiguous virtual address space so this implementation dependency showed through into applications and it was not liked.)
Partly because IBM i’s MI existed to allow programs to be automatically recompiled and pick up architecture changes, IBM i was able to invent a notion called Teraspace. Teraspace is essentially a Tags Active mode process-local form of virtual addressing that is capable of extremely large allocations of process-local virtual address space. (Recall that Tags Inactive mode, used for AIX programs, is also process-local addressing, but not available to programs using IBM i’s Tags Active Program objects.) Further, unlike Tags Inactive addressing, the Tags Active Teraspace shares part of SLS. To explain, prior to Teraspace’s introduction, SLS and its MI objects were allowed to use all of the 64-bit address space. With the introduction of Teraspace, every process gets its own very large Tags Active virtual address space, 2^48 bytes in size. To do this, the MI and Tags Active Power architectures were changed to identify these process-local Teraspaces in SLS by encoding the high order 16 bits of SLS as being 0x0000; this value meant Teraspace to the hardware and MI.
All interesting, I’m sure, but what does this mean to capability addressing? I am getting there, but first a step into some background.
You might have guessed, but address translation for Tags Active SLS addresses is different than for Teraspace and Tags Inactive (AIX) process-local addresses. In the Power architecture, process-local address translation is essentially a two-step process:
- Effective Address segments are translated to Virtual Addresses via a process-local segment table (and its associated hardware cache) and
- Virtual Address pages are translated to Real Address pages via a page table (and its associated hardware cache).
SLS address translation, being common to all processes, is not process-local and so entirely skips the first of these two steps; for SLS address translation, the Effective and Virtual address are essentially the same. Teraspace is also process-local and largely follows the same address translation process as is used for Tags Inactive. In this current Tags Active Power architecture, the hardware makes the choice of which translation type to use based on the Teraspace encode of Tags Active Effective Addresses (i.e., 0x0000).
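The choice between the two translation paths can be sketched as below. This is a toy Python model with dictionaries standing in for the segment table and page table; the 256-Mbyte translation segment size, the table contents, and all names are assumptions for illustration. The point shown is that the same Teraspace Effective Address resolves differently per process, while an SLS address resolves identically for everyone.

```python
SEG_SHIFT = 28    # assumed 256-Mbyte translation segments for this sketch
PAGE_SHIFT = 12   # 4K pages

def translate(ea, segment_table, page_table):
    """Effective Address -> Real Address in the toy model."""
    if ea >> 48 == 0:
        # Teraspace encode: step 1, process-local segment table lookup.
        vsid = segment_table[ea >> SEG_SHIFT]
        va = (vsid << SEG_SHIFT) | (ea & ((1 << SEG_SHIFT) - 1))
    else:
        # SLS: Effective and Virtual addresses are essentially the same.
        va = ea
    # Step 2, common to both: virtual page -> real page.
    real_page = page_table[va >> PAGE_SHIFT]
    return (real_page << PAGE_SHIFT) | (va & ((1 << PAGE_SHIFT) - 1))

page_table = {0xA000000: 0x55, 0xB000000: 0x66, 0x0123456789ABC: 0x77}
proc_a = {1: 0xA00}   # process A's segment table
proc_b = {1: 0xB00}   # process B's segment table

tera_ea = 0x1000_0123
sls_ea = 0x0123_4567_89AB_C123
real_a = translate(tera_ea, proc_a, page_table)
real_b = translate(tera_ea, proc_b, page_table)
```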
Back to capabilities …. In IBM i, Teraspace can address SLS objects. Said differently, Teraspace storage can contain Tagged Pointers. It is possible for the inverse to be true as well, but that would mean that potentially globally accessed objects would be addressing into process-local storage. Tagged Pointers can, though, contain Teraspace addresses if used within the Process creating these pointers. Teraspace addresses, being process-local, don’t normally need to be tagged for reasons of MI object protection. In IBM i, Teraspace addresses are just 48 bits in size with the 0x0000 encode padding it out to 64 bits. (This is enforced by the trusted code generator so a program can not simply convert a Teraspace address into an arbitrary SLS address.)
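What the trusted code generator has to guarantee for Teraspace can be sketched as a check that address arithmetic never sets any of the high-order 16 bits. How the real generated code enforces this is not described here, so the Python below, including its names and its choice to raise an exception, is only an assumption for illustration.

```python
TERASPACE_LIMIT = 1 << 48   # Teraspace addresses have high-order 16 bits of 0x0000

class AddressingViolation(Exception):
    pass

def is_teraspace(addr):
    return 0 <= addr < TERASPACE_LIMIT

def teraspace_add(addr, offset):
    """Address arithmetic that keeps a Teraspace address inside Teraspace."""
    if not is_teraspace(addr):
        raise AddressingViolation("not a Teraspace address")
    result = addr + offset
    if not is_teraspace(result):
        raise AddressingViolation("result would leave Teraspace")
    return result

heap_ptr = teraspace_add(0x0000_7FFF_0000_0000, 0x1000)
```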
The reason to create Tagged Pointers containing Teraspace addresses is to help with parameter passing between routines; the target routines may have been compiled to assume that an incoming parameter is being addressed with a Tagged Pointer. As a result, that routine will attempt to validate the incoming pointer which, unless tagged, would be treated as invalid. I’ll add, BTW, that if the target routine knew that the incoming parameter was Teraspace, that address can be passed via a GPR and as a result be slightly faster versus first requiring the storing of a Tagged Pointer and loading/validating in the target routine.
Although created initially merely for reasons of large contiguous addresses, Teraspace really is a process-local address space. Anything that could be process-local – like Stack and Heap – can be placed at the programmer’s preference in either Teraspace or SLS segments.
This, of course, is the current architecture, a direction taken because of its evolution from pre-existing processor architecture. The process-local Teraspace had to be in some way separate from that of the potentially global SLS addresses when seen by the hardware. But that was the only real requirement. Indeed, rather than Teraspace being 2^48 bytes in size and requiring the trusted code generator to ensure that Teraspace addresses remained Teraspace addresses, it would seem preferable for Teraspace to also be a 2^64-byte process-local address space with some external means of identifying it as such.
IBM i’s Tagged Pointer’s Overhead
Although Tagged Pointers can be processed fast, it is not always fast enough and does consume twice as much storage as a basic 64-bit pointer. Again, IBM i’s Tagged Pointer contains the Tags Active 64-bit virtual address. Once validated and if either the Trusted Code Generator or IBM i’s kernel can continue to protect the validated virtual address, this 64-bit value gets used in subsequent processing. Being only this 64-bit value, and known to be valid, subsequent processing can be faster and consume less storage.
For example, picture a Tagged Pointer passed into a routine. There it may get validated and have its virtual address loaded into a GPR. From there this validated virtual address may be left in a GPR for use, it may be modified in a secure manner, or it might even be stored into a trusted stack frame. Trusted modification, of course, also means segment bounds checking. Similarly, suppose we have a call into the kernel which passes in Tagged Pointers as parameters. Here the kernel similarly validates these address parameters and loads the virtual addresses into GPRs, but then the trusted kernel can do anything that it needs with that address. Given the need to modify these parameter addresses, the kernel can also know the associated segments' sizes, doing bounds checking per this size.
General Thoughts on Segment Bounds Checking
In what follows, it is useful to keep in mind that the base architecture for the IBM i is roughly 25 years old. It has changed considerably in the intervening years, but each such change is partly constrained by what came before, much of it based on what processors and systems could do 25 years ago. This was also based on what computing architecture was at that time. Processors, systems technology, and computing in general have changed a lot since then.
In the System 38, one of IBM i’s primary predecessors, the single SLS-based virtual address was key; protecting that address via its Tagged Pointer seems secondary. You can rather see this by looking at the register space which does not include tags. As a result, IBM i’s segment sizes and its segment bounds checking – and, as you saw, Teraspace - are based upon an encoding of the SLS virtual address.
Instead, one could envision – as you folks are doing – a set of Address Registers, each carrying with it the tag(s) and segment size. In the currently envisioned capability pointer, the segment size appears to be any arbitrary length, largely required because these pointers can change and still be useful as long as the changed address remains within the bounds of the original protected object. The effect, unfortunately, is a 256-bit capability pointer used to encapsulate a 64-bit virtual address. For what it is worth, IBM i developers consider the 128-bit Tagged Pointer as also unfortunately large.
There might be an alternative. Start by having what you already have, a set of Address Registers. Define for these a set of processor instructions sufficient to your needs, again largely what you are doing. One set of these have the purpose of modifying the contents of an Address Register; such a modification must validate that the result remains within the object’s bounds, interrupting or invalidating the result if not. The current instructions seem to do that by adjusting both the address and the remaining length.
Instead, carry what IBM i is doing to its next logical level. (This happens to be something we considered prior to having been forced to retrofit the preceding PowerPC architecture to support what was at that time the AS/400.) Again, define a set of Address Registers with their associated instructions. Define segments as the protection domain, embedding objects within them. Tag the Address Registers, identifying there the size of the segment associated with that virtual address. (We happened to do that via the encoding of a 12-byte virtual address, but the concept is the same.) Also recall that all segments are aligned per their size; a 16-Mbyte segment is 16-Mbyte aligned, a 64K segment is 64K aligned, and an 8K segment was to be 8K aligned. EAO checking – address bounds checking – was relative to these segment boundaries. It did not matter what the current valid virtual address was within these segments; the bounds checking was done relative to the start of the segment, which was based on the virtual address’s size encoding.
So, rather than 64-bit object size associated with each capability-based address, instead keep the 64-bit virtual address, but associate with that a power-of-two-based segment size. The resulting 128-bit Tagged Pointer might then appear as follows:
If there are to be 1-byte (2^0) through 2^64-byte segments, only 64 segment size encodes would be required. (We had not envisioned needing that many different segment sizes.)
An address calculation, whether done implicitly in a storage accessing instruction or done explicitly upon an Address Register, is just the basic indexed register or immediate displacement addition to the contents of an Address Register. Bounds checking need not get in the way. With the segment size identified, subsequent processing merely checks that the high-order address bits identifying the segment (those above the segment-size boundary) are the same in the source address and the resulting address. If there is a miscompare, the storage access either ceases with an interrupt or the explicit address calculation updates the target GPR with an invalid (NULL or untagged?) address or throws an interrupt.
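As a sketch of how cheap this check is, the Python below carries the segment size as a power-of-two exponent next to the 64-bit address and compares the bits above that boundary with a single XOR and shift. The function name is invented, and returning None for the invalid result is just one of the options mentioned above (an interrupt being the other).

```python
MASK64 = (1 << 64) - 1

def checked_add(addr, offset, size_log2):
    """Return addr + offset if it stays in the aligned 2**size_log2-byte segment, else None."""
    result = (addr + offset) & MASK64
    # Any difference in the bits above the segment boundary means the
    # result has left the segment that addr belongs to.
    if (addr ^ result) >> size_log2:
        return None
    return result

in_4k = checked_add(0x1234_5000, 0xFFF, 12)       # last byte of a 4K segment
out_4k = checked_add(0x1234_5000, 0x1000, 12)     # one byte past it
in_16m = checked_add(0x1234_5000, 0x1000, 24)     # fine in a 16-Mbyte segment
```

No base or length needs to be stored or updated; the size exponent and the address itself are enough.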