Orthogonal persistence

Many researchers have commented on the ideology of orthogonal persistence.

Orthogonal persistence allows the programmer to treat all objects in the same way - whether they persist or not. An object is able to persist regardless of its type. In the context of OO systems an object persists if it is reachable from another persistent object.

A system can provide orthogonal persistence to varying degrees. In the most extreme case, the programmer should not need to write transactions to break up work into atomic pieces. Instead, all threads are persistent, so it can be assumed that on recovery a computation will carry on from where it left off. This is far from easy to implement efficiently. To support persistent threads, the system must look for a consistent cut on recovery. There can easily be a domino effect where the last "consistent cut" occurred a long time in the past.

Persistence and concurrency control don't mix well. This is to be expected because adding concurrency control to persistent objects implies that persistent objects are treated differently to transient objects - in conflict with the principle of orthogonal persistence. When a class is written and no mutexes etc are used for concurrency control, it is understood that objects of that class are not threadsafe. One would expect this to be independent of whether the object is reachable from a persistent root.

Consider a system where accessing the state of a persistent object can throw an exception. If the programmer doesn't distinguish between persistent and transient objects, then it follows that it must be assumed that access to any object can throw an exception. This is unworkable - because it becomes impossible to reason about the correctness of a program.

Suppose a process supports more than one independent persistent store, for example because there are multiple hard disks or floppy disks. This leads to confusion - what does it mean for an object to be reachable from two independent persistent stores? From which store is it loaded when it is next accessed? What if the object exists in inconsistent states on different media? What if one medium rolls back and another doesn't? The conclusion is that multiple independent persistent stores within a given process are incompatible with orthogonal persistence.

A significant problem with the ideology of orthogonal persistence is the question of how to make changes to the system. If state such as threads persists, then it is very difficult to see how to fix bugs, allow classes to add or remove members, or implement new interfaces. It is as though we are trying to change a system while it is still running.

The latter is such a difficult problem that it would appear that the ideology of orthogonal persistence (at least in its true form) is unworkable.

The article "In Praise of Manual Persistence" provides a good description of some of the problems with orthogonal persistence.

A very amusing quote on that page:

Do you, Programmer, take this Object to be part of the persistent state of your application, to have and to hold, through maintenance and iterations, for past and future versions, as long as the application shall live?

How CEDA breaks orthogonal persistence

The Ceda design recognises that the ideology of orthogonal persistence offers some advantages, but intentionally breaks with the principle in certain respects.

Persistence is type intrusive in the sense that only objects that implement the IPersistable interface can directly persist as objects with identity in the persistent store.
Persistence by reachability is broken to allow a persistent object to contain transient state. This is useful for a number of reasons:
- In Ceda it is assumed that observers (in the sense of the observer pattern) are transient. Often a persistent object is an event source, and therefore needs to allow transient observers to attach and detach in order to be notified of events. For example, persistent documents need to notify their transient views of state changes to allow the views to redraw.
- A persistent object may contain transient state such as mutexes, socket connections, file handles and worker threads. In all these cases, such state is not written to disk.
- Persistent objects are able to directly cache expensive calculations, without the cache needing to persist. The decision to persist a cache is made by the programmer.
Threads do not persist.
All changes to the store are made transactional through explicitly declared CSpace transactions. This allows the system to provide atomicity.
Persistent objects need to be explicitly marked as dirty when they are modified.
Ceda makes use of a smart pointer template class pref for all pointers to persistent objects. A "raw pointer" can't be used by one persistent object to point at another persistent object. In the literature this is referred to as "software swizzling", as distinct from the "hardware swizzling" used by systems like ObjectStore, where persistent pointers depend on hardware support for detecting page faults when they are dereferenced.
Explicit Serialise() functions must be written to write the state of an object to disk.
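
Several of the points above (transient state inside a persistent object, explicit dirty marking, and a hand-written Serialise() function) can be illustrated with a small sketch. This is not Ceda's API: the Persistable base class, MarkAsDirty(), IsDirty() and the Document class are hypothetical stand-ins invented for illustration, and CSpace transactions are left out entirely.

```cpp
#include <cassert>
#include <functional>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an IPersistable-style base class (not Ceda's).
class Persistable {
public:
    virtual ~Persistable() = default;
    // Writes only the persistent part of the object's state.
    virtual void Serialise(std::ostream& out) const = 0;
    // The programmer calls this explicitly after every modification.
    void MarkAsDirty() { dirty_ = true; }
    bool IsDirty() const { return dirty_; }
private:
    bool dirty_ = false;  // a real store would use this to decide what to rewrite
};

class Document : public Persistable {
public:
    using Observer = std::function<void(const std::string&)>;

    // Observers are transient: they are never serialised.
    void Attach(Observer observer) {
        assert(observer != nullptr);  // an empty observer would throw when notified
        observers_.push_back(std::move(observer));
    }

    void SetText(const std::string& text) {
        text_ = text;
        cachedWordCount_ = -1;  // invalidate the transient cache
        MarkAsDirty();          // explicit; the store does not detect the write
        for (const Observer& notify : observers_) notify(text_);
    }

    // Expensive calculation cached in transient state.
    int WordCount() const {
        if (cachedWordCount_ < 0) {
            std::istringstream words(text_);
            std::string word;
            int count = 0;
            while (words >> word) ++count;
            cachedWordCount_ = count;
        }
        return cachedWordCount_;
    }

    // Only text_ is written; the cache and the observers are skipped.
    void Serialise(std::ostream& out) const override {
        out << text_.size() << ':' << text_;
    }

private:
    std::string text_;                  // persistent state
    mutable int cachedWordCount_ = -1;  // transient cache
    std::vector<Observer> observers_;   // transient observers
};
```

In this sketch a view would call Attach() with a callback, and the serialised form of the document contains the text alone, so the observers and the cached word count are simply rebuilt after the object is next loaded.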

How CEDA supports orthogonal persistence

However, the principle of orthogonal persistence is adhered to in the following respects:

Ceda models a single heap from which both transient and persistent objects are allocated. The 'new' operator is used in the normal way to create all objects.
If we limit ourselves to the objects that implement IPersistable, then Ceda implements persistence by reachability. This is achieved by tracing outgoing IObject pointers (provided by the IObject::VisitObjects() method). When a persistent object is modified, a trace rooted in that object is performed to find any IPersistable objects that have become reachable for the first time. These objects need to be allocated OIDs and written to the persistent store.
Persistence is orthogonal to type in the sense that objects of a given class (assumed to derive from IPersistable) may be transient or persistent. It only depends on whether they are reachable from a persistent root.
Persistent objects are not treated specially from the perspective of concurrency control.
- There is no special locking mechanism to protect access to persistent objects.
- There is no "transaction framework" that will roll back persistent objects when a serious error occurs. Instead, it is up to the programmer to deal with exceptions and other errors and make sure objects continue to satisfy their class invariants etc.
If we limit ourselves to pref smart pointers, we see that they are typesafe, can be compared and can point at either transient or persistent objects. In that sense, Ceda provides pointers that meet the requirements of orthogonal persistence.
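
The trace described above, in which a modified persistent object is searched for objects that have become reachable for the first time, can be sketched as follows. The names Object, Node and TraceStore are hypothetical, the VisitObjects() signature is an assumption modelled loosely on the IObject::VisitObjects() method mentioned in the text, and for brevity every object is treated as persistable.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical base class; an oid of 0 means the object is still transient.
struct Object {
    virtual ~Object() = default;
    // Reports every outgoing object pointer to the supplied callback.
    virtual void VisitObjects(const std::function<void(Object*)>& visit) = 0;
    std::uint64_t oid = 0;
};

struct Node : Object {
    std::vector<Node*> children;
    void VisitObjects(const std::function<void(Object*)>& visit) override {
        for (Node* child : children) visit(child);
    }
};

class TraceStore {
public:
    // Traces from `modified` and returns the objects that became persistent,
    // i.e. those that were allocated an OID during this call. A real store
    // would then write them to disk. Passing an object without an OID makes
    // it a new persistent root.
    std::vector<Object*> TraceFrom(Object* modified) {
        assert(modified != nullptr);
        std::vector<Object*> newlyPersistent;
        std::vector<Object*> pending;
        if (modified->oid == 0) {
            modified->oid = nextOid_++;
            newlyPersistent.push_back(modified);
        }
        pending.push_back(modified);
        while (!pending.empty()) {
            Object* current = pending.back();
            pending.pop_back();
            current->VisitObjects([&](Object* target) {
                // Objects that already have an OID were traced earlier, so the
                // search stops there. Assigning the OID before descending also
                // makes cycles terminate.
                if (target != nullptr && target->oid == 0) {
                    target->oid = nextOid_++;
                    newlyPersistent.push_back(target);
                    pending.push_back(target);
                }
            });
        }
        return newlyPersistent;
    }

private:
    std::uint64_t nextOid_ = 1;
};
```

Because the search stops at any object that already has an OID, the cost of a trace in this sketch is proportional to the newly reachable part of the graph rather than to the whole store.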

Dereferencing a pref may cause a persistent object to be "faulted" off disk into memory.
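
The faulting behaviour of a software-swizzled pointer can be sketched as follows. This is a toy model, not the pref template itself: DiskStore, Record and RecordRef are hypothetical names, the "disk" is an in-memory map, and the pointer is written for a single record type rather than as a template.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>

using Oid = std::uint64_t;

struct Record {
    std::string value;
};

// Hypothetical stand-in for a persistent store; the "disk" is just a map.
class DiskStore {
public:
    void Put(Oid oid, std::string bytes) { disk_[oid] = std::move(bytes); }

    // Returns the in-memory copy of the object, loading it on first use.
    // Throws std::out_of_range if the OID was never stored.
    Record* Fault(Oid oid) {
        auto it = resident_.find(oid);
        if (it == resident_.end()) {
            ++faults_;
            it = resident_.emplace(
                oid, std::make_unique<Record>(Record{disk_.at(oid)})).first;
        }
        return it->second.get();
    }

    int faults() const { return faults_; }

private:
    std::map<Oid, std::string> disk_;
    std::map<Oid, std::unique_ptr<Record>> resident_;
    int faults_ = 0;
};

// Simplified smart pointer that can refer to a transient or persistent record.
class RecordRef {
public:
    // Refers to a transient object that is already in memory.
    explicit RecordRef(Record* transient) : ptr_(transient) {}
    // Refers to a persistent object by OID; nothing is loaded yet.
    RecordRef(DiskStore& store, Oid oid) : store_(&store), oid_(oid) {}

    Record* operator->() {
        if (ptr_ == nullptr) {
            assert(store_ != nullptr);
            // Software swizzling: the check happens in code on dereference,
            // and the OID is replaced by a memory address on first use.
            ptr_ = store_->Fault(oid_);
        }
        return ptr_;
    }

private:
    Record* ptr_ = nullptr;
    DiskStore* store_ = nullptr;
    Oid oid_ = 0;
};
```

In this model the first dereference of a persistent RecordRef loads the record and later dereferences reuse the cached address, while a RecordRef to a transient record never touches the store.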