Log Structured Store (LSS)

The CEDA database management system is built on top of the CEDA Log Structured Store. This provides low level persistence for arbitrarily sized binary objects indexed with 64 bit identifiers.

The LSS has all the features required for an industrial strength storage system. In particular it fully supports transactions and guarantees their atomicity in the face of power failures. It supports recovery, backup and hot standby, and is self cleaning to avoid fragmentation.

Multiplexing of the output allows for hot standby and backup consistent with 24 x 7 operation.

See "Under the hood" for some information about the LSS implementation.

Summary of features

LSS is a persistent heap

The LSS could be regarded as a key-value store where the keys are 64 bit integers called Serial Element Identifiers (seids) and the values are arbitrary length octet ("byte") strings called serial elements.

Another possible analogy is to consider the LSS to be a persistent heap. The keys are like "pointers" to "memory buffers" that persist on disk.

The serial elements are like byte streams, in that they are both read and written in the manner of an I/O stream. In fact a single serial element can be much larger than the available physical memory, and yet can be read or written efficiently as a stream of octets.

The LSS doesn't care about the content of each serial element. As far as the LSS is concerned it is just a sequence of octets. The number of octets in a single serial element can be anything from 0 to many terabytes.

All the serial elements persist in a single file on the file system of the host operating system (or alternatively a raw disk partition).

Typically serial elements reference other serial elements by storing seid values in the byte streams. Circular references are possible.
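
The key-value view and the use of seids as persistent "pointers" can be illustrated with a small in-memory model. This is only a sketch: the names Seid, SerialElement, appendSeid, readSeid and makeCycle are hypothetical and are not taken from the CEDA headers, and the little-endian encoding of a seid is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical names for illustration only; this is not the CEDA API.
using Seid = std::uint64_t;                       // 64 bit key
using SerialElement = std::vector<std::uint8_t>;  // arbitrary length octet string

// Append a seid to a serial element as 8 octets (little-endian is an assumption).
void appendSeid(SerialElement& e, Seid s) {
    for (int i = 0; i < 8; ++i)
        e.push_back(static_cast<std::uint8_t>(s >> (8 * i)));
}

// Read a seid back from the 8 octets starting at offset pos.
Seid readSeid(const SerialElement& e, std::size_t pos) {
    Seid s = 0;
    for (int i = 0; i < 8; ++i)
        s |= static_cast<Seid>(e[pos + i]) << (8 * i);
    return s;
}

// Build a circular reference: element 1 refers to element 2 and vice versa.
std::map<Seid, SerialElement> makeCycle() {
    std::map<Seid, SerialElement> store;
    appendSeid(store[1], 2);
    appendSeid(store[2], 1);
    return store;
}
```

Following the seid stored in element 1 leads to element 2, whose stored seid leads back to element 1. The store itself never interprets these octets.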

Seids are virtual addresses

Seids are "virtual" or "logical" addresses, not physical addresses, allowing serial elements to be written to a new physical location on disk while still being identified by the same 64 bit seid value. When a serial element is rewritten to a new location (typically with a different number and sequence of octets) the old version can be recycled.
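
One way to picture this indirection is a table that maps each seid to the current physical location of its serial element in an append-only log. The sketch below is a toy model under that assumption. ToyLog, Location and their members are hypothetical names, and the actual LSS data structures may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Physical position of a serial element within the toy log.
struct Location { std::size_t offset; std::size_t length; };

// Toy model (not the CEDA implementation): an append-only log plus a
// table translating logical seids into physical locations.
class ToyLog {
public:
    // Write (create or replace) a serial element at the end of the log.
    void write(std::uint64_t seid, const std::string& octets) {
        auto it = table_.find(seid);
        if (it != table_.end())
            garbage_ += it->second.length;  // the old copy can now be recycled
        table_[seid] = Location{log_.size(), octets.size()};
        log_ += octets;
    }
    // Read the current version by translating the seid to its location.
    std::string read(std::uint64_t seid) const {
        const Location& loc = table_.at(seid);
        return log_.substr(loc.offset, loc.length);
    }
    std::size_t offsetOf(std::uint64_t seid) const { return table_.at(seid).offset; }
    std::size_t garbageBytes() const { return garbage_; }
private:
    std::string log_;                                    // stands in for the disk file
    std::unordered_map<std::uint64_t, Location> table_;  // seid -> physical location
    std::size_t garbage_ = 0;                            // octets held by superseded copies
};
```

Rewriting seid 7 with different content moves it to a new offset, yet readers continue to use the same seid. The octets of the superseded copy are merely counted here, whereas a real store would reclaim them.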

Three basic operations

There are just three basic operations on the LSS, to write, delete, and read serial elements.

This is analogous to CRUD (create, read, update, delete) operations, except that writing a serial element encompasses both creating a new serial element and updating an existing one.

To update a serial element is to rewrite it from scratch. There is no concept of incremental updates to an existing serial element. Serial elements are always written and read in their entirety (i.e. from start to finish).
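
The shape of these three operations can be sketched with a toy in-memory class. The names ToyStore, write, remove and read are hypothetical and do not come from ILogStructuredStore.h. The sketch is only meant to show that a write either creates or wholly replaces a serial element.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Toy model of the three basic operations (hypothetical, not the CEDA API).
class ToyStore {
public:
    // Write: creates the serial element, or replaces it in its entirety.
    void write(std::uint64_t seid, std::string octets) {
        elements_[seid] = std::move(octets);
    }
    // Delete: returns true if a serial element with this seid existed.
    bool remove(std::uint64_t seid) {
        return elements_.erase(seid) == 1;
    }
    // Read: copies the whole serial element into out; false if absent.
    bool read(std::uint64_t seid, std::string& out) const {
        auto it = elements_.find(seid);
        if (it == elements_.end()) return false;
        out = it->second;
        return true;
    }
private:
    std::map<std::uint64_t, std::string> elements_;
};
```

There is deliberately no operation to patch part of an element: a second write to the same seid discards the previous content rather than merging with it.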

Transactions

All write and delete serial element operations are applied in the context of an explicitly declared transaction on the LSS.

Transactions are opened then later closed. Although any number of threads can perform transactions, they are applied sequentially. A thread that tries to open a transaction while another thread has one open blocks until that other thread closes its transaction.

Transactions define atomic changes to the LSS. If the system crashes, recovery always restores the LSS to a state where either all of a transaction has been applied or none of it. Furthermore, if a transaction is applied then all transactions that preceded it will have been applied as well.
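
The all-or-nothing and one-at-a-time behaviour can be modelled in memory as follows. This is a sketch under stated assumptions, not the LSS mechanism: ToyTxStore and Transaction are hypothetical names, a mutex stands in for whatever the LSS uses to serialise transactions, and an abandoned transaction stands in for a crash. It says nothing about how durability on disk is achieved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <mutex>
#include <string>

// Toy model (hypothetical, not the CEDA API) of serialised, atomic transactions.
class ToyTxStore {
public:
    class Transaction {
    public:
        // Opening blocks while another thread holds an open transaction.
        explicit Transaction(ToyTxStore& s) : store_(s), lock_(s.mutex_) {}
        // Writes are only staged; the store is untouched until commit().
        void write(std::uint64_t seid, const std::string& octets) {
            staged_[seid] = octets;
        }
        // Apply every staged write to the store.
        void commit() {
            for (const auto& kv : staged_) store_.elements_[kv.first] = kv.second;
            staged_.clear();
        }
        // Destruction releases the lock; uncommitted staged writes are dropped.
    private:
        ToyTxStore& store_;
        std::unique_lock<std::mutex> lock_;
        std::map<std::uint64_t, std::string> staged_;
    };
    // Number of stored elements; call only when no transaction is open.
    std::size_t size() const { return elements_.size(); }
private:
    std::mutex mutex_;
    std::map<std::uint64_t, std::string> elements_;
};
```

A transaction that goes out of scope without commit() leaves the store exactly as it was, which mirrors the "none at all" outcome described above. A committed transaction makes all of its writes visible together.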

API

The LSS API is defined in the header file ILogStructuredStore.h.

Public header files

cxLss
├── ILogStructuredStore.h
├── IRAS.h
├── LssExceptions.h
├── LssSettings.h
├── Seid.h
├── Seid2.h
└── cxLss.h