Seid allocation

Serial element Ids

Each serial element is uniquely identified by a 64-bit number called a Seid. A serial element keeps the same Seid for its entire lifetime.

Each time a serial element is written to the LSS, it must be rewritten in its entirety - even if only a small part of its content changes. It will in fact be written to a completely new location within the store - i.e. at the "end of the log".

If that seems particularly wasteful then the serial elements should be finer grained. Although it is intuitively better to have quite small serial elements to minimise the number of bytes to be written to disk when changes are made, it can often be better to use fairly coarse serial elements because that is an easy way to ensure clustering on disk is reasonably good. It also reduces the various space and time overheads associated with a serial element, such as the need to index its physical location. Note finally that the product of transfer rate and seek time for a modern hard disk is quite large, of the order of 512k bytes, so it is not efficient to write lots of small objects to disk if they can become poorly clustered over time.
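The following sketch works through the arithmetic behind that last point. The figures of 50 MB/s sustained transfer and 10 ms per seek are assumptions chosen only for illustration, and the function names are hypothetical; they are not part of the LSS.

```cpp
#include <cassert>
#include <cstdint>

// Number of bytes the disk could have transferred in the time taken by one
// seek, i.e. the product of transfer rate and seek time.
constexpr std::uint64_t BytesPerSeek(std::uint64_t bytesPerSecond,
                                     std::uint64_t seekMicroseconds) {
    return bytesPerSecond * seekMicroseconds / 1000000ULL;
}

// Fraction of the total time spent actually transferring data when a write
// of bytesWritten bytes must be preceded by one seek.
constexpr double TransferEfficiency(std::uint64_t bytesWritten,
                                    std::uint64_t bytesPerSeek) {
    return static_cast<double>(bytesWritten) /
           static_cast<double>(bytesWritten + bytesPerSeek);
}

// Assumed figures: 50 MB/s transfer rate and 10 ms seek time.
constexpr std::uint64_t kAssumedBytesPerSeek =
    BytesPerSeek(50000000ULL, 10000ULL);
```

With these assumed figures a seek costs as much time as transferring 500,000 bytes. A 4096 byte write that needs its own seek therefore spends under 1% of its time transferring data, whereas a write of 500,000 bytes spends half of it.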

When a new serial element is to be written to the LSS for the first time, it is necessary to allocate a fresh Seid for it. As the discussion of clustering above suggests, the programmer needs some way of partitioning the serial elements into groups that will be clustered on disk. Therefore Seid allocation involves two separate calls.


// Create a new Seid space.  This is the upper 32 bits of the Seids shared by a group of 
// related serial elements that should be clustered together on disk.
// Never returns 0.
virtual SeidHigh CreateSeidSpace() = 0;

// Allocate a new, unused Seid within the Seid space associated with the given SeidHigh.
// Never returns a null Seid.
virtual Seid AllocateSeid(SeidHigh seidHigh) = 0;

CreateSeidSpace() creates the upper 32 bits of a Seid, giving a "Seid space" that allows for up to 4 billion "related" serial elements. Each serial element within this cluster is allocated a Seid with a call to AllocateSeid().

In practice there shouldn't be anywhere near 4 billion serial elements in a single "cluster" because that would defeat the whole idea! The "sweet spot" for a cluster is of the order of a few megabytes.

It is permissible to make these calls outside of an LSS transaction! Any number of threads can concurrently allocate Seids because the above Seid allocation functions are thread-safe.
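As a minimal sketch of how the two calls fit together, the following in-memory stand-in mimics the interface. ToySeidAllocator and HighPart are hypothetical names, the typedefs for SeidHigh and Seid are assumed representations, and the null Seid is assumed to be 0. The real LSS allocator may choose values quite differently.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <unordered_map>

// Assumed representations: the text only states that a Seid is a 64-bit
// number whose upper 32 bits identify its Seid space.
using SeidHigh = std::uint32_t;
using Seid = std::uint64_t;

// Hypothetical in-memory stand-in for the allocator, not the real LSS.
class ToySeidAllocator {
public:
    // Creates a new Seid space. Never returns 0 because numbering starts at 1.
    SeidHigh CreateSeidSpace() {
        std::lock_guard<std::mutex> lock(mutex_);
        SeidHigh high = ++lastHigh_;
        nextLow_[high] = 1;
        return high;
    }

    // Allocates an unused Seid in the given space. Throws std::out_of_range
    // if the space was never created. The result is never 0 (the assumed
    // null Seid) because the upper half is always non-zero.
    Seid AllocateSeid(SeidHigh seidHigh) {
        std::lock_guard<std::mutex> lock(mutex_);
        std::uint32_t low = nextLow_.at(seidHigh)++;
        return (static_cast<Seid>(seidHigh) << 32) | low;
    }

private:
    std::mutex mutex_;  // makes both calls safe to use from several threads
    SeidHigh lastHigh_ = 0;
    std::unordered_map<SeidHigh, std::uint32_t> nextLow_;
};

// Extracts the Seid space (upper 32 bits) from a Seid.
inline SeidHigh HighPart(Seid seid) {
    return static_cast<SeidHigh>(seid >> 32);
}
```

A caller would typically call CreateSeidSpace() once per group of related serial elements, keep the returned SeidHigh, and then call AllocateSeid() with it for each new element in that group.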

Allocation of affiliate Seids

The following is an alternative Seid allocation function:


   bool AllocateAffiliateSeid(Seid& seid);

This is useful for very large sets of serial elements where it is difficult to know a priori how to divide the serial elements into separate Seid spaces.

Before this function can be used to allocate Seids, it is first necessary to "bootstrap" by calling CreateSeidSpace() to allocate a Seid space, then AllocateSeid() to allocate a root serial element in the space. Then, instead of calling AllocateSeid() to allocate additional serial elements, it can be preferable to call AllocateAffiliateSeid(). The Seid passed into the function represents an "affiliate" Seid to which the new Seid returned by the function will be clustered.

For example, the affiliate may be a parent node in a tree of nodes. AllocateAffiliateSeid() does a good job of allocating Seids for trees of nodes which grow over time from any position by adding child nodes.
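The calling pattern for a tree of nodes might look like the sketch below. ToyStore and AllocateChildSeid are hypothetical, and the typedefs are assumed representations. The toy simply allocates the new Seid in the same Seid space as its affiliate and returns false for an unknown space; the meaning of the real return value and the real clustering policy are not described here, so both are assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

using SeidHigh = std::uint32_t;  // assumed representation
using Seid = std::uint64_t;      // assumed representation

// Hypothetical single-threaded toy store, not the real LSS.
class ToyStore {
public:
    SeidHigh CreateSeidSpace() {
        SeidHigh high = ++lastHigh_;
        nextLow_[high] = 1;
        return high;
    }

    Seid AllocateSeid(SeidHigh seidHigh) {
        return (static_cast<Seid>(seidHigh) << 32) | nextLow_.at(seidHigh)++;
    }

    // On entry seid holds the affiliate; on success it is overwritten with
    // the newly allocated Seid. The toy policy is to reuse the affiliate's
    // Seid space. Returns false (assumed failure signal) if that space is
    // unknown, leaving seid unchanged.
    bool AllocateAffiliateSeid(Seid& seid) {
        SeidHigh high = static_cast<SeidHigh>(seid >> 32);
        auto it = nextLow_.find(high);
        if (it == nextLow_.end()) return false;
        seid = (static_cast<Seid>(high) << 32) | it->second++;
        return true;
    }

private:
    SeidHigh lastHigh_ = 0;
    std::unordered_map<SeidHigh, std::uint32_t> nextLow_;
};

// Allocates a Seid for a new child of the node identified by parentSeid.
// Returns 0 (the assumed null Seid) if allocation fails.
Seid AllocateChildSeid(ToyStore& store, Seid parentSeid) {
    Seid seid = parentSeid;  // pass the affiliate in
    return store.AllocateAffiliateSeid(seid) ? seid : 0;  // read the new Seid out
}
```

Because the argument is both input and output, it is worth copying the parent's Seid into a local variable first, as AllocateChildSeid() does, so that the parent's own Seid is not overwritten.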

Bootstrapping a store

It is common for a serial element to store (within its byte stream content) the Seids of other serial elements. For example, these could represent the "children" in a whole-part hierarchy of objects.

Typically an application using the LSS will need to write some sort of "root" registry or directory object to the store with a known Seid. This is the starting point for accessing all other objects in the store.

ROOT_SEID is the Seid for this root serial element.

Just after creating a new LSS, it is guaranteed that the first call to CreateSeidSpace() will return the high 32 bits of ROOT_SEID. It is then guaranteed that the first call to AllocateSeid() (passing in the high 32 bits of ROOT_SEID) will return ROOT_SEID.
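The bootstrap sequence can be sketched as follows. ToyStore and BootstrapRoot are hypothetical, the typedefs are assumed representations, and the numeric value given to ROOT_SEID here is an assumption made so that the toy is self-consistent; the real constant is defined by the LSS.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

using SeidHigh = std::uint32_t;  // assumed representation
using Seid = std::uint64_t;      // assumed representation

// Assumed value, for illustration only.
constexpr Seid ROOT_SEID = (Seid{1} << 32) | 1;

// Hypothetical single-threaded toy store, not the real LSS.
class ToyStore {
public:
    SeidHigh CreateSeidSpace() {
        SeidHigh high = ++lastHigh_;
        nextLow_[high] = 1;
        return high;
    }

    Seid AllocateSeid(SeidHigh seidHigh) {
        return (static_cast<Seid>(seidHigh) << 32) | nextLow_.at(seidHigh)++;
    }

private:
    SeidHigh lastHigh_ = 0;
    std::unordered_map<SeidHigh, std::uint32_t> nextLow_;
};

// Performs the first two allocation calls on a newly created store and
// returns the resulting Seid, which the guarantees above say is ROOT_SEID.
// Intended to be called once, before any other allocation.
Seid BootstrapRoot(ToyStore& store) {
    SeidHigh rootSpace = store.CreateSeidSpace();  // high 32 bits of ROOT_SEID
    return store.AllocateSeid(rootSpace);          // ROOT_SEID itself
}
```

An application would write its root registry or directory object under the returned Seid, and on later runs read it back using ROOT_SEID directly without allocating anything.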