Library and executable versioning

A target means a project which is either a library or executable written in the Xc++ language.

A target is uniquely identified by its target name. This is the logical path to the project directory of the target.

E.g. "Ceda/cxObject" is a target name. Note that target names make use of directory structures. A target name shouldn't include version numbers.

A target may define any number of $model data types. Schema evolution refers to the support for adding and removing members of a given $model across different software releases of the target.

Release Sequence Numbers (RSNs) are used to uniquely identify each formal release of a target. The purpose is to relate persistent object schema changes back to software releases. RSNs are zero based (i.e. they start at 0).

Within a CEDA database the IPersistable objects undergo schema evolution lazily. An object is only updated to the latest schema when it is modified and written back to disk. Within a CEDA database there can be objects recorded with an old schema. After a software update involving schema changes a CEDA database can always be opened and used immediately with no down-time even though there may be billions of objects that now have an out-of-date schema.
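The lazy behaviour described above can be illustrated with a toy model. This is only a sketch in plain C++ under stated assumptions: ToyStore and its members are hypothetical names, not part of CEDA, and the record payload is omitted so that only the recorded RSN is tracked.

```cpp
#include <cassert>
#include <map>

// Hypothetical toy store: each record remembers the RSN it was written with.
class ToyStore {
public:
    explicit ToyStore(int currentRsn) : currentRsn_(currentRsn) {}

    // Simulates a record that is already on disk, written by some release.
    void seed(int id, int rsn)
    {
        assert(rsn >= 0 && rsn <= currentRsn_);
        rsnOnDisk_[id] = rsn;
    }

    // Reading leaves the on-disk schema untouched; returns the recorded RSN.
    int read(int id) const { return rsnOnDisk_.at(id); }

    // Modifying writes the record back using the latest schema.
    void modify(int id) { rsnOnDisk_.at(id) = currentRsn_; }

    // Number of records still stored with an older schema.
    int countOutOfDate() const
    {
        int n = 0;
        for (const auto& kv : rsnOnDisk_)
            if (kv.second < currentRsn_) ++n;
        return n;
    }

private:
    int currentRsn_;
    std::map<int, int> rsnOnDisk_;
};
```

In this model a software update only changes currentRsn_; no stored record is touched until modify is called on it, which is why the store is usable immediately after the update.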

Example

As an example consider that a library which defines a $model named Point has been released with the following schema changes:

RSN	Point structure	Comment
0		Not defined in this release
1	`struct Point { float64 x; float64 y; };`	Initial schema has 2D Cartesian coordinates
2 and 3	`struct Point { float64 x; float64 y; float64 z; };`	A z coordinate was added
4	`struct Point { float64 x; float64 y; };`	The z coordinate was dropped
5,6 and 7	`struct Point { float64 r; float64 theta; };`	Changed to polar coordinates

The schema evolution of the Point model is specified as follows:


$model+ Point
{
    float64 r;
    float64 theta;

    $dropped
    {
        float64 x;
        float64 y;
        float64 z;
    }

    $schema
    {
        1:  +x +y              // Release #1    : (x,y)
        2:  +z                 // Release #2,3  : (x,y,z)
        4:  -z                 // Release #4    : (x,y)
        5:  -x -y +r +theta    // Release #5-   : (r,theta)
    }

    $evolve
    {
        r(x,y)     { r = sqrt(x*x + y*y); }
        theta(x,y) { theta = atan2(y,x); }
    }
};
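
To make the $schema and $evolve clauses concrete, the following plain C++ sketch (not Xc++, and not a description of how CEDA implements them) replays the +member and -member entries to find which members exist at a given RSN, and applies the same conversion as the $evolve clause. The names membersAtRsn, PolarPoint and evolveFromCartesian are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <set>
#include <string>

// Hypothetical helper: the set of Point members present at a given RSN,
// obtained by replaying the entries of the $schema clause in order.
std::set<std::string> membersAtRsn(int rsn)
{
    assert(rsn >= 0);
    std::set<std::string> m;
    if (rsn >= 1) { m.insert("x"); m.insert("y"); }   // 1: +x +y
    if (rsn >= 2) { m.insert("z"); }                  // 2: +z
    if (rsn >= 4) { m.erase("z"); }                   // 4: -z
    if (rsn >= 5) {                                   // 5: -x -y +r +theta
        m.erase("x"); m.erase("y");
        m.insert("r"); m.insert("theta");
    }
    return m;
}

struct PolarPoint { double r; double theta; };

// Mirrors the $evolve clause: r and theta are computed from the dropped x and y.
PolarPoint evolveFromCartesian(double x, double y)
{
    return PolarPoint{ std::sqrt(x * x + y * y), std::atan2(y, x) };
}
```

For instance, membersAtRsn(3) yields x, y and z, and a Point stored as (3, 4) under release #1 evolves to r = 5.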

Subtargets

It is common for $models of one type to be composed from $models of other types. For example model Triangle is composed from three Point models:


$model+ Point
{
    float32 x,y;
};

$model+ Triangle
{
    Point vertices[3];
};

Note that when a Triangle object is serialised (converted into a stream of bytes) it in turn serialises the models from which it is composed.

In particular an IPersistable object may be composed from many different kinds of models.

The CEDA implementation only records a single RSN in an IPersistable object in order to track its schema and the schema of all models within it. This holds even though many different types of models, possibly defined in multiple targets, may be serialised when the object is serialised.

The approach allows a schema number to be stored only at the topmost level of a persistent object, and not in each $model from which it is composed. This can be a significant advantage. For example, a PointSet object may store a million Point $model instances and we avoid the need to store a million schema numbers.
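
The saving can be estimated with simple arithmetic. The sketch below assumes a Point holds two float64 members (16 bytes) and that a schema number would occupy 4 bytes; the 4-byte width and the function names are assumptions made for illustration, not the CEDA on-disk format.

```cpp
#include <cstddef>
#include <cstdint>

// Two float64 members per Point (e.g. r and theta).
constexpr std::size_t kPointBytes = 2 * sizeof(double);

// Assumed width of a stored schema number; hypothetical, for illustration only.
constexpr std::size_t kRsnBytes = sizeof(std::int32_t);

// One RSN at the top level of the persistent object, then n packed Points.
constexpr std::size_t bytesWithTopLevelRsn(std::size_t n)
{
    return kRsnBytes + n * kPointBytes;
}

// Alternative design: every Point carries its own schema number.
constexpr std::size_t bytesWithPerModelRsn(std::size_t n)
{
    return n * (kRsnBytes + kPointBytes);
}
```

Under these assumptions a million points take 16,000,004 bytes with a single top-level RSN, against 20,000,000 bytes if each Point stored its own schema number.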

To allow this it is necessary to map release numbers between targets where there is a compile time linking dependency.

We say target C is a subtarget of target P if C is either directly or indirectly a sub-project of P. This is regardless of whether the libraries are shared or static libraries.

The subtarget relation is irreflexive, anti-symmetric and transitive.

If C is a subtarget of P then the serialisation of a $model defined in P may contain the serialisation of a $model defined in C.

We need to be very clear on what a compile time linking dependency means, so we first quickly review the different linking options. Paraphrasing from MSDN, there are three cases:

1. Static linking: the linker gets the referenced functions from the static link library and places them directly with the application code to build a self-contained executable.
2. Dynamic linking: a dynamic-link library (DLL) is an executable file that acts as a shared library of functions. Dynamic linking provides a way for a process to call a function that is not part of its executable code.
   (a) DLL with implicit linking: when the source code for the calling executable is compiled or assembled, the DLL function call generates an external function reference in the object code. To resolve this external reference, the application must link with the import library (.lib file) provided by the maker of the DLL.
   (b) DLL with explicit linking: this eliminates the need to link the application with an import library.

We consider 1 and 2(a) to represent a compile time linking dependency, whereas 2(b) does not.

Declaring the releases of a target

Each target must use the macro mRegisterTarget() to define the name of the target, the names of all its sub-targets, and a table of all the releases. The latter relates release numbers of the target to release numbers of its sub-targets. Here is an example


mRegisterTarget(
    // Name of this target
    "Nasa/SpaceShuttle",

    // Names of the sub-targets
    {
        "Nasa/CommsModule",
        "Nasa/Scrubber",
    },

    // Releases
    {
        { "Release 1.0",   {2012,4,18},  0,2   },      // Release #0
        { "Release 1.1",   {2015,7,11},  3,2   },      // Release #1
        { "Release 1.2",   {2015,8,24},  3,2   },      // Release #2
    });

This declares the target to be named Nasa/SpaceShuttle. There are two subtargets, called Nasa/CommsModule and Nasa/Scrubber.

Each target has its own Release Sequence Number (RSN) to uniquely identify each formal release of the target. RSNs start at 0.

There have been three releases of Nasa/SpaceShuttle, as follows:

SpaceShuttle RSN	Date	CommsModule RSN	Scrubber RSN
0	18 Apr 2012	0	2
1	11 Jul 2015	3	2
2	24 Aug 2015	3	2

For example, the first release (i.e. release #0) of Nasa/SpaceShuttle was on 18th April 2012 and was compiled against release #0 of Nasa/CommsModule and release #2 of Nasa/Scrubber.

CEDA uses this information to correctly deserialise a given model based on the RSN of the target that stored the containing IPersistable object. The CPU overhead is very small: it is basically the cost of indexing into an array on top of the expected conditional logic used during deserialisation.
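
The array lookup mentioned above can be sketched in plain C++ using the Nasa/SpaceShuttle release table. The enum, the table name and the function subTargetRsn are hypothetical and only illustrate the idea; they are not what mRegisterTarget generates.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Column order follows the declared sub-targets (illustrative only).
enum SubTarget : std::size_t { CommsModule = 0, Scrubber = 1 };

// One row per Nasa/SpaceShuttle release; values copied from the table above.
constexpr std::array<std::array<int, 2>, 3> kSpaceShuttleReleases = {{
    {{0, 2}},   // Release #0
    {{3, 2}},   // Release #1
    {{3, 2}},   // Release #2
}};

// Maps a SpaceShuttle RSN to the RSN of one of its sub-targets.
// The cost is one bounds check (debug builds only) and one array index.
int subTargetRsn(std::size_t shuttleRsn, SubTarget s)
{
    assert(shuttleRsn < kSpaceShuttleReleases.size());
    return kSpaceShuttleReleases[shuttleRsn][s];
}
```

An IPersistable object written by release #1 of Nasa/SpaceShuttle would therefore have its embedded Nasa/CommsModule models read using the schema of CommsModule release #3.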

Declaring the releases of a leaf target

A leaf target means a library or executable which has no static linking dependencies on other libraries.

The macro mRegisterLeafTarget can be invoked exactly once in a cpp file within a leaf target to declare all the releases of that target.

This macro is used for a target defining $models for which there are no subtargets defining $models.

A new release must be declared when any $models within the target undergo schema evolution.

These releases may correspond to a proper subset of the formal releases of the target because changes to functionality without involving schema evolution of a $model do not require a release to be declared in mRegisterLeafTarget.

Example:


mRegisterLeafTarget(
    {
        // version   year  mon  day
        {   "0.1",   {2015,  1,  1}  },      // Release #0
        {   "0.5",   {2017,  3, 23}  },      // Release #1
        {   "1.0",   {2018,  9,  6}  },      // Release #2
    })

Declaring the releases of a non-leaf target

The macro mRegisterTarget is used for a target defining $models for which there are subtargets defining $models.

The declared subtargets are only the direct subtargets. The framework calculates the transitive closure to give both the direct and indirect subtargets.
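
The transitive closure step can be sketched as a simple graph traversal. This is an illustrative plain C++ version, not the framework's code; the DirectMap type and the function allSubtargets are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Maps each target name to the names of its declared (direct) subtargets.
using DirectMap = std::map<std::string, std::vector<std::string>>;

// Returns all direct and indirect subtargets of 'target' by a depth-first walk.
std::set<std::string> allSubtargets(const DirectMap& direct, const std::string& target)
{
    std::set<std::string> seen;
    std::vector<std::string> stack{target};
    while (!stack.empty()) {
        std::string t = stack.back();
        stack.pop_back();
        auto it = direct.find(t);
        if (it == direct.end()) continue;          // leaf target: nothing declared
        for (const auto& s : it->second)
            if (seen.insert(s).second) stack.push_back(s);
    }
    // The subtarget relation is irreflexive, so a target never contains itself.
    assert(seen.count(target) == 0);
    return seen;
}
```

For example, if a hypothetical target App declares Ceda/cxPersistStore, which in turn declares Ceda/cxObject, then the closure for App contains both.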

It is only necessary to define direct subtargets for libraries which define models which are directly embedded within models of this target. In theory one could link against a library but not be embedding any of its models; there is no need to declare a dependency in such cases.

A new release must be declared when any models within the target or models embedded within the target undergo schema evolution. This means that when a subtarget has a new release it is typically necessary to add a release to this target as well, which produces a domino effect up the dependency chain.

Example:


mRegisterTarget(
    // Names of the sub-targets
    {
        "Ceda/cxModel"
    },

    // Releases
    {
        // version   year mon day    cxModel
        { "0.1",    {2015,  1,  1},     0   },     // Release #0
        { "0.7",    {2015,  2, 24},     4   },     // Release #1
        { "0.23",   {2015,  3, 12},     5   },     // Release #2
        { "1.08",   {2015,  7, 15},     5   },     // Release #3
        { "2.04",   {2015, 11, 30},     6   },     // Release #4
    })

Implementation details

Targets

Each library or executable is a target.

C is called a sub-target of P (written C < P) if P statically links directly or indirectly against C. E.g. Ceda/cxObject is a sub-target of Ceda/cxPersistStore. The sub-target relation is irreflexive, anti-symmetric and transitive.

Note that we can distinguish between direct and indirect sub-targets.

Target Index Numbers

cxObject contains a thread-safe singleton registry called the TargetRegistry that can be used by a target within a running process to obtain its unique Target Index Number (TIN), an integer of type ssize_t.

The TargetRegistry records the 1-1 mapping between targetName and TIN. A target will register itself, obtaining its TIN when it is first loaded into the process. A target records its TIN in a static global variable named s_localTin so it is local to the target.
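
A registry of this kind might look roughly as follows. This is a minimal sketch, not the cxObject implementation: the member names registerTarget and nameOf are hypothetical, std::ptrdiff_t stands in for ssize_t to keep the example portable, and allocating TINs in registration order is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <unordered_map>
#include <vector>

class TargetRegistry {
public:
    // Function-local static: initialisation is thread-safe since C++11.
    static TargetRegistry& instance()
    {
        static TargetRegistry r;
        return r;
    }

    // Returns the existing TIN if the name is already registered,
    // otherwise allocates the next free index (assumed allocation policy).
    std::ptrdiff_t registerTarget(const std::string& name)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = tinByName_.find(name);
        if (it != tinByName_.end()) return it->second;
        std::ptrdiff_t tin = static_cast<std::ptrdiff_t>(names_.size());
        names_.push_back(name);
        tinByName_.emplace(name, tin);
        return tin;
    }

    // Inverse direction of the 1-1 mapping.
    std::string nameOf(std::ptrdiff_t tin) const
    {
        std::lock_guard<std::mutex> lock(mutex_);
        assert(tin >= 0 && static_cast<std::size_t>(tin) < names_.size());
        return names_[static_cast<std::size_t>(tin)];
    }

private:
    TargetRegistry() = default;
    mutable std::mutex mutex_;
    std::unordered_map<std::string, std::ptrdiff_t> tinByName_;
    std::vector<std::string> names_;
};
```

In this sketch a target would call registerTarget once at load time and keep the result in its own static variable, in the spirit of s_localTin.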

Definition : Let tin(T) be the TIN assigned to target T (within a given running process).

Release sequence numbers

Each target has its own Release Sequence Number (RSN) to uniquely identify each formal release of the target.

Consider target T4 with sub-targets T1,T2,T3. Then we can build a table for the releases of T4 showing the corresponding RSNs of T1,T2,T3. For example

    T4        Date         T1     T2     T3
    RSN                    RSN    RSN    RSN
    ------------------------------------------
     0    23 Feb 2008      14      3      0
     1    18 Jan 2009      16      3      0
     2    09 Mar 2009      17      3      4
     3    14 Aug 2009      20      3      6

Given target T, let CurrentRSN(T) be the last RSN in the table for T. For example, CurrentRSN(T4) = 3.

Given target T, let Rsns(T) = { r | 0 <= r <= CurrentRSN(T) } be the set of valid RSNs of T.

Let S be a sub-target of T and r in Rsns(T). Let SubTargetRsn(T,r,S) be the RSN of S corresponding to release r of T. For example, SubTargetRsn(T4,2,T3) = 4.

T1,T2,T3 may in turn have sub-targets so they will have similar tables. Note for example that if T1 is a sub-target of T3 then the above column of T1 RSNs would be redundant and would not be directly recorded by the application programmer.

Mapping RSNs

When a target is first loaded into a process, it allocates and populates a multidimensional array used to map its RSNs to all direct or indirect sub-target RSNs.

Given target T, let SubTargetRsnMap(T) be a pointer to an array of pointers to arrays of integers (of type ssize_t), such that

SubTargetRsnMap(T)[r][s] = SubTargetRsn(T,r,S)

where S is a sub-target of T having s = tin(S), and r in Rsns(T).

Note that SubTargetRsnMap(T)[r] is the address of an array of integers.
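
The layout just described, an array of row pointers indexed first by RSN and then by TIN, can be sketched as below. The class name, the kNoRsn sentinel for targets that are not sub-targets, and the use of std::ptrdiff_t in place of ssize_t are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Rsn = std::ptrdiff_t;       // stand-in for ssize_t
constexpr Rsn kNoRsn = -1;        // assumed marker: "not a sub-target"

class SubTargetRsnMap {
public:
    // One row per release of T, one column per TIN known to the process.
    SubTargetRsnMap(std::size_t numReleases, std::size_t numTins)
        : cells_(numReleases, std::vector<Rsn>(numTins, kNoRsn))
    {
        for (auto& row : cells_) rows_.push_back(row.data());
    }

    // Copying would leave the row pointers aimed at the original storage.
    SubTargetRsnMap(const SubTargetRsnMap&) = delete;
    SubTargetRsnMap& operator=(const SubTargetRsnMap&) = delete;

    void set(std::size_t r, std::size_t tin, Rsn rsn)
    {
        assert(r < cells_.size() && tin < cells_[r].size());
        cells_[r][tin] = rsn;
    }

    // Pointer to an array of pointers to arrays of integers: get()[r][s].
    Rsn* const* get() const { return rows_.data(); }

private:
    std::vector<std::vector<Rsn>> cells_;   // owns the integers
    std::vector<Rsn*> rows_;                // rows_[r] = address of row r
};
```

If the map is filled from the T4 table, with whatever TINs the registry happened to assign to T1, T2 and T3, then get()[2][tin(T3)] gives 4, matching SubTargetRsn(T4,2,T3) = 4.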