CEDA Union Types

A union type is defined in terms of a set of underlying types. The set of values of a union type is the union over the sets of values of the underlying types.

Each underlying type is a subtype of the union type in the sense of subtype as subset.

Consider the following definition of a data type named X:


$variant+ X
{
    char8;
    int32; 
    float64;
    string8;
};

A variable of type X can hold a char8, int32, float64 or string8 value. It is 'tagged', meaning that it is possible to determine which of these four types of value a variable of type X currently holds.

X is implemented using a 32-bit tag value plus enough space to hold the largest of the underlying types.

In this case one of the types (string8) has a non-trivial copy constructor, copy assignment operator and destructor. Nevertheless it can appear in the union. For example, when a variable of type X is destroyed, it calls the string8 destructor if it currently holds a string8 value.
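The mechanism described above (a tag plus shared storage, with the destructor dispatching on the tag) can be sketched by hand in standard C++. This is only an illustration, not the code that the $variant definition generates: TaggedValue and all of its members are hypothetical names, std::string stands in for string8, and only two underlying types are shown.

```cpp
#include <cassert>
#include <cstdint>
#include <new>
#include <string>

// Hypothetical hand-written stand-in for a variant { int32; string8; }.
// std::string plays the role of string8; none of these names come from CEDA.
class TaggedValue
{
public:
    TaggedValue(std::int32_t v) : tag_(kInt) { u_.i = v; }
    TaggedValue(const std::string& s) : tag_(kInt) { construct(s); }
    TaggedValue(const TaggedValue& rhs) : tag_(kInt) { copyFrom(rhs); }

    TaggedValue& operator=(const TaggedValue& rhs)
    {
        if (this != &rhs)
        {
            destroy();      // run the string destructor if one is held
            copyFrom(rhs);
        }
        return *this;
    }

    ~TaggedValue() { destroy(); }

    bool holdsString() const { return tag_ == kString; }
    std::int32_t asInt() const { assert(tag_ == kInt); return u_.i; }
    const std::string& asString() const { assert(tag_ == kString); return u_.s; }

private:
    using String = std::string;
    enum Tag : std::uint32_t { kInt, kString };     // the 32-bit tag

    // Shared storage, as large as the largest member.
    union Storage
    {
        std::int32_t i;
        String s;
        Storage() : i(0) {}
        ~Storage() {}       // the owning TaggedValue destroys the active member
    };

    void construct(const String& s)
    {
        new (&u_.s) String(s);      // placement new into the shared storage
        tag_ = kString;
    }

    void copyFrom(const TaggedValue& rhs)
    {
        if (rhs.tag_ == kString) construct(rhs.u_.s);
        else { u_.i = rhs.u_.i; tag_ = kInt; }
    }

    void destroy()
    {
        if (tag_ == kString) u_.s.~String();
        tag_ = kInt;
    }

    Tag tag_;
    Storage u_;
};
```

The tag is what lets the destructor and the copy operations decide whether the string's own destructor or copy constructor must run.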

Default values

X has no defined default value. Therefore there is no default constructor available. The following code generates a compiler error:


// error C2512: 'X' : no appropriate default constructor available
X x;

A default value can be specified at the top of the variant definition. For example, the following sets the default value to an int32 equal to 23:


$variant+ X
{
    default 23;

    char8;
    int32; 
    float64;
    string8;
};

There is now a default constructor, allowing for the following code:


X x;
assert( x.is<int32>() );
assert( x.as<int32>() == 23 );

Specifying values of a union type

Specifying values of a union type is easy because there are implicit conversions from each of the underlying types in the union.

There are overloaded constructors for X for each of the underlying types:


X x1(10);
X x2('z');
X x3(3.14);
X x4("hello world");

Variables of type X can be assigned values of any of the underlying types:


X x1 = 10;
X x2 = 'z';
X x3 = 3.14;
X x4 = string8("hello world");

Testing the type of a value of a union type

The template function:


bool X::is<T>() const

is defined for each underlying type T of the union type. This returns true if the value is of that type.

For example:


if (x.is<int32>())
{
    std::cout << "x is an int32\n";
}
else if (x.is<string8>())
{
    std::cout << "x is a string8\n";
}

Downcasting expressions of union type

The template member functions


const T& X::as<T>() const;
T& X::as<T>();

are defined for each underlying type T of the union type, allowing downcasts to type T that can be used as either an l-value or an r-value. In debug builds an assertion fails if the variant doesn't currently hold a value of exactly that type. In release builds the behaviour is undefined.

Example:


x.as<string8>() += " world";        // as l-value
string8 s = x.as<string8>();        // as r-value
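Because a mismatched downcast is undefined in release builds, the usual pattern is to test the type first and only then downcast. The sketch below shows that pattern with the standard C++17 std::variant as an analogue of X. It is not CEDA code: StdX and describe() are hypothetical names, and std::get differs from as<T>() in that it throws std::bad_variant_access on a mismatch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <variant>

// Standard C++17 analogue of X, for illustration only.
// StdX and describe() are hypothetical names, not part of CEDA.
using StdX = std::variant<char, std::int32_t, double, std::string>;

// Test the held type first, then downcast, so the downcast cannot fail.
std::string describe(const StdX& x)
{
    if (std::holds_alternative<std::int32_t>(x))                        // like x.is<int32>()
        return "int32 " + std::to_string(std::get<std::int32_t>(x));    // like x.as<int32>()
    if (std::holds_alternative<std::string>(x))                         // like x.is<string8>()
        return "string8 " + std::get<std::string>(x);                   // like x.as<string8>()
    return "other";
}
```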

Comparing two expressions of union type

operator==() and operator!=() are always defined on union types. Two values compare equal if and only if they have the same underlying type and the same value of that type.

Example:


X x1 = 65;
X x2 = 66;
X x3 = 'A';
X x4 = int32('A');
assert( x1 != x2 );
assert( x1 != x3 );
assert( x1 == x4 );
assert( x1 == 65 );
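The comparison rule can be spelled out explicitly: compare the types first, and compare the values only when the types match. The following sketch does this for a two-member analogue built on the standard C++17 std::variant, whose own operator== follows the same rule. CharOrInt and equalByTypeThenValue() are hypothetical names used for illustration and are not part of CEDA.

```cpp
#include <cassert>
#include <cstdint>
#include <variant>

// Hypothetical two-member analogue of a variant { char8; int32; }.
using CharOrInt = std::variant<char, std::int32_t>;

// Equal only if both hold the same underlying type and the same value of it.
bool equalByTypeThenValue(const CharOrInt& a, const CharOrInt& b)
{
    if (a.index() != b.index()) return false;       // different underlying types
    if (a.index() == 0) return std::get<char>(a) == std::get<char>(b);
    return std::get<std::int32_t>(a) == std::get<std::int32_t>(b);
}
```

With this rule the character 'A' and the integer 65 compare unequal even though they have the same numeric code, which matches the x1 != x3 assertion above.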

Union type having a void underlying type

This is somewhat illogical, but we cater explicitly for the notion of a "value" that represents a missing value (yes, that is self-contradictory). If one really wants to embrace a notion of NULLs, whatever that means, a convenient syntax is provided.

In the following example, one of the types is 'void'.


$variant X
{
    // Trying to specify a default value causes a compiler error
    //default int16(100);
        
    void;
    int16;
};

A curious feature of doing so is that it defines a default constructor which corresponds to the "void" value, so it is not possible to define a default value with the 'default' keyword.


X x;
assert( x.is<void>() );
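For readers who know the standard library, a rough analogue of a void member is std::monostate in a C++17 std::variant: when it is listed first, a default-constructed variant holds the "missing" state. The sketch below is that analogue only and says nothing about how CEDA implements void. MaybeInt16 and isMissing() are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <variant>

// Hypothetical analogue of a variant { void; int16; }.
// std::monostate stands in for the void member.
using MaybeInt16 = std::variant<std::monostate, std::int16_t>;

// True when no int16 value is held, like x.is<void>().
bool isMissing(const MaybeInt16& v)
{
    return std::holds_alternative<std::monostate>(v);
}
```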

Union types having arrays for the underlying type

Arrays of the form T[SIZE] are mapped into the template Array<T,SIZE>. Consider the following union type:


$variant X
{
    int32; 
    char8[10];
};

The type Array<T,SIZE> must be used for the is<> and as<> template functions:


typedef Array<char8,10> array;
array s;
for (int i=0 ; i < 10 ; ++i) s[i] = char8('a' + i);   // s holds 'a'..'j'
X x(s);                 // this X has no default value, so construct from s
std::cout << "x = " << x << '\n';
assert( x.is<array>() );
assert( x.as<array>()[1] == 'b' );

Union types having underlying types which are heap allocated

Any of the underlying types within the union can be specified as "heap allocated" by preceding the type with an asterisk.

This causes the implementation to only record a pointer to the value within the space shared by all the types in the variant.

However, this is really just an implementation detail. Semantically the variant still records a value of one of the types. All the implicit conversions are supported and the is<>() and as<>() template functions are defined without regard for the indirection used in the physical implementation.

One reason for introducing an indirection is to deal with a significant mismatch in the sizes of the underlying types.


$variant X
{
    int32;
    *char8[1000];   // heap allocated so sizeof(X) = 8, which makes storing
                    // an int32 much less wasteful.
};
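The effect of the indirection on the size of the type can be seen with a standard C++17 analogue. This is a sketch only: InlineX, IndirectX, Buffer and makeIndirect() are hypothetical names, the exact sizes depend on the platform, and unlike the CEDA variant the pointer is visible to the caller here and the type is move-only.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <memory>
#include <variant>

using Buffer = std::array<char, 1000>;

// Inline: every value reserves room for the 1000-byte array.
using InlineX = std::variant<std::int32_t, Buffer>;

// Indirect: only an owning pointer shares storage with the int32.
using IndirectX = std::variant<std::int32_t, std::unique_ptr<Buffer>>;

static_assert(sizeof(InlineX) >= sizeof(Buffer), "inline storage holds the whole array");
static_assert(sizeof(IndirectX) < sizeof(InlineX), "indirection shrinks the variant");

// Copies the buffer to the heap and keeps only the pointer in the variant.
IndirectX makeIndirect(const Buffer& b)
{
    return IndirectX(std::make_unique<Buffer>(b));
}
```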

The following is impractical (there is no good reason to heap allocate the float64):


$variant X
{
    default 0;
    int32; 
    *float64;       // Asterisk means the float64 is heap allocated
};

The following code illustrates how it can be used:


xvector<X> L;
L.push_back( 10 );
L.push_back( 3.14 );
assert( L[0].is<int32>());
assert(!L[0].is<float64>());
assert(!L[1].is<int32>());
assert( L[1].is<float64>());

Another example

See the Shape example for another example of a union type, and for details about how to define polymorphic functions on union types.