A union type is defined in terms of a set of underlying types. The set of values of a union type is the union over the sets of values of the underlying types.
Each underlying type is a subtype of the union type in the sense of subtype as subset.
Consider the following definition of a data type named X:
$variant+ X
{
char8;
int32;
float64;
string8;
};
A variable of type X can hold a char8, int32, float64 or string8 value. It is 'tagged', meaning that it is possible to determine which of these four types of value a variable of type X currently holds.
X is implemented using a 32-bit tag value plus sufficient space to hold the largest of the underlying types.
In this case one of the types (string8) has non-trivial copy constructor, copy assignment and destructor methods. Nevertheless it can appear in the union. For example, when a variable of type X destructs, it will call the string destructor if it holds a string value.
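The tag-plus-shared-storage layout described above can be sketched by hand in standard C++. This is only an illustration under stated assumptions, not the code the variant facility generates: the class name TaggedValue, its accessors, and the use of std::string in place of string8 are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <new>
#include <string>
#include <utility>

// Hypothetical hand-written stand-in for a variant over int32 and a string.
// std::string plays the role of string8 here; that mapping is an assumption.
class TaggedValue {
public:
    enum Tag : std::uint32_t { kInt32, kString };  // 32-bit tag

    TaggedValue(std::int32_t v) : tag_(kInt32) { storage_.i = v; }
    TaggedValue(std::string v) : tag_(kString) {
        new (&storage_.s) std::string(std::move(v));  // construct in shared space
    }
    TaggedValue(const TaggedValue& other) : tag_(other.tag_) {
        if (tag_ == kString) new (&storage_.s) std::string(other.storage_.s);
        else storage_.i = other.storage_.i;
    }
    TaggedValue& operator=(const TaggedValue& other) {
        if (this == &other) return *this;
        if (other.tag_ == kString) {
            std::string tmp(other.storage_.s);  // copy first: this may throw
            destroy();
            new (&storage_.s) std::string(std::move(tmp));
        } else {
            destroy();
            storage_.i = other.storage_.i;
        }
        tag_ = other.tag_;
        return *this;
    }
    ~TaggedValue() { destroy(); }

    Tag tag() const { return tag_; }
    std::int32_t int32Value() const { assert(tag_ == kInt32); return storage_.i; }
    const std::string& stringValue() const { assert(tag_ == kString); return storage_.s; }

private:
    // Only the string alternative needs its destructor run explicitly.
    void destroy() { if (tag_ == kString) std::destroy_at(&storage_.s); }

    union Storage {
        std::int32_t i;
        std::string s;
        Storage() {}   // members are constructed by TaggedValue
        ~Storage() {}  // members are destroyed by TaggedValue
    };
    Tag tag_;
    Storage storage_;
};
```

The destructor and copy operations consult the tag before touching the storage, which is what allows a type with non-trivial special member functions to share space with trivial ones.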
X has no defined default value. Therefore there is no default constructor available. The following code generates a compiler error:
// error C2512: 'X' : no appropriate default constructor available
X x;
A default value can be specified at the top of the variant definition. For example the following sets the default value to be an int32 equal to 23.
$variant+ X
{
default 23;
char8;
int32;
float64;
string8;
};
There is now a default constructor, allowing for the following code:
X x;
assert( x.is<int32>() );
assert( x.as<int32>() == 23 );
Specifying values of a union type is easy because there are implicit conversions from each of the underlying types in the union.
There are overloaded constructors for X for each of the underlying types:
X x1(10);
X x2('z');
X x3(3.14);
X x4("hello world");
Variables of type X can be assigned values of any of the underlying types:
X x1 = 10;
X x2 = 'z';
X x3 = 3.14;
X x4 = string8("hello world");
The template member function
bool X::is<T>() const
is defined for each underlying type T of the union type. It returns true if the variable currently holds a value of type T.
For example:
if (x.is<int32>())
{
std::cout << "x is an int32\n";
}
else if (x.is<string8>())
{
std::cout << "x is a string8\n";
}
The template member functions
const T& X::as<T>() const;
T& X::as<T>();
are defined for each underlying type T of the union type, allowing downcasts to type T as both an l-value and an r-value. In debug builds an assertion fails if the variant does not currently hold a value of exactly that type. In release builds the behaviour is undefined.
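For readers more familiar with the standard library, the behaviour of is<T>() and as<T>() can be approximated on top of C++17 std::variant. The wrapper below is a rough sketch and not the generated implementation: the class name Value is hypothetical, and std::string stands in for string8.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <variant>

// Hypothetical wrapper; only the member names is<T>() and as<T>() are taken
// from the text above.
class Value {
public:
    Value(std::int32_t v) : data_(v) {}
    Value(double v) : data_(v) {}
    Value(std::string v) : data_(std::move(v)) {}

    // True if the value currently held has exactly type T.
    template <typename T>
    bool is() const { return std::holds_alternative<T>(data_); }

    // Checked downcast: asserts in debug builds. With NDEBUG defined the
    // assert vanishes and a mismatch dereferences a null pointer, which is
    // undefined behaviour.
    template <typename T>
    T& as() {
        T* p = std::get_if<T>(&data_);
        assert(p != nullptr && "variant does not hold the requested type");
        return *p;
    }
    template <typename T>
    const T& as() const {
        const T* p = std::get_if<T>(&data_);
        assert(p != nullptr && "variant does not hold the requested type");
        return *p;
    }

private:
    std::variant<std::int32_t, double, std::string> data_;
};
```

Returning a reference rather than a copy is what lets as<T>() appear on the left of an assignment or a compound assignment such as +=.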
Example:
x.as<string8>() += " world"; // as l-value
string8 s = x.as<string8>(); // as r-value
operator==() and operator!=() are always defined on union types. Two values compare equal if and only if they have the same underlying type and the same value of that type.
Example:
X x1 = 65;
X x2 = 66;
X x3 = 'A';
X x4 = int32('A');
assert( x1 != x2 );
assert( x1 != x3 );
assert( x1 == x4 );
assert( x1 == 65 );
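The same comparison rule (same alternative first, then same value) is the one used by C++17 std::variant, so the example can be mirrored in standard C++. This is an analogy only: the alias V and the helper name are hypothetical, and std::variant has no comparison against a bare underlying value, so the right-hand side is wrapped explicitly.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <variant>

// Hypothetical analogue of X, with std::string standing in for string8.
using V = std::variant<char, std::int32_t, double, std::string>;

// Returns true when the comparisons behave as described in the text:
// values are equal only if they hold the same type and the same value.
inline bool equalityFollowsTypeThenValue() {
    const V x1 = std::int32_t{65};
    const V x2 = std::int32_t{66};
    const V x3 = 'A';                               // char with numeric value 65
    const V x4 = static_cast<std::int32_t>('A');    // int32 with value 65
    return x1 != x2      // same type, different value
        && x1 != x3      // same numeric value, different type
        && x1 == x4      // same type, same value
        && x1 == V(std::int32_t{65});
}
```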
This is somewhat illogical, but we cater explicitly for the notion of a "value" that represents a missing value (yes, that is self-contradictory). If one really wants to embrace the notion of NULLs, whatever that means, we provide a convenient syntax.
In the following example, one of the types is 'void'.
$variant X
{
// Trying to specify a default value causes a compiler error
//default int16(100);
void;
int16;
};
A curious consequence of including 'void' is that it defines a default constructor which produces the "void" value, so it is not possible to also specify a default value with the 'default' keyword.
X x;
assert( x.is<void>() );
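A comparable idiom exists in standard C++17, where std::monostate serves as the "no value" alternative of a std::variant and becomes the default state when listed first. The sketch below is an analogy rather than the generated code; the alias MaybeInt16 and the helper isNull are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <variant>

// Hypothetical analogue of a variant over void and int16.
// std::monostate plays the role of 'void'.
using MaybeInt16 = std::variant<std::monostate, std::int16_t>;

// True when the variant holds the "missing value" alternative.
inline bool isNull(const MaybeInt16& v) {
    return std::holds_alternative<std::monostate>(v);
}
```

A default-constructed MaybeInt16 holds std::monostate, and assigning a std::int16_t switches it to the int16 alternative.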
Arrays of the form T[SIZE] are mapped into the template Array<T,SIZE>. Consider the following union type:
$variant X
{
int32;
char8[10];
};
The type Array<T,SIZE> must be used for the is<> and as<> template functions:
typedef Array<char8,10> array;
array s;
for (int i=0 ; i < 10 ; ++i) s[i] = char8('a' + i);
X x = X(s);
std::cout << "x = " << x << '\n';
assert( x.is<array>() );
assert( x.as<array>()[1] == 'b' );
Any of the underlying types within the union can be specified as "heap allocated" by preceding the type with an asterisk.
This causes the implementation to only record a pointer to the value within the space shared by all the types in the variant.
However, this is really just an implementation detail. Semantically the variant still records a value of one of the types. All the implicit conversions are supported and the is<>() and as<>() template functions are defined without regard for the indirection used in the physical implementation.
One reason for creating an indirection could be to deal with a significant mismatch in the sizes of the types.
$variant X
{
int32;
*char8[1000]; // heap allocated so sizeof(X) = 8, which avoids wasting
// space when an int32 is stored.
};
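The size effect of such an indirection can be observed in standard C++17 by comparing a std::variant that stores a large array inline with one that stores only a pointer to it. This is a sketch under stated assumptions and not the generated code: the aliases Buffer, Inline and Indirect and the helper makeBuffer are hypothetical, and unlike the asterisk syntax, std::unique_ptr exposes the indirection and makes the variant move-only.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <memory>
#include <variant>

using Buffer = std::array<char, 1000>;

// Array stored inline: every value pays for the 1000 bytes.
using Inline = std::variant<std::int32_t, Buffer>;

// Array stored behind a pointer: the shared space only needs to hold
// a pointer or an int32.
using Indirect = std::variant<std::int32_t, std::unique_ptr<Buffer>>;

static_assert(sizeof(Inline) >= sizeof(Buffer));
static_assert(sizeof(Indirect) < sizeof(Inline));

// Creates an Indirect holding a zero-initialised heap-allocated buffer.
inline Indirect makeBuffer() {
    return Indirect(std::make_unique<Buffer>());
}
```

The exact sizes depend on the platform's pointer width and alignment, so the sketch only asserts the relative ordering.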
The following example is impractical (there is no good reason to heap allocate the float64), but it shows the syntax:
$variant X
{
default 0;
int32;
*float64; // Asterisk means the float64 is heap allocated
};
The following code illustrates how it can be used:
xvector<X> L;
L.push_back( 10 );
L.push_back( 3.14 );
assert( L[0].is<int32>());
assert(!L[0].is<float64>());
assert(!L[1].is<int32>());
assert( L[1].is<float64>());
See the Shape example for another example of a union type, and for details about how to define polymorphic functions on union types.