An InputArchive is used to deserialise a sequence of octets back into a value of some type. An efficient binary format is used (i.e. not using printable text). See serialisation. InputArchive provides extremely high performance.
Deserialisation is the inverse of serialisation: serialisation typically uses an OutputArchive, and deserialisation uses an InputArchive.
An InputArchive is essentially a const octet_t* which points to the next octet to be read. There are implicit conversions to and from a const octet_t*.
class InputArchive
{
public:
    // Allow implicit conversions to/from a const octet_t*
    InputArchive(const octet_t* p) : p_(p) {}
    operator const octet_t*() const { return p_; }

private:
    const octet_t* p_;   // pointer to the next octet to be read
};
An InputArchive can only be used to read from a contiguous block of memory. This eliminates the need for buffer underflow checks, which would otherwise be required on every single read operation, and is a significant performance benefit.
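For illustration, here is a minimal sketch of reading from a contiguous buffer. It assumes octet_t is an alias for an unsigned 8-bit type and uses the Deserialise overloads and operator>> described later in this section.

#include <cstddef>

using octet_t = unsigned char;      // assumed alias; the actual definition lives in the library

void ReadHeader(const octet_t* buffer)
{
    InputArchive ar(buffer);        // the implicit conversion means  InputArchive ar = buffer;  also works
    int version = 0;
    double timestamp = 0.0;
    ar >> version >> timestamp;     // each read simply advances the internal pointer; no underflow checks
}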
It is unsafe to use an InputArchive to read a buffer that cannot be trusted, because reading past the end of the buffer is a significant vulnerability and can easily result in a read access violation. When data is received over a network, deserialisation with an InputArchive must only be used with a trusted source and with adequate error checking, for example using Transport Layer Security [].
Since an InputArchive is not associated with any kind of input stream, deserialisation is "pure CPU". That provides better performance and reduces the amount of code that can throw exceptions - see the interleaving computation and I/O anti-pattern.
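As a hedged sketch of keeping the I/O separate from the deserialisation: ReceiveMessage below is a hypothetical function that reads one complete message from the network into memory; only then does the pure-CPU deserialisation run.

#include <vector>

std::vector<octet_t> ReceiveMessage();              // hypothetical: performs all of the I/O

template<typename T>
T ReadMessage()
{
    std::vector<octet_t> buffer = ReceiveMessage(); // I/O happens here, and only here
    InputArchive ar(buffer.data());                 // from here on, deserialisation is pure CPU
    T value{};
    ar >> value;
    return value;
}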
In some cases it may be possible to read directly from a buffer in the underlying input device (since the const octet_t* pointer can be made to point wherever we like). For example, we might be able to read directly from a segment in the LSS segment cache, since most objects are much smaller than the LSS segment size (e.g. 4MB). This reduces memory consumption, avoids memory allocations, avoids calls to memcpy and improves CPU memory cache utilisation.
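As a sketch of that idea (the segment pointer, offset and MyObject type below are hypothetical; the point is only that the archive can be aimed at any contiguous memory we already hold):

#include <cstddef>

void ReadObjectFromSegment(const octet_t* segmentData, std::size_t offsetOfObject, MyObject& obj)
{
    InputArchive ar(segmentData + offsetOfObject);  // no intermediate buffer, no memcpy
    ar >> obj;                                      // assumes a Deserialise overload exists for MyObject
}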
For every type T that supports deserialisation, the following function is implemented:
void Deserialise(InputArchive& ar, T& x)
Therefore there can be many overloads of Deserialise (i.e. ad hoc polymorphism).
The cxUtils library implements the Deserialise function for the following types:
bool, char, signed char, unsigned char, char16_t, char32_t, wchar_t, short, unsigned short, int, unsigned int, long, unsigned long, long long, unsigned long long, float, double, long double, std::pair, std::array, std::basic_string, std::vector, std::deque, std::forward_list, std::list, std::map, std::multimap, std::unordered_map, std::unordered_multimap, std::set, std::multiset, std::unordered_set, std::unordered_multiset, ceda::xdeque, ceda::VectorOfByte, ceda::xvector, ceda::CompressedInt, ceda::schema_t, ceda::HPTime, ceda::HPTimeStamp, ceda::HPTimeSpan
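Application code typically adds further overloads for its own types by deserialising the members in the same order in which they were serialised. For example (the Person struct below is hypothetical):

#include <string>
#include <vector>

struct Person
{
    std::string name;
    int age = 0;
    std::vector<double> scores;
};

void Deserialise(InputArchive& ar, Person& p)
{
    // Members are read in exactly the order in which they were serialised
    Deserialise(ar, p.name);
    Deserialise(ar, p.age);
    Deserialise(ar, p.scores);
}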
For convenience, clients can use operator>> to deserialise variables/objects. E.g.
ar >> x >> y >> z;
is shorthand for
Deserialise(ar,x);
Deserialise(ar,y);
Deserialise(ar,z);
This is achieved with a single implementation of operator>> for an InputArchive:
template<typename T>
inline InputArchive& operator>>(InputArchive& ar, T& x)
{
    Deserialise(ar, x);
    return ar;
}
A possible concern is the lack of buffer underflow checks, since having such checks (at least in a debug build) can be very useful for tracking down errors in code.
However, the ceda framework will normally verify that the InputArchive has not read past the end of the buffer once deserialisation is complete, so such errors are unlikely to go undetected.
In fact the ceda framework will normally ensure that application-defined deserialise code reads the message, the whole message and nothing but the message. Therefore simply exercising the code in unit tests tends to catch such mistakes.
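As an illustration, the kind of post-condition the framework might apply looks roughly like this (the names below are illustrative, not the framework's actual API):

#include <cassert>
#include <cstddef>

template<typename T>
void DeserialiseMessage(const octet_t* msgBegin, std::size_t msgSize, T& value)
{
    InputArchive ar(msgBegin);
    ar >> value;

    // "The message, the whole message and nothing but the message":
    const octet_t* finalPos = ar;                   // uses the implicit conversion to const octet_t*
    assert(finalPos == msgBegin + msgSize);
}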