DataSetId

A WorkingSet has a 128-bit UUID called a DataSetId, which identifies the data in the working set. It is used as a sanity check to prevent data synchronisation between working sets that are supposed to manage unrelated data sets.

For example, there may be an image library working set for Disney icons. It may be replicated on many machines, but all the replicas share the same DataSetId. This prevents the mistake of synchronising the Disney icon library with some other image library and ending up with the union of the two.

A null DataSetId means the working set is empty, and it is allowed to be synchronised with any other working set.

A MASTER working set allocates a fresh DataSetId when the working set is first created. When the working set is replicated, all SLAVES are given the same DataSetId.

A SLAVE is assigned a null DataSetId when the working set is first created. Its DataSetId is initialised the first time it connects to a peer whose DataSetId is already initialised.

If an attempt is made to connect working sets with different DataSetIds then an error is reported and no data synchronisation takes place.
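
The connection rules above can be sketched as a single check. This is only a minimal illustration, not the actual implementation: the DataSetId struct, the all-zero representation of a null id, ConnectResult and CheckDataSetIds are all hypothetical names introduced here.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-in for the 128-bit DataSetId.
// Assumption: a null DataSetId is represented as all zero bytes.
struct DataSetId {
    std::array<std::uint8_t, 16> bytes{};

    bool IsNull() const {
        for (std::uint8_t b : bytes) {
            if (b != 0) return false;
        }
        return true;
    }

    bool operator==(const DataSetId& other) const { return bytes == other.bytes; }
};

enum class ConnectResult { Ok, Mismatch };

// Applies the rules to the two ends of a connection.
// On Ok, both ids are equal afterwards (possibly both still null).
// On Mismatch, neither id is modified and no synchronisation should take place.
ConnectResult CheckDataSetIds(DataSetId& local, DataSetId& peer) {
    if (local.IsNull()) {
        local = peer;   // an empty working set adopts the peer's id
        return ConnectResult::Ok;
    }
    if (peer.IsNull()) {
        peer = local;   // the peer is empty and adopts our id
        return ConnectResult::Ok;
    }
    return local == peer ? ConnectResult::Ok : ConnectResult::Mismatch;
}
```

In this sketch a freshly created SLAVE (null id) connecting to a MASTER picks up the MASTER's id, while two initialised working sets with different ids get Mismatch and are left untouched.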

DataSetIds are not working set identifiers

There is a problem with using a DataSetId as a working set identifier. The function OpenWorkingSetMachine is passed a string name to uniquely identify the working set in the PSpace.


    WorkingSetMachine* OpenWorkingSetMachine(ConstStringZ name, bool dataSetMaster);

When creating a new working set we must call this function with dataSetMaster=true so that the DataSetId is allocated by a call to CreateGuid(). The DataSetId is therefore unknown until after the working set has been opened, which means the string name cannot be derived from it.
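
The ordering problem can be illustrated with a toy model. Everything here is a hypothetical stand-in (the Toy-prefixed names, the 64-bit id, the map used as a PSpace) and says nothing about how the real OpenWorkingSetMachine or CreateGuid() are implemented. It only mirrors the shape of the call: the name goes in, and the id comes into existence inside the call.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Toy model only: none of these definitions come from the real code base.
using ToyDataSetId = std::uint64_t;  // stand-in for the 128-bit id; 0 means null

struct ToyWorkingSetMachine {
    ToyDataSetId dataSetId = 0;
};

// Stand-in for CreateGuid(): returns a fresh non-null value on each call.
inline ToyDataSetId ToyCreateGuid() {
    static ToyDataSetId next = 0;
    return ++next;
}

// Stand-in for the PSpace: working sets keyed by their string name.
inline std::map<std::string, ToyWorkingSetMachine>& ToyPSpace() {
    static std::map<std::string, ToyWorkingSetMachine> space;
    return space;
}

// The caller must supply the name up front. The id is only allocated
// inside the call, and only when a new master working set is created.
inline ToyWorkingSetMachine* ToyOpenWorkingSetMachine(const std::string& name,
                                                      bool dataSetMaster) {
    auto result = ToyPSpace().emplace(name, ToyWorkingSetMachine{});
    ToyWorkingSetMachine& machine = result.first->second;
    if (result.second && dataSetMaster) {
        machine.dataSetId = ToyCreateGuid();
    }
    return &machine;
}
```

A caller of ToyOpenWorkingSetMachine("icons", true) has to choose "icons" before any id exists, so the name has to come from somewhere other than the id.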

It follows that a DataSetId should only be regarded as a sanity check, not a working set identifier.