Let `S` be a relational database schema.
`C(S)` (the extension of the database constraint) can be regarded as a non-empty relation so we can consider its unique prime Cartesian factorisation.
Claim: For any schema `S` there exists a schema `S'` which is information equivalent to `S` where the cardinality of every finite prime Cartesian factor of `C(S')` is a prime number.
This implies the bag of integers which are the cardinalities of the prime Cartesian factors of `C(S')` equals the bag of prime number factors of `|C(S)|`.
Note that the bag of prime number factors of any positive integer is uniquely determined, by the Fundamental theorem of arithmetic [].
`S'` represents a schema having maximal decomposition - i.e. into the most parts which can be updated independently.
The following is an outline of a constructive existential proof which involves relational expressions for each relvar in `S'` that are defined in terms of the relvars of `S`:
Let `F_1, ..., F_n` be the prime number factors of `N = |C(S)|` (the order doesn't matter)
Let the `N` database values in `C(S)` be organised into an `n` dimensional array of size `prod_(j=1)^n F_j = N` (again, the order doesn't matter).
For `j` in `1..n` and `i` in `1..Fj`, let `B_(ij)` denote the subset of `C(S)` corresponding to the set of database values in the ith slice of the jth component of the n-dimensional array (a slice has `n-1` dimensions).
For `j` in `1..n` and `i` in `1..F_j`, let `T_(ij) = uuu { matches(d) | d ∈ B_(ij) }`
where `matches(d)` evaluates to DEE
if the dbvar for schema `S` has value
`d`, otherwise DUM
. See how it can be defined here.
Note that `T_(ij)` evaluates to DEE
if the value of the dbvar for `S` is in slice `B_(ij)`, otherwise DUM
.
Let `x` be some attribute name. For `j` in `1..n`, let schema `S'` have relvar `R_j'` which is defined in terms of the relvars of `S` using the expression
`R_j' = uuu { T_(ij) ⋈` REL { TUP { (`x` `i`) } }` | i in 1..F_j }`
(this expression evaluates to REL { TUP { (`x` `i`) } } where the dbvar of `S` is in slice `B_(ij)`)
Note that for every database value each relvar `R_j'` of `S'` has exactly one tuple with one attribute. This follows because any database value (i.e. element of `C(S)`) lies in exactly one slice of the jth coordinate. The UNION over the JOINs picks out exactly one of the TUP { (`x` `i`) }. There are `F_j` slices for the jth component and hence `F_j` possible values of `R_j'`.
It is easy to see that a combination of the values of the `R_j'` corresponds to one of the elements of the n-dimensional array and hence to one of the database values in `C(S)`.
Let there be three lights coloured red, green and blue which can be on or off. There are `2^3 = 8` world situations which need to be recorded.
Let schema `S_1` have a single relvar with the predicate: The light with colour [COLOUR] is on.
Let schema `S_2` have three relvars with the following predicates:
`S_1` and `S_2` are information equivalent. `S_2` is an example of a schema with maximal decomposition.
Let there be three lights coloured red, green and blue which can be on or off, except they cannot be all off or all on. There are `6` world situations which need to be recorded.
The prime factorisation of `6` is `2×3`. A maximally decomposed schema can be achieved with the following two predicates:
See maximal decompositon of two monadic relations with an IND constraint.