Constraints are great for ensuring data integrity.
But there are some issues with imposing constraints...
There is a solution for OT on text, where the only constraint is that the characters are ordered. Now consider that we impose a much more complex constraint, say that the string is a valid Java program according to the Java language specification (e.g. it much conform to the grammar). It is difficult (impossible?) to ensure that in general the merge of arbitrary concurrent operations will satisfy such a constraint.
"Mark one box only" looks like an integrity constraint. But if we impose it we end up recording less information. The real world is messy; we often need relaxed constraints in order to record partial information about the world.
Relaxing a key constraint allows for recording conflicting information. That means the DBMS can merge edits and keep the conflicting information, allowing the users to fix it up as appropriate.
The reality is that information entered into a database is imperfect because users are not omniscient and they make mistakes. Also updates are asynchronous which can create temporary conflicts.
Imposing integrity constraints can mean the data always looks perfect, giving a false sense of security. The database provides no information about how often problems are occurring if it doesn't record conflicting or missing information.
Consider all those times you have filled out a form and something goes wrong and you have to fill it out from scratch again. It is better for database management systems to manage all data (including data as it is entered – i.e. drafts).
Constraints tend to mean multiple updates must be made to the data, it is not possible to update variables independently.
Constraints can mean an innocuous looking assignment to a simple variable raises an integrity constraint violation exception
x = 10; // CRASH!
A wait-for graph is a directed graph used for deadlock detection in operating systems and database systems.
In the following example, Txn1 locks var1 then var2. Tsn2 locks var2 then var1. This can lead to a cycle in the wait-for graph, a dead-lock. An integrity constraint over both var1,var2 makes this more likely, because any update to either variable will require reading the other to check the constraint.
Constraints slow down updates in three ways:
There is a solution where we get our cake and eat it too (i.e. all the advantages without the disadvantages). We apply constraints indirectly.
Writers access a base representation which is relatively unconstrained. A function maps the base representation to a derived representation which imposes the constraints.
Multi Version Concurrency Control (MVCC) is very useful for allowing concurrency between readers and writers.
Satisfying a complicated constraint is made easy, rather than it being difficult to achieve. Users are able to drive the output through a complicated space.
Update operators tend to be simple and infallible.
The function mapping to read only derived views may not be injective. That means that the base representation isn’t canonical. There can be significant advantages to not imposing canonical representations. Canonical representations often seem arbitrary (because they break symmetries). E.g. represent a triangle as an array of three vertices, but triangles have invariant under cyclic permutations of their vertices.
It is quite common to impose a canonical representation constraint on a representation. That often means a partial constructor function. In many cases it is possible to relax the constraint (think of any member of the equivalence class as being available to represent the class, and implement an equality operator to account for the fact that the representation is non canonical)
We can weaken the constraint so that the representation is non-canonical. For example there are multiple representations of the rational number 2/3.