2.1 Basic Vector Clock Algorithm
The SData synchronization protocol is based on “vector clock” synchronization (http://en.wikipedia.org/wiki/Vector_clock), a well know technique used by several synchronization/replication systems: P2P networks, replicated file systems, Microsoft Sync Framework, etc.
Vector clocks provide an elegant solution to the problem of maintaining several copies of a given resource in sync and detecting conflicts. Let us assume that a given resource R is replicated on several nodes N1, N2…, the vector clocks are constructed as follows:
Every node maintains a “logical clock”: C1 on N1, C2 on N2, etc.
Every node increments its own clock tick when R is modified.
A vector clock for R is a list of pairs (N1 T1), (N2 T2), … where T1 is the C1 tick when R was last modified on N1, T2 is the C2 tick when R was last modified on N2, etc. If R was never modified on Ni, its vector clock does not contain any pair for Ni.
R may have different vector clocks at different nodes. The goal of the synchronization system is to bring R in sync across nodes. When R is fully synchronized, its vector clock is the same on all nodes.
By comparing the vectors clocks of R on two nodes, the synchronization algorithm can decide if the versions can be brought in sync (and how) or if there is a conflict. The following table explains how this comparison works (VRj stands for the vector clock of R at node j):
|VR1 = VR2||VR1=(N1 5)(N2 7)(N3 8) VR2=(N1 5)(N2 7)(N3 8)||N1 and N2 have the same version of R.||None|
|VR1 < VR2||VR1=(N1 5)(N2 7)(N3 8) VR2=(N1 5)(N2 8)(N3 8)||N1's version of R is an ancestor of N2's version.||Replace N1's version of R by N2's version|
|VR1 > VR2||VR1=(N1 6)(N2 7)(N3 9) VR2=(N1 5)(N2 7)(N3 8)||N2's version of R is an ancestor of N1's version.||Replace N2's version of R by N1's version|
|Not comparable||VR1=(N1 6)(N2 7)(N3 9) VR2=(N1 5)(N2 8)(N3 8)||N1 and N2 have a common ancestor but are on different branches||Conflict. We need an arbitrary rule to decide which side wins.|
This synchronization scheme has interesting properties:
It is a distributed algorithm. Every node manages its own clock and its own synchronization metadata (the vector clocks of its resources). Any pair of nodes can synchronize with each other without having to contact other nodes. There is no need for a central coordinator.
It does not constrain the topology. synchronization can be performed across arbitrary subsets (usually pairs) of nodes. Of course, who can do the most can do the least, so “star” topologies are supported but more complex topologies are supported as well.
It achieves “eventual consistency”. If the conflict handling rule is consistent throughout the system, all nodes will reach the same state if we stop modifying the data and we run enough synchronization passes (“enough” depends on the topology).
It is robust. If a node is restored to a previous state, it will not disturb the synchronization process and it will catch up with the other nodes (actually, if the node had been synchronized after the restore point, changes that occurred between the restore point and the last synchronization will be recovered when we synchronize after restoring the node). Of course this assumes that the synchronization metadata (the vector clocks) are backed up together with the data.