1.4 Linking Scenarios

This section describes two possible scenarios for the linking process:

A first one where the linking process is run from a separate “linking utility” that interacts with the applications through their SData providers.
A second one where the linking process is run from one of the applications.

The aim of this section is to illustrate how the various SData APIs will be used in these two scenarios.

External Linking Utility

In this first scenario, the linking process is run from a separate utility that interacts with the applications through their SData providers. This is illustrated by the following diagram:

Linking 1.png

The linking utility proceeds with the following steps:

1. It queries provider 1 for a list of resources.
2. It queries provider 2 for resources that are potential link candidates for the resources retrieved at step 1.
3. It applies a process to pair resources from provider 1 with resources from provider 2. It also resolves data inconsistencies between the members of a pair.
4. For all the pairs generated at step 3, it sends the (UUID, provider 1 resource URL) mapping to provider 1 (if this mapping is new) and it updates the resource if necessary.
5. For all the pairs generated at step 3, it sends the (UUID, provider 2 resource URL) mapping to provider 2 (if this mapping is new) and it updates the resource if necessary.

Steps 1 and 2 use standard SData query requests on collection URLs. For example, to link UK accounts from both systems, the utility will send GET requests to the following URLs:

http://www.example.com/sdata/provider1/myContract/-/accounts?where=countryCode eq 'UK'&includeUuid=true
http://www.example.com/sdata/provider2/myContract/-/accounts?where=countryCode eq 'UK'&includeUuid=true
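As a minimal sketch of how a linking utility might build these two query URLs, the following Python helper percent-encodes the where clause before appending it to the collection URL. The function name collection_query_url is an assumption made for illustration; only the where and includeUuid parameters come from the URLs above.

```python
from urllib.parse import quote, urlencode


def collection_query_url(collection_url, where, include_uuid=True):
    """Build an SData collection query URL (hypothetical helper).

    The 'where' expression is percent-encoded so that spaces and quotes
    are safe to send in an HTTP request line.
    """
    params = {"where": where}
    if include_uuid:
        params["includeUuid"] = "true"
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return collection_url + "?" + urlencode(params, quote_via=quote)


# Collection URLs taken from the example above.
PROVIDER_1 = "http://www.example.com/sdata/provider1/myContract/-/accounts"
PROVIDER_2 = "http://www.example.com/sdata/provider2/myContract/-/accounts"

urls = [collection_query_url(base, "countryCode eq 'UK'")
        for base in (PROVIDER_1, PROVIDER_2)]
```

The same where clause is sent to both providers, so the two result sets cover the same population of accounts.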

Then, at step 3, the linking utility will try to find matching pairs between resources of the first collection and resources of the second collection. This step will typically have an automatic part, in which a set of rules is applied to find pairs, and a manual part, in which the user reviews the candidate pairs to validate or reject them. The system can also be configured to automatically validate pairs that have a strong match and to require manual validation only for pairs that have a weaker match.
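The automatic part of step 3 could be sketched as follows. This is only an illustration under stated assumptions: the compared fields (name, postalCode), the scoring rule (fraction of fields that match exactly after normalization) and the two thresholds are all hypothetical, and a real utility would apply its own rules.

```python
MATCH_FIELDS = ("name", "postalCode")  # hypothetical fields to compare


def normalize(value):
    """Lower-case and trim a field value; None becomes an empty string."""
    return str(value or "").strip().lower()


def match_score(left, right, fields=MATCH_FIELDS):
    """Return the fraction of fields that are non-empty and equal."""
    hits = 0
    for field in fields:
        a, b = normalize(left.get(field)), normalize(right.get(field))
        if a and a == b:
            hits += 1
    return hits / len(fields)


def propose_pairs(left_resources, right_resources, strong=1.0, weak=0.5):
    """Split candidate pairs into auto-validated and needs-review lists.

    For each provider 1 resource, the best-scoring provider 2 resource is
    kept. Scores below 'weak' are dropped, scores at or above 'strong' are
    validated automatically, and the rest are left for manual review.
    This sketch does not prevent one provider 2 resource from being
    proposed for several provider 1 resources.
    """
    auto_validated, needs_review = [], []
    for left in left_resources:
        best, best_score = None, 0.0
        for right in right_resources:
            score = match_score(left, right)
            if score > best_score:
                best, best_score = right, score
        if best is None or best_score < weak:
            continue
        target = auto_validated if best_score >= strong else needs_review
        target.append((left, best, best_score))
    return auto_validated, needs_review
```

Only the pairs in the second list would be shown to the user for validation or rejection.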

If the intent is to synchronize resources after they have been linked, step 3 should also resolve any data inconsistencies between the two resources that are being paired. The linking utility may reuse the priorities used to resolve synchronization conflicts (see Conflict Handling section) to decide which side wins, but it may also use a linking-specific policy or require that the inconsistencies be resolved manually.

Steps 1, 2 and 3 can also be handled differently. Instead of querying both providers first, the utility could query provider 1 first, and then process each resource from provider 1 in turn and issue queries against provider 2 to find a matching resource. This variant is less efficient because it requires more queries but it may be easier to implement.

So, the details of steps 1, 2 and 3 may vary, but from an SData API standpoint, these steps only require the ability to run ordinary SData queries on collection URLs.

Once the pairs have been validated, the utility will determine their UUIDs and store the mappings between UUID and internal ID in the providers, using POST or batch requests on the $linked URLs of the providers (see Linking Protocol section). For a given pair, there are three cases:

1. If neither resource had a UUID before, the utility will allocate a new UUID for the pair. It will then POST the (provider 1 URL, UUID) mapping to the $linked URL of provider 1 (step 4) and POST the (provider 2 URL, UUID) mapping to the $linked URL of provider 2 (step 5).
2. If only one of the resources had a UUID before, the utility will reuse this UUID and store it on the other side. For example, if the provider 1 resource already had a UUID, the utility will POST the (provider 2 URL, UUID) mapping to the $linked URL of provider 2. So, only step 5 will be executed.
3. If both resources had a UUID before and these UUIDs were different, the utility will pick a winner arbitrarily and store the winner’s UUID on the other side. For example, if the utility picks provider 1 as the winner, it will POST the (provider 2 URL, provider 1’s UUID) mapping to the $linked URL of provider 2. So, only step 5 will be executed if provider 1 is the winner. (If the two UUIDs were already equal, the pair is already linked and the utility can simply ignore it.)
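The three cases above can be summarized by the following sketch, which decides which mappings need to be posted for one validated pair. The function name, the tuple layout and the winner parameter are assumptions made for illustration; the sketch only plans the mappings and does not perform the POST requests to the $linked URLs.

```python
import uuid


def plan_uuid_mappings(url1, uuid1, url2, uuid2, winner=1):
    """Return the (provider, resource URL, UUID) mappings to POST.

    'provider' is 1 or 2 and identifies the $linked URL that should
    receive the mapping. uuid1 / uuid2 are None when the resource had
    no UUID before linking. 'winner' is only used when both resources
    already have different UUIDs.
    """
    if uuid1 is None and uuid2 is None:
        # Case 1: allocate a new UUID and store it on both sides (steps 4 and 5).
        new_uuid = str(uuid.uuid4())
        return [(1, url1, new_uuid), (2, url2, new_uuid)]
    if uuid2 is None:
        # Case 2: reuse provider 1's UUID on provider 2 (step 5 only).
        return [(2, url2, uuid1)]
    if uuid1 is None:
        # Case 2, mirrored: reuse provider 2's UUID on provider 1 (step 4 only).
        return [(1, url1, uuid2)]
    if uuid1 == uuid2:
        # Already linked: nothing to send.
        return []
    # Case 3: both sides have different UUIDs, the winner's UUID is kept.
    if winner == 1:
        return [(2, url2, uuid1)]
    return [(1, url1, uuid2)]
```

In case 3 the losing side ends up with an old and a new UUID, which is why the change then has to be propagated to the other providers that were linked to it.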

In case 3, the resource will end up having two UUIDs in provider 2: an old UUID and a new UUID. This UUID change will need to be propagated to the other providers with which provider 2 had previously established links, so that all providers end up using the new UUID. This propagation will happen transparently during synchronization if provider 2 is synchronized with these other providers. Otherwise, a special UUID propagation process will need to be run (details will be provided later, but SData already provides the necessary APIs to support this process).

Steps 4 and 5 will also update the resources on either side if they have been modified at step 3 to eliminate data inconsistencies (see above).

Linking from one Application

The second scenario is a scenario in which the linking process is run directly from one of the applications. This is illustrated by the following diagram:

Linking 2.png

The architecture is different from the previous scenario. Application 1 is now an SData consumer that accesses its own data store directly.

But the linking steps are similar:

1. The linking process queries its local store for resources that need to be linked.
2. The linking process queries provider 2 for resources that are potential link candidates for the resources retrieved at step 1.
3. The linking process pairs local resources with resources from provider 2. It also resolves data inconsistencies between the members of a pair.
4. For all the pairs generated at step 3, it stores the (UUID, application 1 resource URL) mapping in its local store (if this mapping is new).
5. For all the pairs generated at step 3, it sends the (UUID, provider 2 resource URL) mapping to provider 2 (if this mapping is new).

The only difference from the previous scenario is that the SData protocol is short-circuited for all interactions with application 1. The linking process does not need to go through the HTTP layer to interact with application 1’s resources; it can make direct calls to local application components to access the database.

The interaction with provider 2 is the same as in the previous scenario:

At step 2, the linking process will send standard SData query requests to retrieve resources that are potential matches for application 1 resources.
At step 5, the linking process will POST (provider 2 URL, UUID) mappings to the $linked URL of provider 2. It may also update provider 2 resources that have been modified at step 3 to resolve data inconsistencies.
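One way to picture the split between local calls and SData calls in this second scenario is the sketch below. It is illustrative only: the function name, the "local" and "provider2" labels and the two callables are assumptions, standing in respectively for application 1's own data access components and for a POST to the $linked URL of provider 2.

```python
def apply_mappings(mappings, store_locally, post_to_provider2):
    """Dispatch UUID mappings for steps 4 and 5 of the second scenario.

    mappings: iterable of (side, resource_url, uuid_value) tuples, where
    side is "local" (application 1) or "provider2".
    store_locally: callable making a direct call into application 1,
    with no HTTP involved (step 4).
    post_to_provider2: callable that sends the mapping to provider 2's
    $linked URL through SData (step 5).
    """
    for side, resource_url, uuid_value in mappings:
        if side == "local":
            store_locally(resource_url, uuid_value)
        elif side == "provider2":
            post_to_provider2(resource_url, uuid_value)
        else:
            raise ValueError("unknown side: %r" % (side,))
```

Only the second callable depends on the SData protocol, which reflects the point that application 1 bypasses the HTTP layer for its own resources.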

So the SData APIs described in the Linking Protocol section cover both scenarios.