Duplicate Prevention

An important requirement of Swiss universities is that a person does not have more than one edu-ID account. The edu-ID concept therefore aims to create as few duplicates as possible right from the start. If, despite all precautions, duplicates have been created, they should be resolved as efficiently as possible.

Note: Technical and read-only accounts are excluded from the duplicate detection heuristics, and are never merged with other accounts.

Avoiding Duplicates

Strategies to avoid the creation of duplicates:

Communication: Involve the account owner

Users manage their own accounts, so a user should generally know whether they already have an edu-ID account.

  • Users are instructed not to create additional accounts, but instead to maintain an already existing account.
  • Users have no benefit from creating duplicate accounts.
  • Inactive users receive a reminder email every year to keep their account up to date.

Rules: Hard criteria to detect duplicates

A set of rules prevents the creation of duplicate accounts by technical means. The edu-ID system prevents a user from creating an account if another account already holds the same value of a unique, personal and verifiable attribute. The following attributes are considered unique, personal and verifiable, and are checked:

  • All email addresses
  • The mobile phone number
  • The ORCID identifier
  • All affiliation identifiers
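
The hard criteria above can be sketched as a set intersection over all unique attribute values. This is a minimal illustration only: the Account class, its field names and the normalization (lower-casing email addresses, stripping spaces from phone numbers) are assumptions made for the example and not the actual edu-ID data model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Account:
    """Hypothetical, simplified account record (not the real edu-ID schema)."""
    emails: set[str] = field(default_factory=set)
    mobile: Optional[str] = None
    orcid: Optional[str] = None
    affiliation_ids: set[str] = field(default_factory=set)


def unique_keys(account: Account) -> set[tuple[str, str]]:
    """Return (attribute type, normalized value) pairs for all hard criteria."""
    keys = {("email", e.strip().lower()) for e in account.emails}
    if account.mobile:
        keys.add(("mobile", account.mobile.replace(" ", "")))
    if account.orcid:
        keys.add(("orcid", account.orcid.strip()))
    keys |= {("affiliation", a.strip()) for a in account.affiliation_ids}
    return keys


def conflicting_attributes(new: Account, existing: list[Account]) -> set[tuple[str, str]]:
    """Return the attribute values of `new` already held by an existing account."""
    taken: set[tuple[str, str]] = set()
    for account in existing:
        taken |= unique_keys(account)
    return unique_keys(new) & taken
```

An empty result means that none of the checked attributes is in use yet. Any non-empty result would block the creation of the new account.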

Heuristics: Soft criteria to detect duplicates

A set of soft rules for heuristic duplicate detection is applied to all edu-ID accounts. Because heuristic duplicate detection is not fully reliable, a user can overrule it.

The following heuristics are currently implemented:

  • A browser cookie is set by the form used to create an edu-ID, preventing the user from accessing the form a second time. A user can override this by deleting the browser cookies.
  • Accounts with very similar first and last names and identical birth dates are flagged as potential duplicates. Users with potential duplicates are asked by email to review their account(s). Users are free to ignore such notifications. The weakness of this approach is that first name, last name and birth date are self-declared by the user and may not have been verified.
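
The name and birth date heuristic could look roughly like the following sketch. It uses the standard library's difflib.SequenceMatcher as the similarity measure. Both the measure and the threshold of 0.85 are assumptions chosen for illustration, since the text does not specify how "very similar" is defined. The record field names are hypothetical as well.

```python
from datetime import date
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] of two names, ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def is_potential_duplicate(p1: dict, p2: dict, threshold: float = 0.85) -> bool:
    """Flag two person records if birth dates match and both names are very similar.

    The threshold of 0.85 is an illustrative assumption, not a documented value.
    """
    if p1["birth_date"] != p2["birth_date"]:
        return False
    return (
        name_similarity(p1["first_name"], p2["first_name"]) >= threshold
        and name_similarity(p1["last_name"], p2["last_name"]) >= threshold
    )
```

A match only flags a pair for review. As described above, the affected users decide whether the accounts really belong to the same person.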

Resolving Duplicates

Automatic detection and resolution by user

Detection: Based on unique personal attributes (email, mobile number, affiliation ID, ORCID). If any of these attribute values appears in two edu-ID accounts, one of them is most likely a duplicate.

Resolution: Done by user

Manual detection and resolution by help desk

Detection: Based on any event in connection with the use of the edu-ID, reported by the user (the account owner), by an organization (e.g. its help desk) or by a service.

Resolution: Done by edu-ID Support

Service notification after account merging

If a duplicate account was detected and merged into a single account, the set of identifiers of the removed account is discarded.

If the user has previously accessed services with the now discarded identifiers, the user may not be recognised anymore by the service. The user would lose access to the service.

To avoid losing access, affected organizations and all service providers who have previously received a now discarded identifier are informed by email and receive the new identifier.

Manually processing email notifications

Notifications by email:

  • Organizations with edu-ID integration that are affected by a duplicate resolution are notified by email. The email is sent to the home organization's administrative contact as listed in the AAI Resource Registry.
  • Services affected by a duplicate resolution (i.e. because an identifier has changed) are notified by email. The email is sent to the technical contact specified in the AAI Resource Registry.

The email contains the old and the new identifiers which can be used to manually update the local user record.

Automatically checking and updating all identifiers

With this method, an organization or a service provider regularly checks all identifiers of all its edu-ID users.

  1. All locally known identifiers are collected by the SP or organization.
  2. The identifiers are sent in one single request to the bulk-check method of the Tools-API.
  3. The response contains the current status for each checked identifier.
  4. In case of changed identifiers, the organization or SP should update the local identifiers for these accounts.

Notes:

  • A reasonable update frequency via the API is once per week. The bulk-check should be run during the night and not more often than once per day.
  • If identifiers are updated automatically, the notification emails can be ignored.
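
Step 4 of the procedure above (updating the local identifiers) can be sketched as follows. The request and response format of the bulk-check method is not described here, so the example starts from a plain mapping of discarded identifier to current identifier, as it might be extracted from the response. This mapping, the function name and the layout of the local records are all assumptions.

```python
def apply_identifier_changes(
    local_users: dict[str, dict], changes: dict[str, str]
) -> list[tuple[str, str]]:
    """Re-key local user records whose identifier was replaced.

    local_users: local records keyed by the edu-ID identifier known locally.
    changes: hypothetical mapping of discarded identifier -> current identifier,
             as it might be extracted from a bulk-check response.
    Returns the (old, new) pairs that were actually applied.
    """
    applied = []
    for old_id, new_id in changes.items():
        if old_id not in local_users or old_id == new_id:
            continue  # identifier unknown locally or unchanged
        if new_id in local_users:
            # Both accounts are known locally: merging the two local records
            # needs a service-specific decision, so leave them untouched here.
            continue
        local_users[new_id] = local_users.pop(old_id)
        applied.append((old_id, new_id))
    return applied
```

Returning the applied pairs makes it easy to log each change or to review cases that were skipped because both identifiers already exist locally.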

Automatically updating identifiers of one merged account

With this method, an organization or a service provider receives a notification message for a specific account that was affected by a merge operation. The edu-ID identifiers can then be updated for that specific account.

  1. The service implements a notification endpoint for the SP notification service.
  2. The service subscribes to get notifications if the attribute swissEduPersonUniqueID was changed.
  3. During normal operation, the service receives a swissEduPersonUniqueID attribute for an account that was affected by a merge operation.
  4. The service sends the identifier to the bulk-check method of the Tools-API.
  5. The response contains the current status for the checked identifier.
  6. In case of changed identifiers, the service should update the local identifiers for that account.

Notes:

  • If identifiers are updated automatically, the notification emails can be ignored.
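
Steps 3 to 6 above could be handled by a function like the one below. It is a sketch under several assumptions: the notification is taken to be a JSON object with a swissEduPersonUniqueID field carrying the identifier the service already knows, and the call to the bulk-check method is represented by a placeholder callable, since neither the message format nor the API call is specified here.

```python
import json
from typing import Callable


def handle_notification(
    body: bytes,
    lookup_current_id: Callable[[str], str],
    local_users: dict[str, dict],
) -> bool:
    """Process one notification and update the affected local record.

    body: raw notification payload; the JSON field name used below is an
          assumption, not the documented message format.
    lookup_current_id: placeholder for the call to the bulk-check method,
          returning the current identifier for the given one.
    Returns True if a local record was re-keyed.
    """
    message = json.loads(body)
    notified_id = message["swissEduPersonUniqueID"]
    if notified_id not in local_users:
        return False  # account unknown locally, nothing to update
    current_id = lookup_current_id(notified_id)
    if current_id == notified_id or current_id in local_users:
        return False  # unchanged, or the target record already exists locally
    local_users[current_id] = local_users.pop(notified_id)
    return True
```

Passing the lookup as a parameter keeps the handler independent of how the Tools-API is actually called and allows it to be tested without network access.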