SNOMED Documentation Search


 Other Documents
Skip to end of metadata
Go to start of metadata

Current Version - Under Revision

There are several approaches that can be used to efficiently test for subsumption between concepts. These are summarized here with a brief evaluation of their suitability.

Recursive testing of subtype Relationships

It is possible to determine whether one conceptsubsumes another conceptby recursively following every possible sequences of 116680003 |is a| Relationshipsfrom a candidate conceptuntil the predicate conceptis reached or until all possible paths have been exhausted.

This approach is far too slow to deliver effective implementations in all environments in which it has been tested to date.

Semantic type Identifiers and hierarchy flags

Flags added to the internal representation of each Conceptcan be used to indicate the set of high-level conceptnodes of which that conceptis a subtype. A conceptcan only subsume conceptsthat include the same set of high-level conceptflags. This approach can reduce the number of tests that need to be performed to recursively test the subtype relationships:

  • If a candidate does not have all the high-level node flags that the predicate has, no further tests are needed. The candidate is not a subtypeof the predicate.

  • Even if a candidate shares the high-level node flags with the predicate, any path that reaches a conceptthat does not share those flags need not be further tested.

While faster than the unaided recursive testing approach, this is too slow to deliver effective implementations and is not scalable.

Use of proprietary database features

Some databases include additional features to support the recursive testing of a chain of hierarchical relationships. Other methods of optimization that may be applied to allow more rapid computation of subtype descendant relationshipsare outlined in the following subsections.

Current experiences of databases that support this type of approach indicate that (while easy to implement) the performance is substantially inferior to use of branch-numbering or transitive closure.

Branch numbering

The internal representation of each Conceptcan be extended to include a branch-number and a set of branch-number -ranges.

A branch-numbering algorithm can then be applied when each release of SNOMED CTis imported.

A typical branch-numbering algorithm processes the subtype hierarchyin the following way:

  • A depth first tree walk is performed starting from the root Concept(branch-number 1) and an incrementing number is applied to each Conceptwhen it is encountered for the first time.

  • After the branch numbers have been computed a further tree walk allocates one or more branch-number ranges to each Conceptwith any subtype descendants:

  • At run time, rather than needing to traverse many subtype Relationships, the branch number of each Conceptis tested for inclusion in the branch number range of the putative ancestor.

This approach removes the need for exhaustive testing of subtype Relationships. The disadvantages are a relatively complex build process that must be repeated for each release or update and a requirement for the internal Conceptrepresentation to accommodate a variable length representation of branch number ranges.

Precomputed Transitive Closure table

The transitive closuretable is a comprehensive view of all the supertypes of every concept. It can be derived from current release data by traversing all 116680003 |is a| relationshipsrecursively and adding each inferred supertype relationshipto a table.

The advantage of this type of view is that a candidate -conceptcan be tested for subsumption by predicate -conceptby a simple SQL query. In addition, the table can be updated to take account of changes without requiring a complete rebuild. The disadvantage is the storage capacity required.

Note: The transitive closuretable for the active content of the current version of the International Release, has about six million rows. The row count increases when Extensionsare included. Typical database representations of the transitive closuretable and associated indexes consume more than a Gigabyte of disk storage.

Recommendations

The Transitive Closuremethod is strongly recommended for use in any environment requiring high performance where disk capacity for storage and/or bandwidth for distribution are not a problem.

Where disk capacity and/or distribution bandwidth are limiting factors, Branch Numbering provides an efficient alternative approach.


Feedback