
Current Version - Under Revision

Information in health records may be expressed at various levels of specificity.

Example:

To represent diagnoses of:

  • Chest infection;
  • Left lower lobe pneumonia caused by pneumococcus.

Criteria for selective retrieval may also need to be stated to different levels of detail.

Example:

To retrieve all records of

  • Respiratory tract infections;
  • Left lower lobe pneumonia;
  • Pneumococcal pneumonia.

Occasionally a query may be designed to retrieve only record entries that include a particular general Concept. This may be useful for a quality review or to find record entries that are too general to map to a required classification.

However, in most cases a general query should also retrieve record entries that contain more specific Concepts. For example, if the selected Concept is 275498002 |Respiratory tract infection|, the user would expect record entries containing Concepts such as |Chest infection| or |Left lower lobe pneumonia caused by pneumococcus| to be retrieved. The subtype hierarchy of SNOMED CT is designed to facilitate this type of retrieval. Four techniques that can be used for this purpose are outlined in the following subsections.

Note:

The subtype hierarchy is revised and improved with each new release of SNOMED CT. These changes need to be taken into account if more than one version of the hierarchy is used for data analysis.

Queries expanded to identify all subtypes

A query that explicitly includes the Concept Ids of all subtype descendants of the Concept to be retrieved can be built using one of the following methods:

  • A recursive tree-walk following 116680003 |is a| Relationships from the selection Concept to its subtypes and the subtypes of its subtypes. Each branch of the tree-walk ends on reaching a Concept with no subtypes or a Concept that is already in the set of selected Concepts.
  • Using pre-generated branch number ranges associated with the selection Concept and looking up all Concepts with branch numbers in those ranges. This could be much faster than a tree-walk if Concepts are indexed by branch-number.
  • Using a stored list of subtype Concept Ids for frequently queried Concepts. This list would initially be generated by one of the other methods and then reused in various queries. Any stored list would need to be rebuilt after installing each release of SNOMED CT.
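The tree-walk in the first method can be sketched as follows. This is a minimal illustration, not a reference implementation: the relationship rows and the mnemonic labels (RTI, PNEUMONIA and so on) are hypothetical stand-ins for real Concept Ids, and an explicit stack is used in place of recursion so that deep hierarchies do not exhaust the call stack.

```python
from collections import defaultdict

IS_A = "116680003"  # |is a| relationship type, as given in the text

# Hypothetical (source, type, destination) rows. The labels are placeholders,
# not real SNOMED CT identifiers.
RELATIONSHIPS = [
    ("LOWER_RTI", IS_A, "RTI"),
    ("PNEUMONIA", IS_A, "LOWER_RTI"),
    ("PNEUMOCOCCAL_PNEUMONIA", IS_A, "PNEUMONIA"),
    ("LLL_PNEUMONIA", IS_A, "PNEUMONIA"),
    ("LLL_PNEUMOCOCCAL_PNEUMONIA", IS_A, "PNEUMOCOCCAL_PNEUMONIA"),
    ("LLL_PNEUMOCOCCAL_PNEUMONIA", IS_A, "LLL_PNEUMONIA"),
    ("FRACTURE", IS_A, "INJURY"),
]


def build_children(relationships):
    """Map each Concept to the set of its direct subtypes."""
    children = defaultdict(set)
    for source, rel_type, destination in relationships:
        if rel_type == IS_A:
            children[destination].add(source)
    return children


def descendants_and_self(concept, children):
    """Walk down |is a| links, collecting the Concept and all its subtypes."""
    selected = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        if current in selected:
            # Already reached by another path (a Concept can have several
            # supertypes), so this branch of the walk ends here.
            continue
        selected.add(current)
        stack.extend(children.get(current, ()))
    return selected


CHILDREN = build_children(RELATIONSHIPS)
EXPANDED = descendants_and_self("RTI", CHILDREN)
```

The selection Concept itself is included in the result, on the assumption that entries recording the general Concept should also be retrieved. The returned set could equally be saved as the stored list described in the third method.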

The resulting query may contain a large list of potential Concept Ids, but the query structure itself is simple. Therefore, as long as the database engine does not restrict query size, this type of query can be run in any environment that supports SQL or an SQL-like query language.
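As an illustration of how simple the expanded query is, the sketch below builds an IN predicate from a set of Concept Ids using Python's standard sqlite3 module. The table name record_entry, its columns and the placeholder Concept labels are assumptions made for this example only.

```python
import sqlite3


def retrieve_entries(connection, concept_ids):
    """Return ids of record entries whose Concept is in the expanded set."""
    ids = sorted(concept_ids)
    if not ids:
        return []
    # One bound parameter per Concept Id. Database engines cap the number of
    # parameters, which is the query-size restriction mentioned in the text.
    placeholders = ", ".join("?" for _ in ids)
    sql = (
        "SELECT entry_id FROM record_entry "
        f"WHERE concept_id IN ({placeholders}) ORDER BY entry_id"
    )
    return [row[0] for row in connection.execute(sql, ids)]


# Hypothetical record store with one coded Concept per entry.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE record_entry (entry_id INTEGER PRIMARY KEY, concept_id TEXT)"
)
connection.executemany(
    "INSERT INTO record_entry VALUES (?, ?)",
    [(1, "PNEUMONIA"), (2, "FRACTURE"), (3, "RTI"), (4, "LLL_PNEUMONIA")],
)

# Placeholder result of a prior subtype expansion of the selection Concept.
expanded = {"RTI", "LOWER_RTI", "PNEUMONIA", "LLL_PNEUMONIA"}
matching_entries = retrieve_entries(connection, expanded)
```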

This technique is likely to be most effective when a large number of candidate record entries need to be examined and when Concept selection criteria are relatively narrow. Selecting all diagnoses using this approach would generate a predicate with tens of thousands of Concept Ids. Extremely large queries may not perform efficiently or may fail to run in some environments.

Subtype tests on each recorded concept

The Concept recorded in each candidate record entry can be tested to determine whether it is a subtype of the Concept to be retrieved. The test can be applied in one of the following ways (see also Testing and traversing subtype relationships):

  • A recursive tree-walk following 116680003 |is a| Relationships from the recorded Concept to its supertypes and the supertypes of its supertypes. Each branch of the tree-walk ends on reaching the Root Concept or a Concept that has already been visited. The test ends with a positive result if the selection Concept is encountered during the tree-walk. Otherwise, when all supertypes have been visited, the test ends with a negative result.
  • Optimized subtype testing using techniques such as branch numbering and tree-walk enhanced with semantic-type Identifiers or hierarchy flags.
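The unoptimized form of this test, the upward tree-walk in the first bullet, might look like the sketch below. The PARENTS mapping and its labels are hypothetical placeholders for |is a| Relationships loaded from a release, and the function treats a Concept as a subtype of itself so that entries recording the selection Concept are also matched.

```python
# Hypothetical map from each Concept to its direct supertypes.
PARENTS = {
    "LLL_PNEUMOCOCCAL_PNEUMONIA": {"LLL_PNEUMONIA", "PNEUMOCOCCAL_PNEUMONIA"},
    "LLL_PNEUMONIA": {"PNEUMONIA"},
    "PNEUMOCOCCAL_PNEUMONIA": {"PNEUMONIA"},
    "PNEUMONIA": {"LOWER_RTI"},
    "LOWER_RTI": {"RTI"},
    "RTI": {"ROOT"},
    "FRACTURE": {"INJURY"},
    "INJURY": {"ROOT"},
}


def is_subtype_of(recorded, selection, parents):
    """Walk up |is a| links from the recorded Concept looking for the selection."""
    visited = set()
    stack = [recorded]
    while stack:
        current = stack.pop()
        if current == selection:
            return True  # positive result: selection Concept encountered
        if current in visited:
            continue  # this branch was already explored via another supertype
        visited.add(current)
        # The Root Concept has no entry in the map, so the walk stops there.
        stack.extend(parents.get(current, ()))
    return False  # all supertypes visited without meeting the selection


# Applying the test to each candidate record entry (entry id, recorded Concept).
ENTRIES = [(1, "LLL_PNEUMOCOCCAL_PNEUMONIA"), (2, "FRACTURE"), (3, "RTI")]
MATCHES = [eid for eid, concept in ENTRIES if is_subtype_of(concept, "RTI", PARENTS)]
```

Because the walk is repeated for every candidate entry, caching results per distinct recorded Concept is one simple way to reduce the cost before turning to the optimizations in the second bullet.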

This technique is likely to be effective when the number of candidate record entries to be examined is relatively small or if the Concept selection criteria are broad. Performance is directly dependent on the time taken for each subtype test. Therefore, extensive use of this approach may only be feasible by applying one or more of the optimizations discussed in the guide.

Use a database with built-in hierarchical functionality

Some databases provide built-in support for hierarchical data. These databases may support extensions to SQL that allow a predicate to be specified in a way that implies that the database schema "understands" the subtype hierarchy.

Example:

It is possible to envision a statement such as:

WHERE Record.Expression SUBTYPE-OF 414024009

If a database supports this type of predicate, it clearly simplifies the writing of SNOMED CT queries. It is also reasonable to assume that functionality of this type, built into a database engine rather than added as an afterthought, will deliver enhanced performance. However, this assumption should be tested, as it depends on how well the internal implementation suits a subtype hierarchy of the size and complexity of SNOMED CT.
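The SUBTYPE-OF predicate above is hypothetical. One existing SQL feature that comes close is the recursive common table expression, which lets the database engine follow the hierarchy itself. The sketch below uses SQLite through Python's standard sqlite3 module; the is_a and record_entry tables and the Concept labels are assumptions made for illustration.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.executescript(
    """
    CREATE TABLE is_a (child TEXT, parent TEXT);
    CREATE TABLE record_entry (entry_id INTEGER PRIMARY KEY, concept_id TEXT);
    INSERT INTO is_a VALUES
        ('LOWER_RTI', 'RTI'),
        ('PNEUMONIA', 'LOWER_RTI'),
        ('LLL_PNEUMONIA', 'PNEUMONIA'),
        ('PNEUMOCOCCAL_PNEUMONIA', 'PNEUMONIA'),
        ('LLL_PNEUMOCOCCAL_PNEUMONIA', 'LLL_PNEUMONIA'),
        ('LLL_PNEUMOCOCCAL_PNEUMONIA', 'PNEUMOCOCCAL_PNEUMONIA'),
        ('FRACTURE', 'INJURY');
    INSERT INTO record_entry VALUES
        (1, 'PNEUMONIA'),
        (2, 'FRACTURE'),
        (3, 'LLL_PNEUMOCOCCAL_PNEUMONIA');
    """
)

# UNION (rather than UNION ALL) discards Concepts already reached, so a
# Concept with several supertypes is only expanded once.
SUBTYPE_QUERY = """
WITH RECURSIVE subtype(concept_id) AS (
    SELECT :selection
    UNION
    SELECT is_a.child
    FROM is_a JOIN subtype ON is_a.parent = subtype.concept_id
)
SELECT entry_id
FROM record_entry
WHERE concept_id IN (SELECT concept_id FROM subtype)
ORDER BY entry_id
"""

rows = connection.execute(SUBTYPE_QUERY, {"selection": "RTI"}).fetchall()
entry_ids = [row[0] for row in rows]
```

As the text cautions, whether this performs acceptably on a full release should be measured rather than assumed.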

Branch-range indexing of individual records

Branch numbering is an approach to subtype testing that could be extended to index record entries. The branch numbers could be used to produce an index of all record entries stored in an application. The technique is as follows:

  • Every record entry is indexed using the branch number of the Concept stored in that entry;
  • The set of branch number ranges associated with the selection Concept is then used to query the branch number index.

This approach is likely to deliver high performance retrieval but it has a significant drawback. Branch numbers have to be regenerated after each SNOMED CT release and the numbering changes each time. Therefore, any indices based on branch numbers must also be rebuilt after each release, and until this rebuild is complete, this method cannot be used for retrieval. The previous set of branch numbers could be used for retrieval during the transition period but this requires a parallel set of branch numbers and branch number ranges.

The likelihood of enhanced retrieval performance should therefore be balanced against the addition of complexity to terminology updates and record maintenance.

Retrieval Based on other Relationships

While many queries will use the subtype hierarchy of SNOMED CT to aggregate data, the attribute relationships can also be used. For example, to find all procedure Concepts that use a laparoscope, search the Relationship file for Concepts that have a relationship with the attribute Using Access Device and the value Laparoscope. Note that role hierarchies can be used to construct these queries.
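A minimal sketch of such an attribute-based search is shown below. The relationship rows and labels are hypothetical stand-ins for rows of the Relationship file, and the sketch assumes that both the attribute (through the role hierarchy) and the value are expanded to their subtypes before matching.

```python
IS_A = "116680003"  # |is a| relationship type, as given earlier in the text

# Hypothetical (source, type, destination) rows with placeholder labels.
RELATIONSHIPS = [
    ("USING_ACCESS_DEVICE", IS_A, "USING_DEVICE"),  # role hierarchy
    ("RIGID_LAPAROSCOPE", IS_A, "LAPAROSCOPE"),
    ("LAP_CHOLECYSTECTOMY", "USING_ACCESS_DEVICE", "LAPAROSCOPE"),
    ("LAP_APPENDECTOMY", "USING_ACCESS_DEVICE", "RIGID_LAPAROSCOPE"),
    ("OPEN_APPENDECTOMY", "USING_DEVICE", "SCALPEL"),
]


def descendants_and_self(concept, relationships):
    """Collect a Concept and all of its subtypes by following |is a| rows."""
    selected, stack = set(), [concept]
    while stack:
        current = stack.pop()
        if current in selected:
            continue
        selected.add(current)
        stack.extend(
            source
            for source, rel_type, destination in relationships
            if rel_type == IS_A and destination == current
        )
    return selected


def concepts_with_attribute(attribute, value, relationships):
    """Find Concepts related by the attribute (or a subtype of it) to the
    value (or a subtype of it)."""
    attributes = descendants_and_self(attribute, relationships)
    values = descendants_and_self(value, relationships)
    return {
        source
        for source, rel_type, destination in relationships
        if rel_type in attributes and destination in values
    }


LAPAROSCOPIC = concepts_with_attribute(
    "USING_ACCESS_DEVICE", "LAPAROSCOPE", RELATIONSHIPS
)
```

Querying with the more general attribute USING_DEVICE returns the same procedures here, because the role hierarchy makes USING_ACCESS_DEVICE one of its subtypes.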

