Current Version - Under Revision

The idea of a canonical representation is that it generates a predictable string rendering. The element missing from the "long normal form" that is needed to deliver this is a specified sort order for the collections of elements within an expression. A standard sort order is not essential for general-purpose use, but it is very useful for fast matching of logically identical expressions (which might otherwise be obscured by differences in order that have no semantic relevance).

The canonical form for any expression is ordered according to the following rules.

  • The expression is rendered in the form specified by the SNOMED CT compositional grammar. For the canonical representation, a restricted version of the compositional grammar is used:
    • No whitespace characters may be included in the canonical form;
    • No pipe characters "|", and thus no term text, shall be included in the canonical form.
    • Thus the permitted characters are:
      • Digits [0-9] - for conceptId values;
      • Plus [+] - to combine focus concepts;
      • Colon [:] - to represent the start of a refinement;
      • Equals [=] - to link an attribute name to its value;
      • Comma [,] - to separate attributes within a refinement;
      • Round brackets [()] - to represent nesting;
      • Curly brackets [{}] - to represent grouping.
  • The syntax determines the general order of elements within an expression as follows:
    • Focus conceptIds;
    • Attributes (expressed as name-value pairs);
    • Groups (containing attributes).
  • Within a set of focus conceptIds:
    • Concept Identifiers are sorted alphabetically based on their normal string rendering (i.e. digits with no leading zeros):
      • The reason for alphabetic rather than numeric sorting is that attributes and groups consist of an arbitrary number of conceptIds, which makes them complex to sort using numeric keys.
  • Within a set of ungrouped attributes or a set of attributes within a group:
    • Attributes are sorted alphabetically based on the string concatenation of the name and value conceptIds separated by an "=" sign;
    • If a value contains nested refinements, the value is enclosed in round brackets (which may influence the sort order) and the elements of the nested expression are sorted by applying the general canonical sorting rules.
  • Within a set of attribute groups:
    • Groups are sorted alphabetically by the combined string of their previously sorted attributes.
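
The ordering rules above can be sketched in code. The following Python example is a minimal illustration, not a reference implementation. The Expression and Attribute classes and the function names are hypothetical and are not part of any SNOMED CT specification. The sketch makes three assumptions: "alphabetical" order is taken to mean character code order of the rendered strings, groups are written one after another with no separator (the rules above describe the comma only as separating attributes), and every nested expression used as a value is enclosed in round brackets. The identifiers in the example are short made-up digit strings chosen to show the effect of alphabetical sorting; they are not real conceptIds.

```python
import re
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Attribute:
    """Hypothetical model of a name-value pair; the value may be nested."""
    name: str
    value: Union[str, "Expression"]


@dataclass
class Expression:
    """Hypothetical model of an already parsed expression."""
    focus: List[str]
    attributes: List[Attribute] = field(default_factory=list)
    groups: List[List[Attribute]] = field(default_factory=list)


def render_attribute(attr: Attribute) -> str:
    # A nested expression is sorted by the same rules and bracketed.
    if isinstance(attr.value, Expression):
        value = "(" + render(attr.value) + ")"
    else:
        value = attr.value
    return attr.name + "=" + value


def render_attribute_set(attrs: List[Attribute]) -> str:
    # Sort on the "name=value" string, then join with commas.
    return ",".join(sorted(render_attribute(a) for a in attrs))


def render(expr: Expression) -> str:
    # General order: focus conceptIds, ungrouped attributes, groups.
    out = "+".join(sorted(expr.focus))  # string sort, not numeric
    ungrouped = render_attribute_set(expr.attributes)
    # Each group is rendered from its sorted attributes, then groups are sorted.
    groups = sorted(render_attribute_set(g) for g in expr.groups)
    refinement = ungrouped + "".join("{" + g + "}" for g in groups)
    if refinement:
        out += ":" + refinement
    return out


def uses_permitted_characters(text: str) -> bool:
    # Only digits and + : = , ( ) { } are allowed; no whitespace, no pipes.
    return re.fullmatch(r"[0-9+:=,(){}]+", text) is not None


example = Expression(
    focus=["9", "10"],
    attributes=[Attribute("30", "4"), Attribute("200", "5")],
    groups=[
        [Attribute("7", "8"),
         Attribute("60", Expression(["2"], [Attribute("11", "12")]))],
        [Attribute("50", "1")],
    ],
)
print(render(example))  # 10+9:200=5,30=4{50=1}{60=(2:11=12),7=8}
```

Note that "10" sorts before "9" and "200=5" before "30=4" because the comparison is made on strings rather than numeric values. Two expressions that differ only in the order of their focus concepts, attributes or groups render to the same string under this sketch, so they can be matched by simple string comparison.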