Summary

SNOMED International propose modifying the specification of the RF2 Identifier File ( 4.2.4 Identifier File Specification ) to bring it into line with the layout of all other RF2 files, so that the effectiveTime, active and moduleId columns would be the 2nd, 3rd and 4th columns in the file respectively. In the current specification, an extra first column pushes these columns out by one place, and this file is the only one in the RF2 specification that uses a different layout for its initial four columns. This file has never been populated in the International Edition, so SNOMED International would like to confirm whether it is in use in any currently published extensions. Additionally, using this modified layout, SNOMED International will begin to populate and publish this file when the need arises.

SNOMED International ask that community members read this briefing paper and then respond with any questions and feedback via this Google Form.

Introduction and Background

As SNOMED International progresses to collaborate more closely with other organisations, the opportunity will arise to model terminology content from other code systems directly in SNOMED CT.  To highlight that this content is a true alternate expression of the same logical entity, SNOMED International would like to make use of the Identifier File ( 4.2.4 Identifier File Specification ).  This is an alternative approach to using a Map file, which may be seen to imply an equivalence in entities between two code systems, but not one and the same entity.   The Identifier File has existed in the RF2 specification for many years, but it has never been populated in the International Edition.  SNOMED International are not aware of any content generators (such as National Release Centres) who do make use of this file format.

Looking at the structure of this file, it is clear that it does not share the initial layout of all other files: id, effectiveTime, active, moduleId. Instead, there is a scheme identifier (identifierSchemeId) in the first column. SNOMED International would like to modify the specification of this file to make it consistent with all other RF2 files. This would make the file easier to work with as part of a release archive. For example, counts can be made of the number of rows for a given effectiveTime, and this is most easily accomplished if the same column index is used for the effectiveTime in every file.
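The row-counting point can be illustrated with a minimal Python sketch. It assumes tab-separated files with a single header row; the function name and the sample rows (including the placeholder values M1, D1, S1 and ALT-1) are hypothetical and for illustration only. Because effectiveTime sits at the same index in both files, one function serves every file.

```python
import csv
import io
from collections import Counter

# 0-based index of effectiveTime when it is the 2nd column of a file
EFFECTIVE_TIME_COL = 1

def count_by_effective_time(rf2_text):
    """Count data rows per effectiveTime in one tab-separated file."""
    reader = csv.reader(io.StringIO(rf2_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    next(reader, None)  # skip the header row
    return Counter(row[EFFECTIVE_TIME_COL] for row in reader if row)

# Hypothetical sample content, not taken from any release
concept_file = (
    "id\teffectiveTime\tactive\tmoduleId\tdefinitionStatusId\n"
    "1001\t20230401\t1\tM1\tD1\n"
    "1002\t20230401\t0\tM1\tD1\n"
)
identifier_file = (
    "alternateIdentifier\teffectiveTime\tactive\tmoduleId\tidentifierSchemeId\treferencedComponentId\n"
    "ALT-1\t20230401\t1\tM1\tS1\t1001\n"
)

# The same column index works for both files under the proposed layout
totals = Counter()
for text in (concept_file, identifier_file):
    totals += count_by_effective_time(text)
print(totals)
```

With the current Identifier File layout, the same index would pick up alternateIdentifier instead of effectiveTime, so the file would need special handling.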

Proposed Changes to RF2

The current column headers for this file are:

identifierSchemeId alternateIdentifier effectiveTime   active  moduleId      referencedComponentId

The new proposed format is:

alternateIdentifier effectiveTime   active  moduleId      identifierSchemeId  referencedComponentId
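As a rough sketch of what the change means for anyone holding data in the current layout, the following Python reorders a tab-separated row from the current column order to the proposed one. The two header lists are taken from the column headers above; the function name and the sample row values are hypothetical.

```python
# Column order in the current specification
OLD_HEADER = ["identifierSchemeId", "alternateIdentifier", "effectiveTime",
              "active", "moduleId", "referencedComponentId"]
# Column order in the proposed specification
NEW_HEADER = ["alternateIdentifier", "effectiveTime", "active",
              "moduleId", "identifierSchemeId", "referencedComponentId"]

def convert_line(line):
    """Reorder one tab-separated row from the current layout to the proposed one."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != len(OLD_HEADER):
        raise ValueError(f"expected {len(OLD_HEADER)} columns, got {len(fields)}")
    by_name = dict(zip(OLD_HEADER, fields))
    return "\t".join(by_name[name] for name in NEW_HEADER)

# Hypothetical row in the current layout (placeholder values)
old_row = "S1\tALT-1\t20230401\t1\tM1\t1001"
print(convert_line(old_row))
```

The same function also converts the header row, since each header name maps to itself.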

The data types will remain the same, as detailed in the current RF2 specification ( 4.2.4 Identifier File Specification ).

Impact

This change is not expected to have any impact on implementers of existing systems, as SNOMED International are not aware of organisations who currently represent entities from other code systems directly in SNOMED CT, as opposed to mapping to it. As such, consumption of the new file would only be required by organisations who have an interest in working with such content.

Proposed Schedule for Change

Once the format for the Identifier File has been agreed, SNOMED International would seek to include it in the International Edition no earlier than April 2023. This would also be dependent on the inclusion of content from 3rd party code systems that could be appropriately represented in this way, rather than via an existing map reference set.

Next Steps

SNOMED International request that organisations that generate or import SNOMED CT RF2 files provide feedback on this proposal via this form. SNOMED International will respond to feedback received until the end of this consultation exercise on 14 April 2023.


Responses to Feedback

At time of writing, SNOMED International have received 7 submissions of feedback.   None of the respondents said that they currently make use of the Alternative Identifier File.

Justification for Change

On the consultation question "Do you agree that the column layout of the Identifier File should be made consistent with that of other RF2 files?", some feedback received said: "This modification appears to be cosmetic and doesn't appear to articulate a clear set of strong benefits. While the cost/impact is theoretically low if all those using the RF2 spec are contacted and respond, the justification for change doesn't appear strong. I feel we should be conservative with normative specifications, and seek to change them only when we have compelling reasons."

SNOMED International response (DRAFT): We agree that the change suggested here is a minor one. However, it is not entirely cosmetic, as there are genuine reasons for wishing to have all RF2 files align. One reason is to avoid the confusion (and therefore mistakes) caused by having a different layout for just one file, and another is to be able to make use of command line tools. For example, an entire delta archive can be sanity checked for appropriate content using the Unix command:   cat * | cut -f 2,3,4,5 | sort | uniq -c    which makes it possible to check that all rows have the correct effective time and are in the expected modules, and gives a count of active and inactive rows. This is not a killer argument for sure, but we have an opportunity to cheaply bring consistency to our file layout now, or regret not doing so for evermore.

"Normative" is an HL7 idea, and a useful one, but if we consider that way of working we should also consider that whole process, which says that artefacts would not move from a maturity level of zero until there are documented cases of them being used in the real world.   To date,  SNOMED International are not aware of anyone currently using this in a production environment, so we would not classify this file format as Normative.   

Other feedback was received for this question which is covered by the above.

Additional Questions

On the consultation question "Do you have questions about the proposed changes?", some feedback received said: "I still wonder why an equivalence map is not sufficient for the same purpose. "One and the same" sounds like an equivalence to me. This may be the reason nobody is using this file."

SNOMED International response (DRAFT): Yes we agree that a map would provide the same functionality.  However, the proposed usage is intended to make a distinction between (map) two concepts being equivalent versus (id file) actually being the same entity expressed in another CodeSystem.  This has a parallel in FHIR's CodeableConcept datatype where a single concept can have more than one identifier.   The intention is to raise the prominence of these identifiers.   In implementation, we might expect to see them "front and centre" in some representation, rather than relegated to secondary display.  Similarly, these identifiers will form part of our JSON representation of a concept - returning in the "1st hit", whereas map entries would need to be requested separately.   We realise that these implementation decisions need not affect the RF2 specification, but this is as much about the optics of importance as it is about function.

Further feedback said: "'alternateIdentifier' and 'identifierSchemeId' have a conjoint meaning anyway (if I understand correctly), so better not split them and use them together as the 5. & 6. column?"

SNOMED International response (DRAFT): The intention is to have the columns align with other files, and in this case the identifierSchemeId is similar to a refsetId and appears in that position. Compound primary keys do not need to be placed sequentially. However, it is a good point that situations such as a mistake being published might lead to needing to change a published identity, and currently the referencedComponentId field is considered immutable. At that point, we lose uniqueness. This issue will be discussed at our April Business Meetings.
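To illustrate that a compound key need not occupy adjacent columns, here is a minimal Python sketch that checks for repeated keys in rows using the proposed layout. It assumes, for illustration only, that alternateIdentifier and identifierSchemeId together form the key and that the rows come from a snapshot-style view with one row per identifier; the function name and sample values are hypothetical.

```python
from collections import Counter

# Proposed column order for the Identifier File
NEW_HEADER = ["alternateIdentifier", "effectiveTime", "active",
              "moduleId", "identifierSchemeId", "referencedComponentId"]
# Assumed compound key: these columns are not adjacent in the proposed layout
KEY_COLUMNS = ("alternateIdentifier", "identifierSchemeId")

def duplicate_keys(rows):
    """Return the compound keys that occur on more than one row."""
    idx = [NEW_HEADER.index(name) for name in KEY_COLUMNS]
    counts = Counter(tuple(row[i] for i in idx) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)

# Hypothetical rows (placeholder values)
rows = [
    ["ALT-1", "20230401", "1", "M1", "S1", "1001"],
    ["ALT-1", "20230401", "1", "M1", "S2", "1002"],  # same identifier, different scheme
    ["ALT-1", "20230401", "1", "M1", "S1", "1003"],  # repeats the key of the first row
]
print(duplicate_keys(rows))
```

Looking the key columns up by name rather than by position means the check does not depend on where those columns sit in the file.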

Concerns

On the consultation question "Do you have any concerns about this proposal?", some feedback received said: "If this file is to change in a breaking manner, perhaps considering changing the identifierSchemeId to a URI would meet this goal simply and would be easily extended"

SNOMED International response (DRAFT): Absolutely, we agree that having a way to programmatically determine the URI of a Code System is desirable. We intend to achieve this using Annotations (see the work being done by the Modelling Advisory Group). The annotations solution is preferable as it potentially provides a lookup for all Code Systems, not just those that feature in this identifier file. It is also "The SNOMED way" to reference concepts using their SCTID primarily.




Comments

  1. Kai Kewley
    2024-01-02 01:21

    Yongsheng Gao How should alternate identifiers be represented in an OWL ontology? 

    This question was raised via GitHub https://github.com/IHTSDO/snomed-owl-toolkit/issues/85

    1. Yongsheng Gao
      2024-01-04 02:00

      Hi Kai Kewley  There are two possible options to represent the alternate identifiers in the OWL ontology of SNOMED CT.

      1. The alternate identifiers could be represented as classes from other ontologies. Then, equivalent classes axioms can be stated between the SNOMED CT concepts and classes from other ontologies. It is less suitable because it produces a large number of new classes from other ontologies, which could be in a flat list if they are declared directly in the ontology, or have their own hierarchies if other ontologies are imported. These active concepts and classes are considered duplicates in the OWL ontology of SNOMED CT. As a general principle, duplications are not allowed in terminologies. This approach (merging of ontologies) could be appropriate for applications that require reasoning across multiple ontologies.

      2. The SNOMED CT OWL ontology could have the alternate identifiers represented as annotation values. The annotation attribute represents the coding scheme. This approach is straightforward in implementation.  The key advantage is that preferred terms or identifiers from a particular scheme could be configured for display in the OWL tool, such as Protege. 

      1. Peter G. Williams
        2024-01-09 10:59

        In this scenario, we already have a representation of the LOINC term (as a SNOMED CT modeled concept), so all that information about LOINC parts is already present.    I'd be happy to just include the alternate identifier as an annotation.   This is also closest to what we have in RF2.   Adding in additional classes feels like creating something additional that would be over and above an expression of SNOMED CT in an OWL format.

      1. Kai Kewley
        2024-01-17 03:00

        Thanks Yongsheng Gao . Option 2 seems to be the simplest option to meet the requirement of expressing the LOINC identifiers. I will have notified watchers of the github issue.

        1. Yongsheng Gao
          2024-01-17 03:33

          Hi Kai Kewley  As suggested by Dion McMurtrie in our email communication, the alternate identifier could be the new annotation attribute. Then, the annotation values will be IRIs that include prefixes from the coding schemes. This approach would be better than having an annotation attribute for each coding scheme.

          Reply
          1. Kai Kewley
            2024-01-19 02:25

            Yes, that's better!
