Summary

SNOMED International proposes to add an additional relationship file to the International Edition of SNOMED CT which will - initially - contain numeric values for medicinal products.  At the same time, existing drug strengths and counts expressed using concepts (which represent those same numeric values) will be inactivated.  This enhancement will increase the analytical power of SNOMED CT through the use of numeric queries, and will assist interoperability by removing the need for each extension maintainer to separately add concepts representing numbers in order to publish their own drug dictionary.  SNOMED International ask that users read this briefing paper and then respond with questions and feedback via this Google form.


Introduction

SNOMED International proposes to further enhance SNOMED CT by adding the capability to express Concrete Domain values.  This enhancement will initially target improvements in the Pharmaceutical / biologic product hierarchy.  Specifically, clinical drug strengths will be supplied as true numbers, rather than through the existing workaround of concepts which represent those numbers.

Background and Rationale

SNOMED International seek to make SNOMED CT richer, more precise and more useful.   This drive is hampered when particular types of information cannot be fully represented using the existing RF2 release distribution format.   In the case of numerics - for example the weight of a particular medicinal substance used in a drug product - it is desirable to be able to represent that amount, and then be able to ask questions like "List all drugs that contain >= 100mg of Paracetamol".
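As an illustration of the kind of question quoted above, the following minimal Python sketch filters on a numeric strength. The record type, field names and product labels are hypothetical and only serve the example; they are not part of RF2 or of the International Drug Model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IngredientStrength:
    """Hypothetical flattened view of one ingredient strength."""
    product: str     # illustrative product label, not an FSN
    substance: str
    value: float     # strength held as a true number
    unit: str


# Made-up sample data for the sketch.
STRENGTHS = [
    IngredientStrength("Paracetamol 500 mg tablet", "Paracetamol", 500, "mg"),
    IngredientStrength("Caffeine 65 mg + paracetamol 500 mg tablet", "Paracetamol", 500, "mg"),
    IngredientStrength("Caffeine 65 mg + paracetamol 500 mg tablet", "Caffeine", 65, "mg"),
    IngredientStrength("Paracetamol 80 mg suppository", "Paracetamol", 80, "mg"),
]


def products_with_at_least(rows, substance, minimum, unit):
    """Return products containing at least `minimum` `unit` of `substance`."""
    return sorted({
        r.product
        for r in rows
        if r.substance == substance and r.unit == unit and r.value >= minimum
    })


matches = products_with_at_least(STRENGTHS, "Paracetamol", 100, "mg")
print(matches)
```

The comparison `r.value >= minimum` is only possible because the strength is a number; with number concepts, each concept would first have to be mapped back to its numeric value.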

A workaround for this particular attribute was put in place in 2017, namely to create a concept to represent every number needed in the International Drug Model.   This solution results in fully correct classification and has allowed the Drug Model to progress in its implementation.  However, it has several drawbacks:

  1. A new number concept must be created every time a new one is required.  National Release Centres (NRCs) may wish to use numbers that are not contained in the International Edition, and indeed many NRCs maintain substantially sized drug dictionaries so their burden is increased.  Each NRC would then be creating their own number concepts, some of which might then appear in the International Edition, causing further work to resolve.
  2. SI and Implementers need to take care to normalize drug strengths to avoid unwanted equivalencies.   For example, the SI drug model would always represent 0.5g as 500mg since the classifier would not recognise those two strengths as being the same thing - they're represented by different concepts.   Using concrete domains allows equivalences between different units to be detected.
  3. Users are not currently able to directly ask questions that involve numeric operations ( > , < , = ).   Number concepts currently need to be transformed into true numbers before this could be done.
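The normalisation mentioned in point 2 can be sketched outside the classifier as a plain unit conversion. This is only an assumption-laden illustration: the conversion table and function name are hypothetical, and nothing here implies that a classifier performs this step by itself.

```python
from decimal import Decimal

# Hypothetical conversion factors to milligrams (illustrative only).
TO_MG = {
    "g": Decimal("1000"),
    "mg": Decimal("1"),
    "ug": Decimal("0.001"),
}


def normalise_to_mg(value: str, unit: str) -> Decimal:
    """Convert a strength to milligrams using exact decimal arithmetic."""
    return Decimal(value) * TO_MG[unit]


half_gram = normalise_to_mg("0.5", "g")
five_hundred_mg = normalise_to_mg("500", "mg")

# Once both strengths share a unit, numeric comparison is direct.
print(half_gram == five_hundred_mg)
print(half_gram >= Decimal("100"))
```

`Decimal` is used rather than `float` so that values such as 0.001 are held exactly, which keeps equality checks between normalised strengths predictable.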

There are other types of concrete value that are desirable, such as strings and booleans which are used already, for example, in the Singapore Drug Dictionary.  This proposal would also add the capability to express these additional data types.

Proposed Changes to RF2

The Modeling Advisory Group proposes to add a new file to the international release called, for example: SnomedCT_InternationalRF2_Production_20210731T120000Z/Full/Terminology/sct2_RelationshipConcreteValues_Full_INT_20210731.txt, with counterparts in the Snapshot and Delta folders.

This file would contain the following fields (with example rows & FSNs added for clarity):

id          effectiveTime  active  moduleId            sourceId   value  relationshipGroupId  typeId      characteristicTypeId  modifierId
1234560020  20210731       1       900000000000207008  348315009  #2     0                    1234567809  900000000000011006    900000000000451002
7075526020  20210731       1       900000000000207008  348315009  #65    1                    2234567809  900000000000011006    900000000000451002
7075529029  20210731       1       900000000000207008  348315009  #500   2                    2234567809  900000000000011006    900000000000451002

FSNs for the identifiers used in these rows:
348315009 |Product containing precisely caffeine 65 milligram and paracetamol 500 milligram/1 each conventional release oral tablet (clinical drug)|
1234567809 |Count of base of active ingredient (attribute)|
2234567809 |Has presentation strength numerator value (attribute)|
900000000000011006 |Inferred relationship (core metadata concept)|
900000000000451002 |Existential restriction modifier (core metadata concept)|

The new concrete values file will use the same sort of relationship SCTIDs that the current file uses (see footnote 1).

Existing attribute types that take concepts as numbers for values will be retired and replaced with new attributes using the same FSN, but taking a concrete value as the attribute target.   This will mean that all relationships using the original attribute types will be inactivated in the inferred relationship file, and replaced with new rows in the new relationship concrete values file.    There will be an equivalent change in the stated view, but since OWL is already capable of supporting concrete values, this will only result in changes within the OWL expressions.   There will be no structural changes in the stated view.

The range of types of the values that can be held are as specified in SNOMED Compositional Grammar.  That is:  boolean, numbers and strings.  The particular type of value used for each relationship type will be specified by the MRCM.  See MRCM for Concrete Domains.  Boolean will be restricted to true | false to avoid confusion with numeric types - always lower case.

To maintain structural compatibility with the existing relationship file, the various columns will have the same layout and usage.  The value column in the concrete values file replaces the target (destinationId) column in the relationship file.  Otherwise, the two files will be identical, such that they could be loaded into the same data structures if an implementer chose to do so.
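A minimal Python sketch of loading such a file is given below, assuming it is tab-delimited like other RF2 files and uses the column names listed in the example table above. The sample rows follow this proposal's examples as read here; the two typeId values appear to be placeholders rather than published identifiers, and the function name is hypothetical.

```python
import csv
import io

# Column names as listed in the proposal's example table.
COLUMNS = [
    "id", "effectiveTime", "active", "moduleId", "sourceId",
    "value", "relationshipGroupId", "typeId",
    "characteristicTypeId", "modifierId",
]

# Illustrative rows based on the proposal's examples (assumed field split).
SAMPLE_ROWS = [
    ["1234560020", "20210731", "1", "900000000000207008", "348315009",
     "#2", "0", "1234567809", "900000000000011006", "900000000000451002"],
    ["7075526020", "20210731", "1", "900000000000207008", "348315009",
     "#65", "1", "2234567809", "900000000000011006", "900000000000451002"],
    ["7075529029", "20210731", "1", "900000000000207008", "348315009",
     "#500", "2", "2234567809", "900000000000011006", "900000000000451002"],
]

SAMPLE_FILE = "\n".join("\t".join(f) for f in [COLUMNS, *SAMPLE_ROWS]) + "\n"


def read_active_rows(stream):
    """Return the active rows of a tab-delimited RF2-style file as dicts."""
    # QUOTE_NONE keeps any double quotes around string values untouched.
    reader = csv.DictReader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [row for row in reader if row["active"] == "1"]


rows = read_active_rows(io.StringIO(SAMPLE_FILE))
for row in rows:
    print(row["sourceId"], row["typeId"], row["value"], row["relationshipGroupId"])
```

Because only the value column differs from the existing relationship file, the same reader could in principle be pointed at either file, with the caller choosing which column to treat as the target.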

Type Indicator Symbols

In order to simplify implementations such that they do not have to parse the MRCM to determine the type of any given concrete value, numeric values will be prefixed by a hash (#) symbol and string values will be surrounded with double quote characters (").   These symbols will be used in both the RF2 concrete values file and the MRCM constraints.   This also follows the convention set by SNOMED Template Language.  Boolean values will not use any symbols, just true or false (lower case).   Any double quote characters that appear within strings will be escaped by a backslash (\) symbol; likewise, backslashes will need to be escaped by another backslash (\\).
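The type indicator rules above could be decoded along the following lines. This is a sketch under stated assumptions: the function names are hypothetical, numbers are returned as `Decimal`, and any escape sequence other than `\"` and `\\` is rejected, which the proposal does not explicitly require.

```python
from decimal import Decimal


def _unescape(body: str) -> str:
    """Resolve \\" and \\\\ escapes inside a quoted string value."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            # A backslash must be followed by a quote or another backslash.
            if i + 1 >= len(body) or body[i + 1] not in ('"', "\\"):
                raise ValueError("invalid escape sequence")
            out.append(body[i + 1])
            i += 2
        elif ch == '"':
            raise ValueError("unescaped double quote inside string")
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def parse_concrete_value(raw: str):
    """Return a bool, Decimal or str depending on the type indicator."""
    if raw in ("true", "false"):          # booleans: bare, lower case
        return raw == "true"
    if raw.startswith("#"):               # numbers: hash prefix
        return Decimal(raw[1:])
    if len(raw) >= 2 and raw[0] == '"' and raw[-1] == '"':
        return _unescape(raw[1:-1])       # strings: double quoted
    raise ValueError(f"unrecognised concrete value: {raw!r}")


print(parse_concrete_value("#500"))
print(parse_concrete_value("false"))
print(parse_concrete_value('"a \\"quoted\\" name"'))
```

Dispatching on the first character means the reader never needs the MRCM to decide how to interpret a value, which is the simplification the symbols are intended to provide.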

Note that there will be no language support for translations of String values.   If multiple co-existing translations of a given String are required, it is suggested that these be modelled using Description components.  This is consistent with the use case of Strings for branded drugs.   When a drug product has a different name in another country, it is considered to be a different product.

Proposed changes to the SI Drug Model 

Following established principles of retiring concepts when there is a change in meaning, SNOMED International will inactivate the existing strength / concentration attributes which use concepts-as-numbers, replace them with new ones (using the same FSNs), and switch the target/value to the corresponding concrete numeric.   This change will be done programmatically as a technical solution, which will be made generally available for NRCs and other generators of SNOMED content to use.

The attribute type concepts which will be replaced are as follows (italics indicate attributes not currently used by International Drug Model):

766953001 |Count of active ingredient (attribute)|
732944001 |Has presentation strength numerator value (attribute)|
733724008 |Has concentration strength numerator value (attribute)|
766952006 |Count of base of active ingredient (attribute)|
732946004 |Has presentation strength denominator value (attribute)|
733723002 |Has concentration strength denominator value (attribute)|
784276002 |Count of clinical drug type (attribute)|
766954007 |Count of base and modification pair (attribute)|
774161007 |Has pack size (attribute)|

Classification

With the changes introduced by SNOMED International in 2018, classification of SNOMED CT is performed directly from OWL.   OWL supports concrete domains, and so the existing "stated" OWL reference set is capable of representing concrete domains without further modification.   Classifiers also, in general, support concrete domains.  The two areas where changes will be required are:

  1. In Tooling, to allow authors to enter a concrete value rather than a concept as the target of a relationship.
  2. In the distribution archive, adding a new relationship file to allow the concrete values to be used (see footnote 2).

Impact

Your Current or Planned Usage

Impact of Change

Detail of Impact

No use of International Drug Model, or usage limited to using descriptions and hierarchical structure.

There is no need to load the new concrete value relationship file for this use case.

SNOMED International Managed Service users and users of SI open source software.

SNOMED International tooling will be enhanced in good time to support the proposed changes.

Maintenance of drug concepts which form part of the 373873005 |Pharmaceutical / biologic product (product)| hierarchy.

Extension concepts in this hierarchy will need to have their strengths and ingredient counts expressed using concrete values in the Stated OWL refset, in order to properly classify and subsume appropriately. The concrete value relationships will need to be split into their own file in order to be compatible with the International Edition.

SNOMED International software such as Snowstorm and the SNOMED OWL Toolkit will be updated to perform these steps automatically.

An analysis of the changes that will be required to the RF2 files has been performed in terms of the likely number of rows affected.  See Concrete Domains RF2 Impact Analysis.


Proposed Schedule for Change

The table below shows the features that will be introduced over the next few releases of the SNOMED CT International Edition. This is an optimistic timeline that has been driven by the desire to fulfil obligations for successful delivery of work being done in the Pharmaceutical / biologic product and Substance hierarchies.

 

International Release: 20210131

  Stated OWL Axiom file changes: Unchanged

  Inferred Relationship file changes: Unchanged

  Additional Features in International Release: None

  Technical preview: A technical preview will be produced, with drug concept strengths and counts expressed as concrete values in the new file. Expected publication prior to February 2021.

International Release: 20210731

  Stated OWL Axiom file changes: Existing drug concept axioms will be updated to use concrete values. Existing strength and count attribute types will be inactivated, replaced with new ones using the same FSNs.

  Inferred Relationship file changes: Existing drug concept strength and count relationships inactivated. Existing strength and count attribute types will be inactivated, replaced with new ones using the same FSNs.

  Additional Features in International Release: New separate concrete value relationship file will express these same attributes using numeric values. New attributes will be used, although they will have the same FSNs as the current attributes. MRCM will include new rows to indicate that the new attribute types are expected to take a concrete domain - specifically numbers - as target values.

  Technical preview: N/A

Next Steps

SNOMED International request that users of SNOMED CT provide feedback on this proposal via this form. SNOMED International will respond to feedback received on this page until the end of this consultation exercise on 31 December 2019. 



Footnotes

1 Technically, this means using Partition 02 for these concrete value relationship SCTIDs in the International Edition, as we do for all relationship types.   Extensions will continue to use partition 12. 

2 SNOMED International seek to minimise disruption to consumers of SNOMED CT and here propose adding new, optional artefacts, rather than modifying existing structures.
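
Footnote 1 refers to the partition identifier, which occupies the second- and third-to-last digits of an SCTID (the final digit being the check digit). A minimal Python sketch of reading it follows; the function names are hypothetical, the check digit is not validated, and the extension-style identifier in the usage lines is made up for illustration.

```python
RELATIONSHIP_PARTITIONS = {"02", "12"}  # International and extension relationships


def partition_id(sctid: str) -> str:
    """Return the two-digit partition identifier of an SCTID."""
    if not sctid.isdigit() or not 6 <= len(sctid) <= 18:
        raise ValueError(f"not a plausible SCTID: {sctid!r}")
    # Last digit is the check digit; the two before it are the partition.
    return sctid[-3:-1]


def is_relationship_id(sctid: str) -> bool:
    """True when the SCTID sits in a relationship partition."""
    return partition_id(sctid) in RELATIONSHIP_PARTITIONS


print(partition_id("1234560020"))         # example row id from this proposal
print(is_relationship_id("348315009"))    # a concept id, partition 00
print(is_relationship_id("19999999125"))  # made-up extension-style id
```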

Comments

  1. Dion McMurtrie
    2020-07-16 01:01

    Rationale 2 SI and Implementers need to take care to normalize drug strengths to avoid unwanted equivalencies. For example, the SI drug model would always represent 0.5g as 500mg since the classifier would not recognise those two strengths as being the same thing - they're represented by different concepts. Using concrete domains allows equivalences between different units to be detected.

    The above isn't quite right. The classifier (under the proposal) won't know that 0.5g and 500mg are the same thing unless you tell it. Therefore normalisation is still needed for the drug model unless GCIs are created to "tell" the classifier about the equivalence of all these measurements (more than just the scalar).

    1. Toni Morrison
      2020-07-20 02:01

      Hi Dion McMurtrie and Peter G. Williams

      We currently have a QA query that checks for clinical drug strengths that are not normalized per our EdG. The only two outliers right now are related to medical gases (an area which the WG is addressing for the 2021-Jan Release) and a previously reported rounding issue (e.g. 0.67 vs 0.667). Dion - if you've found other examples of inconsistency, please let us know so we can track down why they weren't identified in the QA report.

      As far as the classifier knowing that 0.5 g and 500 mg are the same, I'd ask for that to be implemented in a way that is not subject to human error (e.g. I don't want to have to tell the classifier that 0.5 g and 500 mg are the same over and over and over again - it should be "once and done").

      Toni

      1. Brian Carlsen
        2020-07-21 03:40

        One thing I've been thinking about related to Toni's 0.67 vs 0.667 is that it might be desirable to make explicit the defined "precision" of numbers associated with attributes (and even potentially the rounding strategy).  Real world numbers can sometimes have differing levels of precision than the terminology does and it should be clear whether that means a certain SNOMED code can be used vs. a need for a post coordinated expression with slightly different significant digit.  (e.g. if you've got 0.667 and the terminology has something with 0.67 - and you're trying to find a SNOMED code, it may be appropriate to use 0.67 if you know that the attribute with that value has a precision of 2 digits and a rounding strategy of going up to the next # if following digit is > 5).

        It also means that 0.600 can actually legitimately indicate that it is different than 0.615 which using a precision of 1 digit and rounding would produce the value 0.6 in both cases.  

        1. Matt Cordell
          2020-07-30 11:51

          So AFAIK these numbers only get introduced when products are marketed with strengths like 50mg/3mL. There's been endless discussion about what that denominator does/doesn't mean (smile).

          The error tolerance listed for drugs can vary between different ingredients.

          But I've seen it suggested that USP monograph lists an acceptable weighing error of <5%. And "electronic balance ... most commonly used in prescription compounding has a readability of 0.001 g; consequently, the least amount that can be weighed is 20 times that, or 20 mg."

          So modelling to 3 decimal precision seems reasonable. I don't think a 1microgram/0.1% rounding error in modelling is introducing an unacceptable risk.  Something modelled in micrograms, might be "off" by 1 nanogram.

          The rounding should just be applicable when generating non-normalised descriptions. We just need clear rules about rounding, and 'non-normalising'.

          Also the range of product strengths in reality, isn't a continuum. Available strengths are somewhat discrete values, even across brands. (e.g. 25, 50, 100mg).



          1. Dion McMurtrie
            2020-08-03 11:02

            FYI AMT uses significant digits (nod to floating point representation) - 13 actually, I just looked it up in section 8.4.4.1: https://www.healthterminologies.gov.au/library/DH_3243_2020_SNOMED-CT-AU_Australian-Technical-Implementation-Guide_v2.5.pdf?_filename=DH_3243_2020_SNOMED-CT-AU_Australian-Technical-Implementation-Guide_v2.5.pdf. There was of course all sorts of clinical/technical review into this decision which I loosely remember from the time.

            The idea was that this should be sufficient accuracy and predictable behaviour for implementers to use and classification results to be accurate, and that the precision was proportional to the size of the measurement unlike a specific number of decimal points which makes precision vary wildly depending upon the magnitude of the number.

            1. Matt Cordell
              2020-08-04 01:28

              sorry, I forgot about 'significant figures'.

              I still think that "13" is excessive though. And an artefact of the AMT rules about switching units, and not having more than 3 digits to the left of a decimal, modeling vs terming rules, people involved etc.

              Some extreme examples in AMT:

              • "0.006567366283006 mg/mL" zinc sulfate heptahydrate in Smofkabiven injection, 1.477 L bag. The term uses "9.7 mg / 1.477 L"
                Denormalised = 9.699999999999862mg/L
                If AMT only modelled to "0.00657" - Denormalised = 9.70389mg/L
              • "0.2166666666667 ug/mL" hyoscine hydrobromide in Donnagel oral liquid, 30 mL. The term uses "6.5 ug/30 mL"
                Denormalised = 6.500000000001ug/mL
                If AMT only modelled to 0.217 → 6.51
              • "5988.023952096 ug/mL" Anzatax 100 mg/16.7 mL injection
                Denormalised = 100.0000000000032mg/mL
                If "5988.024" → 100.0000008mg/mL
                Even If "5988" → 99.9996mg/mL

              So I expect 3 significant decimals is probably sufficient. 6 almost certainly is...
              I don't think any products are 'labeled' with numbers containing more than 3 decimals and 6 significant figures.

              Anyway, It's a messy topic, with all sorts of rounding rule options etc. I think the key is just documenting the rules chosen, and consistency.

              1. Toni Morrison
                2020-08-04 02:56

                Hi Matt Cordell and Dion McMurtrie and Yongsheng Gao

                The current EdG for the International Release is to round repeating decimals to three decimal places. If part of the MAG recommendation is to revisit this decision, please let me know so that I can solicit input from the Drug WG and we can make any changes at the same time we implement concrete domains.

                Thanks,

                Toni

  2. Dion McMurtrie
    2020-07-21 07:53

    Hi Toni Morrison ,

    My comment above was really saying that point 2 of the rationale in this document isn't really right. Concrete domains don't give you 0.5g == 500mg for free; I wasn't commenting on any specific content.

    I did experience issues with unit conversion and rounding between SNOMED CT, RxNorm and AMT last year but that was an alignment issue between these code systems. I didn't look extensively for consistency within SNOMED CT content at the time or since so I'm unaware of any inconsistencies.

    As far as I can see there's two approaches

    1. don't ever have 0.5g and 500mg in the stated form but rather pick one and be consistent, which is ideally automated in the authoring tooling. This is what AMT does (rules are a bit more complex than this), and therefore there's no need to tell the classifier that 0.5g is 500mg.
    2. allow definition of concepts using all variations of 0.5g, 500mg etc., detect these and automate creation of GCIs which declare them equivalent. This is what I did for AMT/RxNorm/SNOMED CT in the presentation at the expo last year to integrate them. Sadly this wouldn't be a once off exercise and would need to be recalculated whenever a quantity was added to a concept that had not been used before.

    Either way you need automation etc to manage it well. Of course option 1 means picking a consistent set of editorial rules which would then need to be adopted and rigorously adhered to by anyone trying to extend on the content to ensure correct classification results. Option 2 also has the added advantage (that I used in my expo experiment last year) that expressions could be stated as 0.5g or 500mg and returned the same result because the classifier knew they were equivalent.

    Of course there was also the proposal to define compound named "quantity" concepts such as 500mg and use concrete domains in their definition and GCIs to each other, and then use relationships from the drug concepts to these compound quantity concepts. That was another option on the table at one point, which is similar to the above.

    Dion
