The SNOMED International Release Manager Andrew Atkinson raised the following questions on behalf of the Terminology Release Advisory Group:

  1. Roughly how many relationships using the original attribute types will be inactivated in the inferred relationship file?
  2. Again, how many of the OWL Axioms are likely to change in the OWLExpression file?
  3. What’s the proposed estimated number of records in the new Concrete Rel file?
  4. Understand why we’re choosing true/false instead of 1/0 for booleans, but this does contradict the current usage in most RF2 files - is the likelihood of clashing with an actual 1 or 0 high enough to warrant this (soft) break from the standard?

These questions will be answered here in turn.

Relationship Inactivations (working from the 20200731 International Release)

The following attribute types used in the International Drug Model will be inactivated (as well as 2 additional counts proposed for use by National Drug models in italics):

766952006 |Count of base of active ingredient (attribute)|
732944001 |Has presentation strength numerator value (attribute)|
733725009 |Has concentration strength numerator unit (attribute)|
766953001 |Count of active ingredient (attribute)|
732946004 |Has presentation strength denominator value (attribute)|
733722007 |Has concentration strength denominator unit (attribute)|
784276002 |Count of clinical drug type (attribute)|
774161007 |Has pack size (attribute)|
766954007 |Count of base and modification pair (attribute)|

These are simple concepts so the inactivation of the attribute types themselves will result in 9 relationship inactivations of the IS As to the parent concept.

The concepts that use these attribute types will contribute 28,218 inactivations (query).

OWL Axiom Changes

Medicinal products do not use additional axioms containing numbers (they do use them for product roles, but these aren't counted above), so we can say that - as far as numerics are concerned - they have all been modelled using one axiom per concept.  By querying the number of unique source concepts that are affected in the relationship inactivations, we obtain the number of axioms in the stated view that will need to change.   Answer: 13,515 (query)
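The two counts above (relationship rows to inactivate, and distinct source concepts whose axiom must change) can be sketched as follows. This is only an illustration, not the query linked above: it assumes the standard RF2 relationship column layout, uses a subset of the attribute types listed earlier, and runs over a tiny in-memory sample whose relationship and concept identifiers are made up.

```python
import csv
import io

# Standard RF2 relationship file columns.
COLUMNS = ["id", "effectiveTime", "active", "moduleId", "sourceId",
           "destinationId", "relationshipGroup", "typeId",
           "characteristicTypeId", "modifierId"]

# Subset of the attribute types listed above, for illustration only.
NUMERIC_ATTRIBUTE_TYPES = {"766952006", "732944001", "732946004"}


def count_affected(rf2_text, attribute_types):
    """Return (active rows using the attribute types, distinct source concepts)."""
    reader = csv.DictReader(io.StringIO(rf2_text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    row_count = 0
    sources = set()
    for row in reader:
        if row["active"] == "1" and row["typeId"] in attribute_types:
            row_count += 1
            sources.add(row["sourceId"])
    return row_count, len(sources)


def make_row(rel_id, active, source, type_id):
    """Build one tab-separated row; every identifier passed in is invented."""
    return "\t".join([rel_id, "20200731", active, "MODULE", source,
                      "DEST", "1", type_id, "INFERRED", "SOME"])


SAMPLE = "\n".join([
    "\t".join(COLUMNS),
    make_row("r1", "1", "drugA", "766952006"),
    make_row("r2", "1", "drugA", "732944001"),
    make_row("r3", "1", "drugA", "732946004"),
    make_row("r4", "1", "drugB", "766952006"),
    make_row("r5", "1", "drugB", "732944001"),
    make_row("r6", "1", "drugB", "116680003"),  # IS A row, not a numeric attribute
    make_row("r7", "0", "drugC", "766952006"),  # already inactive, ignored
])

rows, concepts = count_affected(SAMPLE, NUMERIC_ATTRIBUTE_TYPES)
print(rows, concepts)
```

On the sample, five rows would be inactivated but only two concepts are affected, which mirrors why the axiom count is much lower than the relationship count.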

I'll cross check that with ECL, which also gives 13,515 (although it is interesting to note that the answer rises to 13,561 in the daily build at the date of writing, 29 June 2020, which gives an indication of where we are likely to be by the time of the January 2021 International Release). In fact, since all clinical drugs will also have a count of base of active ingredient, I could have gotten away with just checking for the presence of 766952006 |Count of base of active ingredient (attribute)|, which gives the same count.

New Concrete Values Relationship File

Every relationship inactivated in the move to using concrete domains will be replaced by an equivalent one, so the number of new rows in the new concrete values relationship file will also be of the order of 28,218. (Running against the daily build today, 29 June 2020, the number suggested would be 28,335, so given the time-frame involved it seems unlikely that we'd breach 30K.)

Boolean Values - Choice of true/false  vs 0/1 as currently used eg by the active field

Well, the type of the concrete value (String, Number, Boolean) will not be obvious from the file itself. The MRCM will be needed to be sure what type is being used. So whether you're looking at an ingredient count of 1 or a "true" value would not be obvious. Having true and false as values is less ambiguous in this regard, although you could still be looking at a String.

I will ask the MAG to revisit their thinking on this question, given your point, Andrew Atkinson, about us having set a precedent with the active field. The question is academic to an extent because we do not currently use any boolean values in the International Drug Model and have no plans to do so. This is more likely to affect Singapore, who definitely use them, or perhaps Australia.


Comments

  1. Michael Lawley
    2020-06-29 10:18

    Regarding the statement

    the type of the concrete value (String, Number, Boolean) will not be obvious from the file itself

    In SNOMED Compositional Grammar and ECL there is no ambiguity because # precedes all numbers and strings are always quoted.  Requiring knowledge of the MRCM to distinguish types seems like a backwards step.

    true / false are ALWAYS Boolean and unambiguous because a String would be "true" / "false" and a Number would be #1 / #0


    Also, as it stands, the proposal SNOMED International Proposal for Representing Concrete Domains in RF2 is lacking a complete example (with both String and Boolean cases) and the SCG/ECL syntax for Numbers is not being used.

    I strongly suggest that the SCG/ECL syntax is used in the value column to avoid ambiguity and simplify processing for tooling implementers.
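As a rough sketch of how a tool could recover the type from the value alone under the convention described in this comment (# before numbers, double quotes around strings, bare true / false for Booleans): the function name is hypothetical, and escape sequences inside quoted strings are deliberately not handled.

```python
from decimal import Decimal


def parse_concrete_value(raw):
    """Return (type_name, value) for an SCG/ECL-style concrete value string.

    Hypothetical helper: it only recognises the three forms discussed here
    and rejects anything else instead of guessing.
    """
    if raw.startswith("#"):
        return "Number", Decimal(raw[1:])
    if len(raw) >= 2 and raw[0] == '"' and raw[-1] == '"':
        return "String", raw[1:-1]
    if raw in ("true", "false"):
        return "Boolean", raw == "true"
    raise ValueError(f"cannot determine type of {raw!r}")


for example in ["#1", '"true"', "true", "#0.5"]:
    print(example, "->", parse_concrete_value(example))
```

A bare 1 or 0 is rejected by this sketch, which is the ambiguity the prefix and quoting are meant to remove: no MRCM lookup is needed to tell the three types apart.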


    1. Peter G. Williams
      2020-07-15 04:19

      Good, thanks for this clear guidance Michael Lawley. I've made the changes as you suggest, although I've yet to put forward any examples for Strings and Booleans.   Ideally these would come from someone familiar with the edition in Singapore which uses them.

  2. Mikael Nyström
    2020-06-29 10:42

    It is good work to be able, already on 29 June, to query the daily build for 29 July! (smile)

    1. Peter G. Williams
      2020-06-30 07:50

      Well spotted!  Fixing.

  3. Dion McMurtrie
    2020-07-16 01:14

    I've given this feedback so many times I can't remember if it has gone to the latest consultation (and now I've just copied and pasted it again...)

    Two issues leap out

    1. you have no "operator" field to be able to express things like "equal to" or "less than" - this was discussed at the last face to face meeting
    2. you're going to have difficulties with strength comparisons if you split numerators and denominators for ratios into two separate but grouped concrete domain properties. Really these are a compound value: one "strength" property which has a ratio value, rather than two properties.

    1. Peter G. Williams
      2020-07-16 03:48

      These are really good points Dion McMurtrie - thanks for giving us another chance to capture the discussion in a format that's more accessible than fast forwarding through a 2 hour video recording! I'll tag Toni Morrison here in case she'd like to comment further or disagree with me - please do if that's the case Toni, I'm not an expert here.

      1. Yes, I understand that there are drugs whose strength is expressed like < X micrograms, but my understanding was that since we're intentionally not representing those in the International Edition, we're looking at some significant complexity for something that we're not going to use. A comparison is not a numeric. Well past the 80/20 rule, more like the 99.9/0.1. However, that's obviously not very accommodating for National Drug Models that might need this functionality. Does AMT feature and express such product strengths? This is part of a wider discussion about the alignment between the International Drug Model and AMT which we should perhaps continue elsewhere and include Matt Cordell. I understand that we do have comparison operators in ECL so we can say "give me all drugs with less than 100mg of X", which is very exciting (IMHO) and a large part of the justification for needing concrete domains.
      2. Yes we're splitting up a ratio which is really one number, but the numerator and denominator are so essentially bound to their units (like per 1 tablet) that it would be problematic to express it otherwise....well I guess we could have a combination unit eg mg/tablet.   I think we're going to hit problems either way with this, but forming a single number does seem like a reduction in the amount of information available;  we can calculate a single ratio value from the component parts, but we wouldn't be able to get back from there.
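To make the information-loss point in item 2 concrete, here is a minimal sketch. It assumes the numerator and denominator values arrive as plain decimal strings and that the units already match; the function name is made up for illustration.

```python
from fractions import Fraction


def strength_ratio(numerator_value, denominator_value):
    """Collapse a numerator/denominator pair into one exact ratio."""
    return Fraction(numerator_value) / Fraction(denominator_value)


# Two different presentations (values invented for the example).
vial = strength_ratio("500", "10")   # 500 mg per 10 mL
ampoule = strength_ratio("50", "1")  # 50 mg per 1 mL

# Both collapse to the same single value, so the original pair
# cannot be recovered from the ratio alone.
print(vial, ampoule, vial == ampoule)

# The ratio is still easy to derive on demand for comparisons.
print(vial < 100)
```

Keeping the two parts and computing the ratio when needed preserves both views, whereas storing only the ratio would discard the per-presentation detail.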

      1. Matt Cordell
        2020-07-20 04:39

        AMT use the Concrete Domain spec developed (about 8/10?) years ago, that does include an operator field. To date, we've only used the "Equal To" value. Though there are products that could/should take advantage of this, e.g.:

        * cannabidiol 25 mg / 1 mL + delta-9-tetrahydrocannabinol less than 2 mg/mL oral oil
        * OncoTICE contains between 2 to 8 × 10^8 colony forming units ... (https://www.ebs.tga.gov.au/ebs/picmi/picmirepository.nsf/pdf?OpenAgent&id=CP-2010-PI-06556-3&d=202007201016933)
        * Creon "contains pancreatic extract 150 mg equivalent to not less than 10,000 Ph.Eur. units lipase, 8,000 Ph.Eur. units amylase and 600 Ph.Eur. units protease" (https://www.ebs.tga.gov.au/ebs/picmi/picmirepository.nsf/pdf?OpenAgent&id=CP-2011-PI-02780-3)

        Agree these are edge cases. Though even if it's not in scope for INT, the spec should be useful for extensions also.

        However, there is much greater application for concrete domains beyond just medicines. Clinical Findings would make significant use of all this too:

        * 373159004|pN1a: Metastasis in 1 to 3 axillary lymph nodes (at least one tumour deposit greater than 2.0 mm) (breast)|
        * 310252000|Body mass index less than 20|
        * 631000119102|Chronic back pain greater than three months duration|

        These might not be in scope "yet", but seem reasonable to expect them to take advantage of these DL enhancements too.

        1. Michael Lawley
          2020-07-20 06:10

          I would carefully evaluate whether using inequalities in axioms provides real added value (rather than just intellectual completeness) as there are significant constraints on what inequalities can co-exist and still remain in EL++

          Ranges are also problematic.


          1. Matt Cordell
            2020-07-21 04:37

            I love that phrase Michael  "intellectual completeness" but agree.

            I can't think of a compelling use case that requires this level of modelling for some of these concepts. But just wanted to make sure all examples were on the table as part of the decision.

        1. Daniel Karlsson
          2020-07-20 02:52

          Regarding the clinical findings, the model needed to represent that the count of axillary lymph nodes in which a breast tumor metastasis whose size is in (2.0, +∞) is found is in [1, 3] seems over the top (at least partly doable-ish with interprets-has interpretation pairs and some way of coordinating the extensive nesting needed...). Also, OWL EL (as Michael Lawley said) has limited expressiveness: https://www.w3.org/TR/owl2-profiles/#Data_Ranges. In particular, it does not allow DatatypeRestriction such as those asked for here.

          Just found these papers: 

          https://ora.ox.ac.uk/objects/uuid:e92ca367-c1ec-4a43-8326-9217bbddcb3d/download_file?safe_filename=MagKazHor11NDR_JAR.pdf&file_format=application%2Fpdf&type_of_work=Journal+article

          http://ceur-ws.org/Vol-1015/paper_11.pdf

          So it seems EL++ does allow at least the simple DatatypeRestrictions requested here with a single less than or larger than restriction.

          1. Peter G. Williams
            2020-07-20 03:33

            Thanks all.   I think it's really good to have this discussion documented and to show that we've given it due consideration, but for the moment it seems that we're talking about a very small number of edge cases of uncertain real world benefit being pitted against definite high costs in terms of work and increased complexity.

            1. Dion McMurtrie
              2020-07-21 08:07

              I think this comes down to providing a way to represent the expressiveness of the stated form in this extended NNF form or not, and whether to do that pre-emptively or not.

              I can't speak directly to a high cost of work or complexity as I don't know what would be involved for SNOMED International's tooling to implement it, but it doesn't seem outrageously difficult to add a column that would at present always have a single value in it. That also doesn't seem particularly complex.

              Whether that would ever be used (have a different value in that column) or not in the future is another question. However I think the complexity about combinations of these values falls to the rules for authoring of the stated form, not this file. This file is simply part of the NNF rendering of the classified stated form.

              It seems we have a choice

              1. omit this column from the file and deal with the consequences if we later decide we do want to use other than data value equality in future, which would be
                1. add the column to the file at that point as a breaking change to the format
                2. leave the column out as another "lossy" part of the NNF rendering - however I think that could be pretty misleading seeing the value without the operator context if it isn't equality, and as the "substrate" for ECL is the NNF it would cause a loss of features or erroneous behaviour
              2. add in the column so we can accommodate other than value equality in the stated form if/when that happens. This runs the obvious risk that it never gets used, which it hasn't so far in AMT.

              I think really only 1a and 2 are serious options. Which means an extra column now to eliminate a possible breaking change later, or take the chance that we're not going to use that feature in the lifespan of this file format.
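A small sketch of what option 2 could look like to a consumer, and why dropping the operator (option 1b) could mislead. Everything here is hypothetical: the row layout, the field names, and the operator symbols are illustrative only and are not taken from the proposal or from AMT.

```python
import operator
from dataclasses import dataclass

# Hypothetical operator symbols mapped to comparison functions.
OPERATORS = {
    "=": operator.eq,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


@dataclass(frozen=True)
class ConcreteValueRow:
    source_id: str
    type_id: str
    value: float
    op: str = "="  # hypothetical column; always "=" if only equality is used

    def admits(self, candidate):
        """True if the candidate value satisfies the stated constraint."""
        return OPERATORS[self.op](candidate, self.value)


# Invented rows: one exact strength, one "less than 2" strength.
exact = ConcreteValueRow("productA", "strength", 2.0)
upper_bound = ConcreteValueRow("productB", "strength", 2.0, "<")

print(exact.admits(2.0))        # exact value matches
print(upper_bound.admits(2.0))  # 2.0 is not less than 2.0
print(upper_bound.admits(1.5))  # 1.5 satisfies the bound
```

If the operator column were dropped, both rows would show the value 2.0 and the second would read as "exactly 2", which is the misleading case described in option 1b.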

              Seems you want to do number 1?


              FYI you can see more chatter on this at Re: URGENT: CONCRETE DOMAINS Consultation. This feedback seems to get really fragmented and lost - like I said I feel like I've given this feedback many times in many places but I can't even begin to find them all anymore. That's more of a process issue.

      1. Toni Morrison
        2020-07-20 01:51

        Hi Peter G. Williams

        Regarding products with strength ranges, we haven't had requests to add such a product. The EdG clearly states that these products will be added as primitive products so even if added, we wouldn't be representing the strength. There were so few products where this applied, the ability to define strength ranges was agreed to be out of scope when the drug model was created. This does not mean that we cannot revisit the topic if there is sufficient interest/need from extensions.

        During the development of the drug model, we did discuss making the "compound strength" but elected not to do that at the time. Given that the content is now remodeled with discrete attributes, it seems like it would be a reasonable effort to do so if it provides sufficient benefit. I would expect that if we did so, the "compound strength" concepts would be SD with all applicable attributes provided so that users could go "back and forth" between the representations.

        Toni

        1. Dion McMurtrie
          2020-07-21 08:43

          I think ranges are ok as long as they aren't unbounded. I don't think AMT models products with ranges like this either, just uses primitive concepts - Matt Cordell might know more definitively.

          I remember the compound strength discussion. I've posted about that here. In short, there are a few options for how to state the strength using these DL features, and pros and cons to each. All of them really need some level of automation support, but then meds in general is pretty mechanical and benefits greatly from automation.

          Definitely the compound strength idea gives you named handles to put the equivalences to other renderings. Those equivalences would be handy for us in ECL for example.

          1. Matt Cordell
            2020-07-30 11:56

            AMT uses a mix of (equal to) Median and Minimum values. And probably a few unmodelled primitives too.

            Though I think there are so few of these that leaving them primitive is probably adequate.

  4. Linda Bird
    2020-07-16 08:58

    Hi Peter - Singapore also has a number of drug products that are defined as having 'at least' X units - i.e. >= X, so please include them (e.g. Jing Jing) in this discussion too.
