With support for ECL in the definition of FHIR ValueSets, it is possible to define a ValueSet intensionally (by filters and rules rather than by listing codes explicitly) by either

  1. creating a reference set in a SNOMED CT (or extension) release and exposing an Implicit ValueSet based on the reference set, or
  2. not defining a reference set and instead defining the ValueSet directly using ECL (or other filters) as required

Both options have pros and cons. Option 1, for example, provides a reference set in a SNOMED CT release that can be used by those consuming RF2 and not using FHIR, whereas option 2 does not. Option 2 has advantages of its own: the ValueSet definition can be reused and re-evaluated against other (perhaps downstream) SNOMED CT extensions, whereas option 1 has no standard, well-used way to expose the machine-processable conditions that define the reference set content.
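As a rough illustration of the two options, the sketch below builds the implicit ValueSet URL for a reference set (option 1) and a ValueSet resource with an ECL filter (option 2) as plain Python data. The `fhir_vs=refset/` URL pattern and the `constraint` filter property follow the FHIR specification's SNOMED CT guidance as commonly implemented; the reference set id, the `http://example.org/...` canonical URL, the name, and the ECL expression are placeholders chosen for illustration, not content from any release.

```python
import json

SNOMED = "http://snomed.info/sct"


def implicit_refset_url(refset_id: str) -> str:
    """Option 1: implicit ValueSet URL for an existing reference set."""
    return f"{SNOMED}?fhir_vs=refset/{refset_id}"


def ecl_value_set(url: str, name: str, ecl: str) -> dict:
    """Option 2: a ValueSet resource whose content is defined by an ECL filter."""
    return {
        "resourceType": "ValueSet",
        "url": url,
        "name": name,
        "status": "draft",
        "compose": {
            "include": [
                {
                    "system": SNOMED,
                    # No version is pinned here, so the same definition can be
                    # re-evaluated against other editions or later releases.
                    "filter": [
                        {"property": "constraint", "op": "=", "value": ecl}
                    ],
                }
            ]
        },
    }


# "1234567890" is a made-up identifier, not a real reference set.
option_1 = implicit_refset_url("1234567890")

# Hypothetical canonical URL and name; the ECL selects descendants of Disease.
option_2 = ecl_value_set(
    "http://example.org/fhir/ValueSet/example-disorders",
    "ExampleDisorders",
    "< 64572001 |Disease (disorder)|",
)

print(option_1)
print(json.dumps(option_2, indent=2))
```

Option 1 carries nothing but the URL, while option 2 has room for metadata (`url`, `name`, `status` and so on) alongside the computable definition.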

This page is intended to start a discussion, capture the group's thoughts on the pros and cons of both approaches, and consider best practices when making this choice.

Invitation to contribute a list of Pros and Cons to this page.

23 Comments

  1. Certainly from a FHIR-API perspective, I favour option 2 as there is a compelling maintenance argument for reusability across SCT versions. I'm also not sure what standard and definitional machine processes you are referring to beyond ECL constraints.

    Incidentally, there is a FHIR gForge tracker item for adding an implicit value set filter that would contain an ECL expression.

  2. I don't think I made a reference to definitions of reference sets beyond ECL...but that is also an issue.

    Maybe Liam Barnes or Matt Cordell can provide some examples, but there are reference sets defined in the Australian extension that use SNOMED CT query language features not present in the ECL (which is a subset). These are a convenience for managing the content and extend from the DL definition into lexical properties.

    So that is another possible reason to go with option 1 over option 2.

    At the moment the Australian services are all option 1, but that's because we have a number of reference sets we've been releasing for a long while. In future we do have a decision to make - do we keep creating reference sets in our extension and expose them as implicit ValueSets or do we create explicit ValueSets with intensional definitions.

    Option 1 could be more on par with option 2 if the definitions can be provided in the SNOMED CT release in a standard way and represented in the implicit ValueSet in some way - for example see Changes to SNOMED CT to improve usage through terminology services

    Option 2 does present an issue for those wanting to consume the RF2 releases and not use the FHIR services, but I'm not sure how big an issue this is.

    1. I would say all but three (I think) of our (AU) intensional refsets that are currently defined with Query Language could be converted to ECL without a problem. Those three use some lexical conditions, and they're pretty exceptional.

      Two identify what we consider "groupers", and rely on some term patterns to find appropriate concepts. And to be honest, the queries are so complex they end up being run externally as SQL rather than in Lingo for performance reasons (several hundred lines of subsumption and regex...). We're probably at a point where it's better to simply explicitly define membership of these, and rely on our other processes to identify new candidates. So it's probably not a show stopper for us.

      I agree, though, that option 2 neglects anyone using only RF2. But it could be possible to generate refsets from value sets... How the refsets are developed (ECL/SCT QL) is generally irrelevant to users.

      One other consideration is determining historical membership. Excuse my ignorance, but how viable is this with implicit value sets?

      1. Regarding "historical membership", I'd be interested in the use cases for that.

        Using Ontoserver, you can $expand the intensional ValueSet with respect to a specific version of SNOMED CT, so you can use this to determine what was in a ValueSet for any given version.  Additionally, you can define a ValueSet that crosses versions.  Thus, for the ValueSet R, you could define another ValueSet that includes R with respect to the January SCT-AU version and excludes R with respect to the October SCT-AU version and the expansion will tell you which codes used to belong, in Jan, but no longer belong.
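A minimal sketch of the cross-version comparison described above: a ValueSet that includes a definition evaluated against one SNOMED CT edition version and excludes the same definition evaluated against a later one. The edition module id, the two effective dates, the canonical URL, and the ECL filter standing in for R are illustrative assumptions, and whether a given terminology server accepts different versions of the same system in `include` and `exclude` depends on the server.

```python
SNOMED = "http://snomed.info/sct"
# Assumed module id for the Australian edition; substitute your own edition.
EDITION = f"{SNOMED}/32506021000036107"


def version_uri(effective_date: str) -> str:
    """SNOMED CT version URI for one release of the edition."""
    return f"{EDITION}/version/{effective_date}"


def removed_between(ecl: str, old_date: str, new_date: str) -> dict:
    """ValueSet whose expansion is 'members at old_date' minus 'members at new_date'."""

    def component(effective_date: str) -> dict:
        return {
            "system": SNOMED,
            "version": version_uri(effective_date),
            "filter": [{"property": "constraint", "op": "=", "value": ecl}],
        }

    return {
        "resourceType": "ValueSet",
        "url": "http://example.org/fhir/ValueSet/r-removed",  # hypothetical
        "status": "draft",
        "compose": {
            "include": [component(old_date)],
            "exclude": [component(new_date)],
        },
    }


# Hypothetical definition and release dates, for illustration only.
removed = removed_between("< 64572001 |Disease (disorder)|", "20170131", "20171031")

# The expansion semantics are a plain set difference (made-up member codes).
january = {"A", "B", "C"}
october = {"B", "C", "D"}
no_longer_members = january - october
```

Swapping the two dates gives the complementary question, namely which codes joined between the two releases.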

        1. No idea on a use case... I'd have to think. So it's probably academic.

          But I do like the idea that the expansion can tell you what the membership was even before the value set was defined.

  3. SCT Query Language, you're ahead of the game there!  Michael suggested adding another implicit value set filter for that language when it's officially released (2018?). I understand that option 2 is FHIR-centric and I may well get some push-back on that at a NZ HISO SCT Implementation Group meeting tomorrow. However, we don't have dependencies on existing RF2-formatted Reference Sets or the resources to create and maintain a whole bunch of them.

    1. SCT Query Language, you're ahead of the game there!

      We've got pushy terminology authors! (wink)

      Maintaining the reference sets is more or less easier for us at the moment which is also a factor. We use a terminology authoring tool called Lingo (internally developed) which has a reference set authoring feature. It stores queries against reference sets and can re-evaluate and report on changes to reference sets, and obviously it can see all our unpublished work in progress.

      We prefer the intensionally defined reference set approach to manually curated concept lists because of the maintenance effort. It is actually not too bad maintaining the intensionally defined reference sets with the tooling we have wrapped around them, but it is only possible because the vast majority are intensionally defined.

      For ValueSets the tooling story isn't as strong, particularly the quality process and versioning. Not to say it couldn't be, but we don't have the tools in place to do that in the same integrated way we do now.

      So that's another factor to consider for us, not that it is a reason in itself to not change - different tooling can be put in place.

  4. Just to add that this is a very important topic for us and this discussion couldn't be better timed. It's also good for me to be able to report that SNOMED on FHIR is providing some real value for us already!

    1. Glad it is useful - it will be really useful for us because we face similar and imminent questions, albeit with more existing reference sets and toolchain to consider.

      SNOMED on FHIR is definitely already useful.

  5. I'm not following all that has been said here - need to study it a bit more - but I have some concerns about FHIR Implicit ValueSets. Per the HL7 Characteristics of a value set definition spec, and based on traditional usage needs, a "value set definition" (the thing that is used to generate a value set expansion, which is the set of usable concepts/codes) has metadata and is therefore a persistent artifact. Value set expansions are the result of applying the "content logical definition" portion of a value set definition against a code system version to generate the set of usable codes.

    A "FHIR Implicit ValueSet" is in essence a stored code system query that generates a resulting list of codes. In FHIR, that query is embedded inside the URI used to 'fire it off' and so you could consider the URI the "identifier" for the "implicit valueset" and I'd agree, and you could also say that the "second part of the URL" such as ?fhir_vs=isa/[sctid] is the VSD Content Logical Definition (CLD)  but these implicit valuesets are missing other required metadata to make them a conformant value set. You may not like that but it's something discussed quite a bit in the balloting process for that artifact.

    Honestly I think it actually simplifies some of the operations if we consider these "implicit" things as stored queries and not value sets, but I'm concerned that some like and use them because as such they don't have to store and maintain all the value set metadata.

    I'm interested in your thoughts on all this.

    1. I believe that FHIR implicit value sets, which are based directly on the underlying semantics of the code system, are essentially a "lightweight" approach to providing the value set CLD, which then can be used, of course, to generate a value set expansion.  Because they are "lightweight" and are tied directly to the code system semantics, the fact that the implicit value set url doesn't provide some of the metadata elements that are specified by VSD may not be a problem if what is provided (by the code system and url) is sufficiently well-defined for the cases where these value sets are used.  If that is not sufficient in any particular case, then instead the full ValueSet resource capabilities should be used (which are intended to support and align with VSD via the core spec plus extensions).  I believe that is the argument.  If there is a problem with the implicit value sets not being "conformant" with VSD, then that is something that we should look at (and possibly we could update VSD, if necessary, to allow for this, if we agree that it's reasonable).

      1. I follow your point and understand these "implicit" items have value, but they are not by definition, value sets. I agree they are in essence a persistent CLD, which is why I think of them as a stored, persistent code system query. Even if the name has to stay, what I'd like to see is we don't treat them like value sets, so for example, they would not by default be included when a terminology service is asked to retrieve value sets matching a specific code, as was discussed in a zulip thread. I suppose if a particular terminology service did want to include them, they could note that in the capability statement and extend the operation to include such things.

        1. I think it's important to consider that there are at least two kinds of (SNOMED) implicit ValueSets in play here - the ?fhir_vs=isa/[sctid] style and the ?fhir_vs=refset/[sctid] style.

          For the first case, I believe what we have is a family of ValueSet definitions that can be considered to have one boilerplate, templated, piece of (fairly generic) metadata.

          The more interesting case is the second one since these correspond to explicitly curated code lists for specific purposes and thus fall under the umbrella of value sets even if, today, that metadata is not explicitly represented in a machine-readable way.  Note that many / most of the Australian SNOMED reference sets have associated metadata in a PDF file.  Resolving this is essentially what this topic Changes to SNOMED CT to improve usage through terminology services is about (as Dion notes below).
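To make the distinction between the implicit ValueSet styles concrete, here is a small sketch that treats the URL as the stored query it is and pulls out which kind it represents. The function name and the returned dictionary shape are hypothetical; the `fhir_vs`, `fhir_vs=isa/[sctid]` and `fhir_vs=refset/[sctid]` patterns are the ones discussed in this thread.

```python
from urllib.parse import parse_qs, urlsplit


def describe_implicit_value_set(url: str) -> dict:
    """Split a SNOMED CT implicit ValueSet URL into code system, kind and SCTID."""
    parts = urlsplit(url)
    # keep_blank_values lets a bare "?fhir_vs" (all concepts) parse as "".
    values = parse_qs(parts.query, keep_blank_values=True).get("fhir_vs")
    if values is None:
        raise ValueError("not a SNOMED CT implicit value set URL")

    kind, _, sctid = values[0].partition("/")
    if kind == "":
        kind = "all"
    if kind not in {"all", "isa", "refset"}:
        raise ValueError(f"unrecognised implicit value set kind: {kind}")

    return {
        "system": f"{parts.scheme}://{parts.netloc}{parts.path}",
        "kind": kind,
        "sctid": sctid or None,
    }


# 64572001 is Disease (disorder); "1234567890" is a made-up refset id.
by_subsumption = describe_implicit_value_set(
    "http://snomed.info/sct?fhir_vs=isa/64572001"
)
by_refset = describe_implicit_value_set(
    "http://snomed.info/sct?fhir_vs=refset/1234567890"
)
```

Everything the URL carries is a code system, a kind and an identifier. Any purpose, steward or usage metadata for the `refset` case has to come from somewhere else, which is the gap being discussed.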

  6. So I'm not sure if we've strayed off the original topic here, although there is an interesting discussion to be had around supporting metadata for implicit value sets. See point 1 at Changes to SNOMED CT to improve usage through terminology services.

    I'm not sure I totally get your point Robert McClure. Implicit value sets are as Rob Hausam points out just a lightweight way to leverage existing characteristics of code systems and are quite useful. In the context of SNOMED CT there are two types, by subsumption and by reference set.

    There is a difference between these. In SNOMED CT there are as many by-subsumption implicit value sets as there are concepts; nobody created them for a particular reason, they exist simply because of the subsumption hierarchy, and each is just a simple query. The implicit value sets from reference sets are a bit different: each reference set was created and is maintained for a particular purpose, and although the metadata associated with its creation, purpose etc. isn't explicitly stated in the SNOMED CT content, it often is in a document somewhere. The page I refer to in the above link talks about exposing that metadata in a defined way in SNOMED CT content so it could be surfaced through FHIR and be more accessible than documents. Is this relevant to the point you are making?

    But the main topic of this page was really intended to be, for those maintaining sets of codes, whether it is more sensible to manage those as reference sets in their SNOMED CT extension (assuming they have one) and expose them in FHIR as implicit value sets, or to maintain them as FHIR ValueSet resources.

    There are pros and cons to both. I was hoping we could explore those pros and cons to inform the decision making of organisations like mine who have to maintain a body of sets for a variety of purposes. Do you have any views on that?

    1. These are some of the key advantages of maintaining a FHIR ValueSet definition over a SNOMED reference set in an extension:

      1. FHIR ValueSets allow explicit representation of metadata
      2. FHIR ValueSets provide useful machinery for expressing computable inclusion / exclusion rules
      3. FHIR ValueSets, being separate from the SNOMED extension itself, can have independent lifecycles. That is, they can be published / updated independently and at different intervals to a SNOMED extension
      4. How to define and use a SNOMED intensional reference set is not well documented
      5. FHIR ValueSets are self-contained, whereas SNOMED reference sets require embedding in the monolithic RF2 file format


  7. From a practical perspective, I still favour option 2 in the page header. There is also a third type of FHIR Implicit Value Set for SNOMED CT to consider, which is 'by text' filtering on descriptions - plus a fourth in the offing, 'by expression constraint'. These latter two types are very much runtime artefacts required to support dynamic querying at the application presentation layer - so perhaps outside the scope of metadata concerns?

  8. Lots of different issues here so I'm just going to run through some of them:

    1. When I say "by definition a value set" I am highlighting that we in the standards community have a stake in the ground that we must respect. The HL7 Characteristics of a Value Set Definition STU will transition to Normative this year and it clarifies that for something to be a standards-based value set, you must have certain metadata. I'm not saying that is the only kind of code set collection. And I absolutely agree that stored "implicit" code system queries (aka "implicit value sets") are valuable and we should use them as needed. But I'd like us to be consistent and standards-based, so if we say "value set" we mean the standard as defined, not something close. Dion McMurtrie, my point here is I'm fine with "implicit queries" but I'm not fine with SNOMED or anyone else saying they are a value set in the context of standards-based exchanges. This means that if "the metadata is somewhere, like a pdf" then while it's close, it's not a conformant value set. You all can disagree, and if you do I encourage you to participate in the upcoming normative discussion, and I would suggest you clarify the advantages of whatever you propose.
    2. Clearly FHIR is playing fast and loose with this. It pains me when FHIR seems to adhere to only those standards it likes and expects everyone to adhere to those it promotes.
    3. I'll admit I'm not following the nuances wrt the difference that a SNOMED reference set creates versus other approaches, but I believe that a refset does not have the required value set metadata, therefore it is yet another way of describing code system concept collections.  Useful for some and I do understand that SNOMED clarifies the distinction.
    4. As for your last "this is the point of our discussion" - I hope my position is clear, if you want a standards-based conformant value set for sharing and use, you should align with the standard. I'll note that while far from perfect, the FHIR valueset resource is not bad and I'd strongly support its use.
    1. Hi Rob,

      I've just had a quick look through the HL7 VSD document dated June 2016.

      Section 10.5 says "A Profile that strictly aligns with the capabilities defined in this document will be developed during the STU phase of this specification to provide guidance to FHIR implementers who wish to create more sophisticated value set definitions."  Has this been produced yet?  It would be a valuable aid in understanding where the gaps are between FHIR's perspective on this and that expressed in the document, and thus assist in understanding where SNOMED reference sets fit as well.

      Regarding things like metadata being in PDFs or other assorted places, this is one of the main problems that I (and, I believe, Dion McMurtrie) wish to fix.

      BTW 10.5 needs updating for FHIR STU3 and later - FHIR ValueSets can no longer define CodeSystems.


      1. Regarding things like metadata being in PDFs or other assorted places, this is one of the main problems that I (and, I believe, Dion McMurtrie) wish to fix.

        Yes, that's exactly what I'm after

      2. No, it's not been created, volunteer work being hard to come by. At this point just identifying where FHIR and the VSD spec meaningfully diverge would be the first thing, but Graham has been pretty good on following things so I suspect the variances are in optional elements.

        As for the metadata in pdf, I think you are saying you agree - great.

    2. I think our objectives align, Robert McClure, but perhaps our use of names is loose.

      One of my main concerns with SNOMED CT reference sets is that this metadata is not uniformly available in a standard way, I'd like to see that fixed and I'm not too particular about how exactly. Then I'd like to define how that metadata is surfaced through FHIR implicit ValueSets based on SNOMED CT reference sets to ensure they do behave like the value sets defined in the trial use standard. That's more or less what I'm trying to achieve, although most of this specific conversation is happening on the Changes to SNOMED CT to improve usage through terminology services page, the topic here isn't precisely that.

      The other implicit ValueSets in FHIR (for SNOMED CT at least) behave like queries and can't really have sensible metadata associated with them. Unlike reference sets they haven't been pre-made by someone for a specific purpose, so this metadata doesn't naturally exist. In this way I don't think these FHIR implicit ValueSets will ever meet the trial use standard definition of a "value set".

      In terms of the original point of the discussion on this page, we're trying to determine the relative merit in continuing to develop SNOMED CT reference sets and surface those through FHIR implicit ValueSets (ideally with the metadata issue mentioned resolved), or if it makes more sense to dispense with the SNOMED CT reference sets and simply create ValueSet resources (either intensionally or extensionally defined). Have you got any thoughts on that topic? Relative pros and cons?

  9. I understand the need to adhere to agreed standards, but perhaps we should approach the intensional-valueset-metadata issue from another angle? Working on the basis that there is a key requirement to run dynamic FHIR-based API queries against entire Code Systems at runtime - what is the best resource for returning the results of those queries?  If it's not the ValueSet resource, because it's not possible to construct metadata that satisfies the criteria for it being a Value Set that complies with an agreed standard, what resource might be used as an alternative?  However, part of me still hopes that it might be possible to agree upon a metadata standard for Intensional Value Sets that will satisfy all parties; if that's not possible, perhaps there is a case for reviewing the standard definition of a value set on the basis that it fails to satisfy a critical terminology services requirement.

    1. I see value sets as having metadata and code system queries as not having metadata, so they are fundamentally different in terms of the ability to reference them for other uses. I also see no important distinction in "intensional versus extensional" types of content logical definitions (the query part) - the definition (CLD) is a stored query; one happens to use code system characteristics in the definition, so it's more complex. I'm a bit unclear on the use cases, other than "all codes in a code system", for a stored query to consistently retrieve a specified set of codes (intensional or not) where value set metadata is useless. Help me understand why such a thing is useful? Perhaps this is you trying to map existing refsets to FHIR. If so, make them all value sets with metadata. If you don't know what the metadata should be - figure it out, or decide if they are so important when no one understands what they are. Really.

      As for "what resource" can do both a real value set and the few legitimate stored-query "implicit" types, I'm not quite sure. Stuff these into a value set resource and we'll need to clarify that they don't have to be "real" value sets. But here I'll admit that FHIR has relaxed the required elements in the resource so you can get away with these, so this currently boils down to working hard to do the right thing only when you want to (wink)