
Naming, identification and versioning of resources may not seem like a big deal; however, they affect how these resources can be used and coordinated. This is particularly important when creating URIs that specifications refer to (for example, in bindings): you do not want to update those specifications every time you release a new terminology version, and you do not want to constantly update the ValueSet definition for each terminology release. However, there may be times when non-backward-compatible changes are made in the underpinning terminology or ValueSet, and these require a new URI to preserve the meaning and use of the old URI.

This tension can be subtle and complex. In FHIR there are fields for id, identifier, url, and version to use as levers in resolving it, as well as the technical versioning for each resource.

Fortunately in Australia we've had Liam Barnes working on this in collaboration with our Clinical Informatics team who are the customers of bindings to ValueSets. I'm hoping that by tagging him I can co-opt him into explaining this issue on this page (smile), and sharing our latest thoughts and plans.

The intention is to share these ideas and collaborate to define some guidelines and best practices (or at least a checklist of considerations) when naming, identifying and versioning FHIR terminology resources.

Thanks Liam Barnes!


32 Comments

  1. Happy to share some thoughts on this. 

    For value sets, our strategy for balancing the need for stable URIs with providing a way to represent breaking changes is to use semantic versioning for the version where possible and include the major version in the URL (recommended in FHIR). This way updates can be shown through minor and patch version increments and if there is a breaking change it would force a URI update. 

    Another consideration for value set URIs that identify binding targets is that you want them to resolve. For this reason we ensure the logical id of the resource is the same as the id portion of the URI. (GET [base]/[type]/[id]). This of course means that a major version change requires a new logical id (new resource).
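A minimal sketch of the scheme described in this comment, for illustration only. The helper names (`major_version`, `binding_id`, `binding_url`) are hypothetical, and the base URL is the one quoted later in this thread; none of this is defined by FHIR itself.

```python
def major_version(version: str) -> int:
    """Return the major component of a semantic version such as '1.2.0'."""
    return int(version.split(".")[0])


def binding_id(name: str, version: str) -> str:
    """Logical id for a binding target: the name plus the major version only."""
    return f"{name}-{major_version(version)}"


def binding_url(base: str, name: str, version: str) -> str:
    """URI whose last segment equals the logical id, so that
    GET [base]/ValueSet/[id] resolves it."""
    return f"{base.rstrip('/')}/ValueSet/{binding_id(name, version)}"


BASE = "https://healthterminologies.gov.au/fhir"  # illustrative base

# Minor and patch releases keep the same URI; a major release forces a new one.
urls = [binding_url(BASE, "name", v) for v in ("1.0.0", "1.0.1", "1.1.0", "2.0.0")]
```

Under these assumptions the first three entries of `urls` are identical and only the 2.0.0 release yields a different URI (and therefore a new resource).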

  2. I would not do the URIs like this.  Yes, you want URIs that resolve, but it is better to have them resolve to a human-friendly artifact and not have them tied to implementation details.

    An added advantage of a level of indirection is that it helps in communicating that these are URIs and not URLs.


    1. We are planning to have a human friendly rendering resolve if the request is made by a browser.

      Can you explain your issue a bit more? I don't fully understand it.

      1. Re-reading your comment above I see I mis-read it the first time.  I had thought you were saying your URIs would be of the form [base]/[type]/[id], but I see now that you're only retaining the id part.

        I now understand (correct me if I'm still confused) that you might have a ValueSet URI like http://myhr.gov.au/profiles/discharge_summary/diagnosis_1 and you'd have the actual ValueSet in a FHIR server at [base]/ValueSet/diagnosis_1 for versions 1.0.0, 1.0.1, 1.1.0, etc and then, after a breaking change you'd have http://myhr.gov.au/profiles/discharge_summary/diagnosis_2 and [base]/ValueSet/diagnosis_2 for versions 2.0.0, 2.0.1, 2.1.0, etc.

        In that case, this seems reasonable.


        1. Thanks for the explanation but I think you were right the first time. 

          The value set uri and the actual location of the value set would be the same. The recommendation in FHIR is that the canonical url is the same as the location of the master version.

          So our urls will look like:

          https://healthterminologies.gov.au/fhir/ValueSet/name-1

          Your understanding of the version is correct.

          1. Right, so the distinction/recommendation I would make is that the location of the master version is not tied to its storage in a FHIR server.  Hence it would be quite reasonable to have:

            https://healthterminologies.gov.au/ValueSets/clinical/name-1

            https://healthterminologies.gov.au/ValueSets/demographic/othername-1

            or similar for grouping/organising collections of these things.

            1. What are the benefits of having a master version in a different location to a national terminology server location? (Other than a clearer distinction between uri and url)

  3. My point is that these things are not locations, they are just identifiers.  Decoupling your identifier naming scheme from your technical implementation infrastructure gives you naming flexibility (and makes it easier to change infrastructure).

    In the example above, category/grouping information can be embedded in the identifier, which is an aid to humans (but not machines : )

    1. Ok, I'm with you. Thanks for that!

  4. The US NLM VSAC is struggling with exactly the same issues. I think they made the same assumption and choice that Liam made, based in part on the same assumption that they are the source of truth for their value sets so [id] (meta.id) could also be the logical identifier. Problem with this is that the resource identified by meta.id is supposed to only ever have one "version" - it is a versionless thing. At least that is my understanding of it and I suspect there is some inconsistency within FHIR on how this really comes together. This would mean, if I'm correct, that Michael's various versions of 1.xxx are not all available at the same time, at least not "officially."

    I would love to see this directly exercised at the upcoming connectathon - particularly the impact of accessing multiple versions of a value set from the same terminology service.

    1. Rob, I think we talked about versioning of resources toward the end of the Connectathon in San Diego.  The RESTful API page in the FHIR spec has some helpful discussion of this (see 2.21.0.2 Resource Metadata and Versioning, 2.21.0.7 Support for Versions and 2.21.0.9 vread sections).  Basically, assuming that the FHIR server supports versioning, you can have multiple versions of a resource which have the same logical id (.id or [id]) and make version-specific references to the resource (via _history and .meta.versionId, e.g. GET [base]/[type]/[id]/_history/[vid]).
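As a small illustration of the two request forms mentioned in this comment, the sketch below builds a plain read URL and a version-specific (vread) URL. The function names and the server base are hypothetical; the version id here is the server-assigned `meta.versionId`, not the business version in `ValueSet.version`.

```python
def read_url(base: str, resource_type: str, logical_id: str) -> str:
    """URL for the current version: GET [base]/[type]/[id]."""
    return f"{base.rstrip('/')}/{resource_type}/{logical_id}"


def vread_url(base: str, resource_type: str, logical_id: str, version_id: str) -> str:
    """URL for one specific version: GET [base]/[type]/[id]/_history/[vid]."""
    return f"{read_url(base, resource_type, logical_id)}/_history/{version_id}"


BASE = "https://example.org/fhir"  # hypothetical server base

current = read_url(BASE, "ValueSet", "diagnosis_1")
pinned = vread_url(BASE, "ValueSet", "diagnosis_1", "3")
```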

      1. Isn't the issue there that most implementations will initially point to the master version (ie latest) and this could be swept out from under them as a breaking change?   The advantage of being specific about the version from the start (_1) is that the new - breaking - version (_2) will appear and client callers will need to update their code to take advantage of the new functionality, rather than finding their existing code broken?

        1. I'm not sure if I entirely followed that, but at least one aspect of using the "logical" id (ValueSet.url) was to have a consistent reference which would not need to change and wouldn't break functionality due to a change in version (unless your requirements are actually version specific and you want that to happen).

          1. According to the FHIR spec (4.8.3.10) the ValueSet.id (not url) is the "logical id" although, as Michael points out, not really a logical identifier. Looking at the properties in one of the test Value Sets, it would not be possible to test $expansion against specific versions without appending the version to the Url - and so it might not be worth testing this until version is added as an input parameter to that operation.

            valueSet.Id = "extensional-case-1";

            valueSet.Url = "http://www.healthintersections.com.au/fhir/ValueSet/extensional-case-1";

            valueSet.Version = "C17";

            Further observations are that the id is used within the context of the server that's the target of the request; the Url is the 'source of truth' (SOT) for the value set, not its physical location on the target server; and there's no need to use the identifier property, as that's for alternative types of identifier (e.g. OIDs as used in other HL7 products). I am also making the (deliberately) naïve assumption that the SOT server will hold all versions of a value set.

            1. I see the confusion now - "logical" is being used in both places.  The description text for ValueSet.url is "Logical URI to reference this value set (globally unique)".  That's where it's the most visible, as the 'id' element and its documentation is largely hidden on the main resource page.  We may need to stop using "logical" as any kind of reference to an element.

      2. Hi Rob Hausam this reply may be confusing for many.  The logical id (the [id] part in [base]/[type]/[id]) is (in my opinion) badly named since it is specific to each server and thus not stable, nor necessarily under control of the client.

        The field of interest is really the .url field, which is the one that is used in the API calls for $expand etc

        The ADHA has taken the approach of recommending that the fully versioned URI is used as the value for ValueSet.url and the unversioned URI is included as ValueSet.identifier (and just the version part as ValueSet.version).  Thus one would have:

        {
          resourceType: "ValueSet",
          url: "http://myhr.gov.au/profiles/discharge_summary/diagnosis/1.0.0",
          version: "1.0.0",
          identifier: [{
            system: "https://semver.org/spec/v2.0.0",
            value: "1.0.0",
          }],
          ...
        }

        1. I agree that this can be and likely is confusing, Michael Lawley.  Your points are correct.  But the server specific [id] was being discussed, and what I stated was in regard to the resource instance on a specific server.  But, as you point out, what is likely of much more interest is the value set itself, as identified by ValueSet.url.  I think we should further explore whether in fact we should and need to use the fully versioned URI for ValueSet.url.  I'm sure that the original intent was otherwise, and if there is lack of capability or clarity that is pushing us in this direction (if we really don't want to go there), then that probably needs to be identified and addressed.  We will at least have some further discussion of this at the FHIR Connectathon and during the HL7 Working Group meeting next week.

          1. Rob, I guess we are being pushed in this direction (fully versioned URI for ValueSet.url) because ValueSet.url is an input parameter to the $expand operation, whereas ValueSet.version is not. Otherwise one is relying upon the ValueSet.id (or whatever the server will utilise in /ValueSet/[id]/$expand) to identify a ValueSet resource. The section in the spec on ValueSet Identification (4.8.3.1) concludes with the statement that "in a FHIR context....the canonical URL is always the focus". Therefore, perhaps we should test this at the connectathon - by appending /C17 to the Urls of the test value sets?
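To make the two ways of addressing a value set in `$expand` concrete, here is a hedged sketch that only builds request URLs. The server base and function names are hypothetical; the canonical URL is the test value set quoted earlier in this thread with `/C17` appended, which is the workaround proposed in this comment rather than an established convention.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def expand_by_url(base: str, canonical_url: str) -> str:
    """Type-level $expand that selects the value set by its canonical URL."""
    query = urlencode({"url": canonical_url})
    return f"{base.rstrip('/')}/ValueSet/$expand?{query}"


def expand_by_id(base: str, logical_id: str) -> str:
    """Instance-level $expand that relies on the server-specific logical id."""
    return f"{base.rstrip('/')}/ValueSet/{logical_id}/$expand"


BASE = "https://example.org/fhir"  # hypothetical server base
CANONICAL = "http://www.healthintersections.com.au/fhir/ValueSet/extensional-case-1/C17"

request = expand_by_url(BASE, CANONICAL)

# The canonical URL survives percent-encoding and decoding unchanged.
recovered = parse_qs(urlsplit(request).query)["url"][0]
```

The first form depends only on `ValueSet.url`, which is why the version ends up inside the URL when no separate version parameter is available.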


            1. We recognized this limitation at the Connectathon in San Diego and intend to address it in GForge tracker #13820 (https://gforge.hl7.org/gf/project/fhir/tracker/?action=TrackerItemEdit&tracker_item_id=13820).  This is the tracker that was mentioned on our call today and that I had promised to locate.  We discussed this yesterday on the HL7 Vocab FHIR Tracker Issues call and we have it on the agenda for the Vocab Main call on Thursday this week (http://confluence.hl7.org:8090/display/HL7/2018-01-25+Vocab+WG+Call+Proposed+Agenda).  And, if needed, we will have additional discussion during the WGM in New Orleans.

              1. And yes, we should do testing on this at the Connectathon - and since this is the mechanism that we have at the moment, we'll go with that.  

              2. Thanks for those links Rob. I'd certainly support adding ValueSet.version to the input parameters to the $expand operation. Not so sure about adding CodeSystem.version to those parameters, isn't that already covered by passing a parameter that contains a reference to an ExpansionProfile resource?

        2. Michael, in that JSON example shouldn't the identifier.value be the unversioned URI?

  5. Rob Hausam what standing does http://www.hl7.org/implement/standards/product_brief.cfm?product_id=437 have relative to FHIR? I'm used to all the FHIR related parts being embedded in https://www.hl7.org/fhir/ - how do these relate?

    Liam Barnes are you close to being able to distill a strawman position as a starting point for guidance?

    1. Apologies for taking so long to reply.  The Characteristics of a Formal Value Set Definition (VSD) standard is specifically referenced in section 4.8.2 Boundaries and Relationships (the 5th bullet) in the ValueSet resource (http://build.fhir.org/valueset.html).  That's the primary relationship - but definitely the VSD standard informed and was used in the development of the FHIR resource.

  6. I'm going to add in here the latest position we (the terminology team at the Australian Digital Health Agency) have in our immature authoring guide for naming and identifying our resources, this is taken from Liam Barnes who's collated all of this.

    Rules are as follows

    Field | Rules
    id
    1. For internally defined resources:
      1. ID should be the name of the resource followed by the version. 
      2. Hyphens separate words and version.
      3. Value sets which represent bindings should:
        1. Have a "latest" version represented by an ID ending in the major version number (e.g. relationship-type-1).
        2. Previous versions have IDs ending in the total version (e.g. relationship-type-1.2.0).
      4. Name part of the ID may be composed of a commonly recognised acronym.
    2. Externally originating resources can have any unique ID but for consistency, apply the same as above. 
    url
    1. Internally defined code systems should use the format: https://healthterminologies.gov.au/fhir/CodeSystem/[id]. 
    2. Internally defined value sets should use the format: https://healthterminologies.gov.au/fhir/ValueSet/[id].
    3. Externally defined code systems should use a stable URI that ideally is determined by the content owners and is in their namespace. An alternative option is to use a URL which identifies the location of the source codes, but consideration must be given to the likelihood this can change. 
    identifier
    1. Identifier.value should be an OID (represented as a URN).
    2. Identifier.system must be urn:ietf:rfc:3986
    3. If an OID does not exist, it can be registered in the CT OID Register:
      1. Code System OIDs use the arc: 1.2.36.1.2001.1004.200
      2. Value Set OIDs use the arc: 1.2.36.1.2001.1004.201
      3. Concept Map OIDs use the arc: 1.2.36.1.2001.1004.202
    version
    1. Version to be in the format x.y.z or YYYYMMDD.
    2. A resource instance will exist for every version. Technical versioning will not be used.
    3. Terminology specific semantic versioning interpretation to be:
      1. patch version incremented when there is an insignificant change e.g. typo change that does not affect codes/display values
      2. minor version incremented when there is a significant non-breaking change e.g. addition of a new code.
      3. major version incremented when there is a significant breaking change e.g. removal of a code.
    4. Value Sets that require the inclusion of codes from the latest version will generally not include a value for ValueSet.compose.include.version
    name
    1. Sensible name that clearly describes the content.
    2. Do not include any special characters (ASCII only)
    3. Singular where possible
    4. Title case
    5. No acronyms
    6. Do not include the name of the resource type (e.g. 'Code System')

    I believe this is in line with the guidelines provided by FHIR.
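The version rules above can be sketched as follows. This is an illustration under stated assumptions, not the Agency's tooling: the function names are hypothetical, the date check only tests for eight digits (not a valid calendar date), and the mapping of change type to increment follows the patch/minor/major interpretation listed in the rules.

```python
import re

SEMVER = re.compile(r"(\d+)\.(\d+)\.(\d+)")  # three-part semantic version
DATE = re.compile(r"\d{8}")                  # YYYYMMDD, digits only


def is_valid_version(version: str) -> bool:
    """True if the string is a three-part semantic version or eight digits."""
    return bool(SEMVER.fullmatch(version) or DATE.fullmatch(version))


def bump(version: str, change: str) -> str:
    """Increment a semantic version for a 'patch', 'minor' or 'major' change."""
    match = SEMVER.fullmatch(version)
    if match is None:
        raise ValueError(f"not a semantic version: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    if change == "major":   # breaking change, e.g. removal of a code
        return f"{major + 1}.0.0"
    if change == "minor":   # non-breaking change, e.g. addition of a code
        return f"{major}.{minor + 1}.0"
    if change == "patch":   # insignificant change, e.g. a typo fix
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change!r}")
```

For example, `bump("1.2.3", "minor")` gives `"1.3.0"`, while a `"major"` change gives `"2.0.0"` and so, under the id rules, a new id and URL.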

    The identification/versioning rules are trying to meet a few objectives

    • provide sufficient resolution to build ValueSets from the latest or specific pinned versions of CodeSystems
    • provide referenceable specific versions of ValueSets, as well as a "latest" version referencing the latest for a specific major version using semantic versioning - this allows binding to not need to change with each minor change to the ValueSet yet still have to change when major changes are made to the ValueSet
    • provide urls that resolve to the resource. This approach does bind identification and resolution together; however, without it, redirects would be required for each resource to make the urls resolvable (which we want to achieve)
    • provide as the identifier an appropriate OID to support our CDA binding and uses as described here
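The first objective (building from the latest or from a pinned CodeSystem version) comes down to whether `ValueSet.compose.include.version` is present. A minimal sketch, assuming a hypothetical helper `compose_include` and a made-up code system URL:

```python
from typing import Optional


def compose_include(system: str, version: Optional[str] = None) -> dict:
    """Build one ValueSet.compose.include entry.

    Omitting the version leaves the choice of code system version open
    (generally the latest); supplying it pins the value set to that version.
    """
    include = {"system": system}
    if version is not None:
        include["version"] = version
    return include


SYSTEM = "http://example.org/fhir/CodeSystem/example"  # hypothetical code system

floating = compose_include(SYSTEM)           # follows the latest version
pinned = compose_include(SYSTEM, "1.2.0")    # fixed to one version
```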

    I'll try to get some specific examples from Liam Barnes on Monday to illustrate this by example.

    It is a work in progress, however if people think it is of value we could use it as the beginning of a naming, identification and versioning guide as a tangible outcome for this topic. In effect we could try to develop a generic version of the guide we (Australian Digital Health Agency) need for our work in the open collaboratively, if there is interest in doing that.

    Otherwise we can continue to develop our internal guide and rules and push them here for interest/comment as we go.

    I'm interested in the views of others whether attempting to develop some sort of generic naming, identification, and versioning guide for terminology resource authors is of value and if there are people that would like to collaborate on that or not?

    Failing that are there other suggestions on what this thread should be attempting to achieve or deliver? Or should it just remain a place for people working in this space to tell each other about their naming, identification, and versioning rules or decisions as they go?

    1. Interesting. It's a pity that the ADHA wasn't represented at last week's HL7 International WGM in New Orleans where the issue of versioning - particularly in relation to FHIR ValueSets - was discussed at great length.

      A lot of use cases, and alternative points of view, from around the world were presented, but one thing that everyone agreed on was that the id property is intended to hold a surrogate, 'physical', identifier specific to the server that is the subject of a request - often that will be a database key. That makes it a very poor candidate to implement resource-level versioning.

      The use of the identifier property is not contentious - most domains appear to be using it to link to identifier types used in other standards (e.g. OIDs in CDA), although the NamingSystem resource might be a better means of achieving that goal.

      There was also consensus that the url property should be populated with a canonical url that identifies the author/owner/publisher/source of truth for a value set with a '|' plus version number suffix (for a good discussion of this strategy by FHIR Core Team member Ewout Kramer, see... https://thefhirplace.com/2017/11/28/versioning-and-canonical-urls/).

      Given that FHIR is an HL7 product, I would advise discussing any proposed strategies relating to FHIR localisation (and beyond) with your local HL7 Affiliate as a means of gaining wider acceptance among the implementer community.

      1. Peter Jordan I'm not sure you're correct about the '|' plus version number.  That syntax is for References where the identifying URI and version are combined into a single string.

        Putting a '|' into the URI itself would be a very bad thing in this case since it would create ambiguity over which '|' corresponds to the version separator in a Reference.
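The ambiguity can be shown with a short sketch. It assumes a reference parser that splits at the first '|'; the function names and the example URLs are hypothetical.

```python
from typing import Optional, Tuple


def join_reference(url: str, version: Optional[str] = None) -> str:
    """Combine a URI and an optional version into a single 'url|version' string."""
    return url if version is None else f"{url}|{version}"


def split_reference(reference: str) -> Tuple[str, Optional[str]]:
    """Split a 'url|version' string at the first '|' (assumed parser behaviour)."""
    url, separator, version = reference.partition("|")
    return url, (version if separator else None)


clean_url = "http://example.org/fhir/ValueSet/example/1"
piped_url = "http://example.org/fhir/ValueSet/example|1.0.0"  # '|' inside the URI

# A URI without '|' round-trips through join and split.
clean_ok = split_reference(join_reference(clean_url, "1.0.0")) == (clean_url, "1.0.0")

# A URI that already contains '|' does not: the parser cannot tell which
# '|' separates the version from the URI.
piped_ok = split_reference(join_reference(piped_url, "1.0.0")) == (piped_url, "1.0.0")
```

Here `clean_ok` is true and `piped_ok` is false, which is the failure mode described above.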

        1. Michael Lawley maybe it is, as you say, for hard version references - but the 'official' documentation (and/or Rob Hausam at tomorrow's meeting) will clarify this. Obviously, you are well aware of the issues involved in fishing version numbers out of a URI (smile) and it requires a standard.

          Most of the WGM discussions were driven by various US organisations with very specific requirements and I'm not aware of any NZ use cases that wouldn't be satisfied by adding version as an input parameter to operations such as $expand.

          1. IMO, URIs should be treated as opaque and should not be messed with in any way unless there is a well-defined scheme for assembling them (e.g., the SNOMED URI spec).  Otherwise you can guarantee that you'll end up being handed a URI that breaks your assumptions and thus your manipulation of it will fail.  In some cases this may have security implications!

            FWIW I am supportive of a naming pattern for URIs that includes the (major) version number directly in the URI, but I would use a '/' as a separator and not a '|' so as to steer clear of the behaviour of References

    2. Well summarised Dion. An example of a value set we're drafting for the purposes of binding is as follows:

      Field      | Value
      id         | australian-vaccine-1
      url        | https://healthterminologies.gov.au/fhir/ValueSet/australian-vaccine-1
      identifier | urn:oid:1.2.36.1.2001.1004.201.xxxxx
      version    | 1.0.0
      name       | Australian Vaccine

      We can provide minor version updates to this value set while still maintaining a stable URI that can be referenced in profiles/implementation guides. If we release a v 1.1.0 we would end up with 2 or 3 value sets with the same URI (shown below). Value set 1 is a representation of the latest version with just the major version in the id so that instance will resolve to the URI, providing the latest binding. Value set 2 is optional as it is the same as the latest form. Value set 3 is a historical instance. If we significantly change the value set definition this is likely to affect any binding references so we will release a major version update, necessitating a new URI (https://healthterminologies.gov.au/fhir/ValueSet/australian-vaccine-2).

      Field     | Value
      id_1      | australian-vaccine-1
      url_1     | https://healthterminologies.gov.au/fhir/ValueSet/australian-vaccine-1
      version_1 | 1.1.0
      id_2      | australian-vaccine-1.1.0
      url_2     | https://healthterminologies.gov.au/fhir/ValueSet/australian-vaccine-1
      version_2 | 1.1.0
      id_3      | australian-vaccine-1.0.0
      url_3     | https://healthterminologies.gov.au/fhir/ValueSet/australian-vaccine-1
      version_3 | 1.0.0
      1. I'm not quite sure how to interpret the Fields id and id_x here.  Do they refer to the FHIR Resource id field?  These are "server-owned" fields so that cannot be constrained as part of an editorial convention.  What I think you're indicating are the following:

        {
          resourceType: "ValueSet",
          id: "5eb60f33-ef29-4745-90b0-157a81fcfcf4",
          url: "https://healthterminologies.gov.au/fhir/ValueSet/australian-vaccine-1",
          version: "1.0.0",
          identifier: [{
            system: "urn:ietf:rfc:3986",
            value: "urn:oid:1.2.36.1.2001.1004.201.xxxxx",
          }],
          name: "Australian Vaccine",
          ...
        }

        {
          resourceType: "ValueSet",
          id: "2e9a5f3e-4c1e-42eb-aae6-f48a5a8cb22c",
          url: "https://healthterminologies.gov.au/fhir/ValueSet/australian-vaccine-1",
          version: "1.1.0",
          identifier: [{
            system: "urn:ietf:rfc:3986",
            value: "urn:oid:1.2.36.1.2001.1004.201.xxxxx",
          }],
          name: "Australian Vaccine",
          ...
        }

        But that only allows for two distinct resources, not the three as per the table.
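When several resources share one url and differ only in version, as in the two resources above, a client that wants the "latest" has to compare versions itself. A hedged sketch, assuming three-part semantic versions and hypothetical helper names:

```python
from typing import List, Optional, Tuple


def semver_key(version: str) -> Tuple[int, ...]:
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def latest(resources: List[dict], url: str) -> Optional[dict]:
    """Return the resource with the highest version for the given url, if any."""
    candidates = [r for r in resources if r.get("url") == url]
    if not candidates:
        return None
    return max(candidates, key=lambda r: semver_key(r["version"]))


URL = "https://healthterminologies.gov.au/fhir/ValueSet/australian-vaccine-1"
value_sets = [
    {"resourceType": "ValueSet", "url": URL, "version": "1.0.0"},
    {"resourceType": "ValueSet", "url": URL, "version": "1.1.0"},
]

newest = latest(value_sets, URL)
```

Comparing integer tuples rather than strings matters once a component reaches two digits: as strings, "1.10.0" would sort before "1.9.0".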


        1. The json in your comment is correct.

          The _x in id_x was just to indicate an individual resource instance. This example shows 3 instances of the value set. I see your point that it's probably best to just stick with distinct versions given they may move from server to server. So scrap the 2nd instance in my table. 

          The id we have allocated is to assist the resolving of the uri as the master copy of the value set. This id can still be assigned something different by a downstream server, but the uri will still resolve to the master. The allocation of an id seems very common in the fhir spec (e.g. http://hl7.org/fhir/valueset-relatedperson-relationshiptype.json.html)