Summary

SNOMED International propose to increase the maximum length of Fully Specified Name (FSN) and Synonym descriptions from 255 characters to 4096 characters. The requirement for this change has come from the Pharmaceutical / biologic product hierarchy, where the terming guidance for FSNs causes the current limit to be exceeded when a medicinal product has a large number of ingredients, which is particularly common in multivalent vaccines.

Implementers may well have hardcoded a 255-character limit into their database schemas, so a significant lead time is proposed before this change would take effect.

Introduction

The RF2 Specification for SNOMED CT Descriptions states that the overall length limit for a description is 32Kb (understood to be kilobits), equating to 4096 single-byte characters. This maximum length is then further restricted on a per description type basis, as specified in the Description Type Refset; specifically: "to a maximum length, configurable for each description type as defined in the 900000000000538005 | Description format reference set| member associated with that description type - see the Description Format Reference Set specifications document for more details."
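The arithmetic behind the 4096 figure can be checked with a short sketch. It assumes that a kilobit is 1024 bits (which is what makes 32Kb come out at 4096 bytes) and that terms are stored as UTF-8; the helper name `utf8_byte_length` is illustrative only. It also shows why the phrase "single-byte characters" matters: a term containing accented characters occupies more bytes than it has characters.

```python
# Assumption: 1 kilobit = 1024 bits, so 32 Kb = 32768 bits = 4096 bytes.
LIMIT_KILOBITS = 32
LIMIT_BYTES = LIMIT_KILOBITS * 1024 // 8


def utf8_byte_length(term: str) -> int:
    """Return the number of bytes a term occupies when encoded as UTF-8."""
    return len(term.encode("utf-8"))


ascii_term = "a" * 4096          # 4096 single-byte characters
accented_term = "\u00e9" * 4096  # 4096 two-byte characters (e with acute accent)

print(LIMIT_BYTES)                      # 4096
print(utf8_byte_length(ascii_term))     # 4096
print(utf8_byte_length(accented_term))  # 8192
```

Implementers who size storage in bytes rather than characters may therefore want to confirm which unit their database column type actually uses.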

At the October 2023 SNOMED International Business Meetings, the Modeling Advisory Group discussed a requirement (that had been brought up by the Editorial Advisory Group) to increase the limit for Fully Specified Name (FSN) and Synonym descriptions from the current 255 characters up to 4096 characters. This represents the largest number of characters allowed by the specification, and would bring these two description types into line with the existing limit for the Text Definition description type. The Modeling Advisory Group agreed that it would already be within the specification of RF2 to make this change just by increasing the size specified in the Description Type reference set. However, this would quite likely have an impact on implementers who may have created data storage structures that are not dynamically sized based on the values in the Description Type file.

Background and Rationale

Editorial Guidance for Medicinal Products dictates that the FSN should include a concatenation of the preferred terms of all ingredients (in alphabetical order and separated by "and"). A worst case example of this currently active in the International Edition is 1162634005 |Pediatric vaccine product containing only acellular Bordetella pertussis, Clostridium tetani and Corynebacterium diphtheriae toxoids, Haemophilus influenzae type b conjugated, Hepatitis B virus and inactivated Human poliovirus antigens (medicinal product)|

which has been tweaked from what our automated systems would normally suggest, to fit into the current limit of 255 characters. If you look at the actual list of ingredients involved, it's clear that more characters would be required to correctly represent this product.
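To illustrate how the concatenation rule pushes a term past 255 characters, here is a minimal sketch. It is not the actual automated term generator: `build_fsn` is a hypothetical helper that simply sorts ingredient terms case-insensitively and joins them with " and ". The ingredient strings are taken from the extended FSN for concept 1162634005 quoted under "Sample RF2 for testing".

```python
# Ingredient terms as they appear in the extended FSN for concept 1162634005.
INGREDIENTS = [
    "inactivated whole Human poliovirus antigen",
    "Hepatitis B virus",
    "Clostridium tetani toxoid antigen",
    "acellular Bordetella pertussis antigen",
    "Haemophilus influenzae type b capsular polysaccharide conjugated antigen",
    "Corynebacterium diphtheriae toxoid antigen",
]


def build_fsn(prefix: str, ingredients: list[str], semantic_tag: str) -> str:
    """Join ingredient terms alphabetically (ignoring case), separated by 'and'."""
    ordered = sorted(ingredients, key=str.casefold)
    body = " and ".join(ordered)
    return f"{prefix} {body} ({semantic_tag})"


fsn = build_fsn(
    "Pediatric vaccine product containing only",
    INGREDIENTS,
    "medicinal product",
)
print(len(fsn))        # 331
print(len(fsn) > 255)  # True
```

Six ingredients are already enough to exceed the current limit by 76 characters, before any further ingredient is added.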

Discussion on size limits, Editorial Guidance, and display of lengthy descriptions

The main intention with this change is to ensure that implementers allow for the maximum allowed description character length, but this does not necessarily mean that any particular release of SNOMED CT is going to feature large numbers of descriptions that get anywhere near this limit - they are expected to be quite exceptional. In particular, this increase is mostly to accommodate Fully Specified Names, as their formation is often procedurally dictated by attribute values in the concept model. Since user interfaces will generally display preferred terms, rather than FSNs, there is a low expectation of longer descriptions breaking current display formatting. Also, no use case for this length increase has yet been identified for the International Edition outside of the Medicinal Product hierarchy.

That said, the latest release of the New Zealand extension is expected to feature this concept: 264031000210106 |Plasma cell neoplasm multiple myeloma relapsed systemic anti cancer therapy regimen using dexamethasone via oral route and bortezomib via subcutaneous route chemotherapy and daratumumab via subcutaneous route immunotherapy every three week followed by daratumab via subcutaneous route immunotherapy every four week (regime/therapy)| which clocks in at 331 characters!   So this proposed change is also required more widely.

This change is intended to ensure SNOMED CT remains flexible enough to accommodate future terming requirements. By extending the 'physical' constraints on description lengths, we would suggest that the responsibility for setting reasonable term length limits be moved into Editorial Guidance and validation constraints presented to users as warnings, rather than enforcing this arbitrary limit at a database storage level. For this reason, while limits of 512 and 1024 characters have been discussed, it was felt that - at some point in the future - we would have to revisit whatever limit was chosen, and therefore it would be expedient to just set the limit to be the largest number of characters currently allowed.

Proposed Changes to RF2

SNOMED International propose to modify the existing Description Type Refset such that all three description types would have a limit of 4096 characters.   Assuming this is done in time for the January 2025 release, this file would be named der2_ciRefset_DescriptionTypeSnapshot_INT_20250101.txt

id      effectiveTime  active  moduleId            refsetId            referencedComponentId  descriptionFormat   descriptionLength
<UUID>  20140131       1       900000000000207008  900000000000538005  900000000000550004     900000000000540000  4096
<UUID>  20250101       1       900000000000207008  900000000000538005  900000000000013009     900000000000540000  4096
<UUID>  20250101       1       900000000000207008  900000000000538005  900000000000003001     900000000000540000  4096

Note that the line for Text Definition (referencedComponentId 900000000000550004) is unchanged since 20140131, having already been set to 4096 characters at that time.
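Implementers who want to size storage dynamically rather than hardcode a limit could read the per-type maximum directly from this file. The following is a minimal sketch under stated assumptions: the file is tab-separated with the header row shown above, `load_length_limits` is a hypothetical helper name, and the `id` values in the sample data are placeholders rather than real UUIDs.

```python
import csv
import io

# Sample rows mirroring the proposed snapshot; id values are placeholders.
SAMPLE_REFSET = (
    "id\teffectiveTime\tactive\tmoduleId\trefsetId\t"
    "referencedComponentId\tdescriptionFormat\tdescriptionLength\n"
    "uuid-1\t20140131\t1\t900000000000207008\t900000000000538005\t"
    "900000000000550004\t900000000000540000\t4096\n"
    "uuid-2\t20250101\t1\t900000000000207008\t900000000000538005\t"
    "900000000000013009\t900000000000540000\t4096\n"
    "uuid-3\t20250101\t1\t900000000000207008\t900000000000538005\t"
    "900000000000003001\t900000000000540000\t4096\n"
)


def load_length_limits(handle) -> dict[str, int]:
    """Map each description type id to its maximum length, using active rows only."""
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return {
        row["referencedComponentId"]: int(row["descriptionLength"])
        for row in reader
        if row["active"] == "1"
    }


limits = load_length_limits(io.StringIO(SAMPLE_REFSET))
for type_id, max_length in sorted(limits.items()):
    print(type_id, max_length)
```

With a real release, the `io.StringIO` object would be replaced by the opened snapshot file, and the resulting values could drive column sizes or validation rules.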

Sample RF2 for testing

This delta zip archive contains the proposed changes to the Description Type Refset as well as a single description which is 331 characters in length - the FSN for Concept 1162634005 extended to conform to what our automated term generator would produce: "Pediatric vaccine product containing only acellular Bordetella pertussis antigen and Clostridium tetani toxoid antigen and Corynebacterium diphtheriae toxoid antigen and Haemophilus influenzae type b capsular polysaccharide conjugated antigen and Hepatitis B virus and inactivated whole Human poliovirus antigen (medicinal product)". The effective time of the changes has been set to 20240505. System administrators can try loading this file into test systems to check how the longer description length setting and usage will affect them.

xSnomedCT_InternationalRF2_TestDescLenDelta_20240505T120000Z.zip
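Before loading the archive, it may help to scan a Description file for terms that would not fit a legacy 255-character column. The sketch below assumes the standard tab-separated RF2 Description columns. `find_long_terms` is a hypothetical helper, and the description ids, the case significance values and the short synonym row are illustrative data rather than content of the archive.

```python
import csv
import io

LEGACY_LIMIT = 255  # the current limit many schemas may have hardcoded

LONG_TERM = (
    "Pediatric vaccine product containing only "
    "acellular Bordetella pertussis antigen and "
    "Clostridium tetani toxoid antigen and "
    "Corynebacterium diphtheriae toxoid antigen and "
    "Haemophilus influenzae type b capsular polysaccharide conjugated antigen and "
    "Hepatitis B virus and "
    "inactivated whole Human poliovirus antigen "
    "(medicinal product)"
)

# Illustrative rows: ids and the second term are made up for this example.
SAMPLE_DESCRIPTIONS = (
    "id\teffectiveTime\tactive\tmoduleId\tconceptId\tlanguageCode\t"
    "typeId\tterm\tcaseSignificanceId\n"
    f"desc-long\t20240505\t1\t900000000000207008\t1162634005\ten\t"
    f"900000000000003001\t{LONG_TERM}\t900000000000448009\n"
    "desc-short\t20240505\t1\t900000000000207008\t1162634005\ten\t"
    "900000000000013009\tShort example term\t900000000000448009\n"
)


def find_long_terms(handle, limit: int = LEGACY_LIMIT) -> list[tuple[str, str, int]]:
    """Return (description id, concept id, term length) for terms over the limit."""
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [
        (row["id"], row["conceptId"], len(row["term"]))
        for row in reader
        if len(row["term"]) > limit
    ]


too_long = find_long_terms(io.StringIO(SAMPLE_DESCRIPTIONS))
print(too_long)  # [('desc-long', '1162634005', 331)]
```

Note that `len` counts characters, not bytes, so a system that sizes columns in bytes may need a stricter check.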

Next Steps

SNOMED International request that users of SNOMED CT provide feedback on this proposal via this form. SNOMED International will respond to feedback received - on this page - until the end of this consultation exercise on 31 July 2024.  
