Document Author(s): | Yongsheng Gao |
Change Owner: | Yongsheng Gao |
Content Editor: | Yongsheng Gao |
Version: | 1.0 |
Date Created: | 20160809 |
Document status | Draft |
Related Tracker Artifact(s): | https://jira.ihtsdotools.org/browse/IHTSDO-950 Now SCTQA-7 (same link) |
Currently, only two case significance values are used in the international release of SNOMED CT.
The issue is that the value 900000000000448009 |Entire term case insensitive| has not been used, because the existing data was migrated from RF1 and the previous authoring tool was restricted. In fact, the majority of SNOMED CT content (over 618,000 terms) should be 'Entire term case insensitive'. These terms can be freely switched to lower or upper case with no impact on their meaning. The following is the algorithm for identifying the descriptions which should be 'Entire term case insensitive'.
Examples are provided in the table row highlighted in light blue background color.
The following table demonstrates the differences between RF1 and RF2 for 4 different situations that require specifying case significance.
Situation types with examples | Only initial character of the term (RF1 spec) | The characters other than the first | Case Significance Value (RF2 spec) |
CT of abdomen; pH measurement; von Willebrand disease. | Case sensitive | Case sensitive | 900000000000017005 Entire term case sensitive (core metadata concept); Symbol: CS |
Fracture of tibia; Abdominal aorta angiogram. | Case insensitive | Case insensitive | 900000000000448009 Entire term case insensitive (core metadata concept); Symbol: ci |
Addison's disease; Down syndrome; English as a second language. | Case sensitive | Case insensitive | 900000000000017005 Entire term case sensitive (core metadata concept); Symbol: CS |
Family history of Alzheimer’s disease; Born in Australia; Borderline abnormal ECG; Main spoken language English. | Case insensitive | Case sensitive | 900000000000020002 Only initial character case insensitive (core metadata concept); Symbol: cI |
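The mapping in the table above can be written as a small lookup. The following is a minimal sketch rather than part of the project's tooling: the function name and its two boolean parameters are hypothetical, and it assumes the judgement on whether the initial character and the remaining characters are case sensitive has already been made by an author or an upstream rule.

```python
# RF2 case significance concept ids, as listed in the table above.
ENTIRE_TERM_CASE_SENSITIVE = "900000000000017005"          # CS
ENTIRE_TERM_CASE_INSENSITIVE = "900000000000448009"        # ci
ONLY_INITIAL_CHAR_CASE_INSENSITIVE = "900000000000020002"  # cI


def rf2_case_significance(initial_sensitive: bool, rest_sensitive: bool) -> str:
    """Map the two per-term judgements to an RF2 case significance id.

    Hypothetical helper: it does not inspect the term itself, it only
    encodes the four situations shown in the table.
    """
    if initial_sensitive:
        # "CT of abdomen" and "Addison's disease" both map to CS,
        # whatever the case sensitivity of the remaining characters.
        return ENTIRE_TERM_CASE_SENSITIVE
    if rest_sensitive:
        # "Born in Australia": only the initial character may change case.
        return ONLY_INITIAL_CHAR_CASE_INSENSITIVE
    # "Fracture of tibia": the whole term may change case.
    return ENTIRE_TERM_CASE_INSENSITIVE
```

Under this reading, the RF1 "initial character" flag alone cannot distinguish the second and fourth situations, which is why the third RF2 value is needed.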
The assumption is that the current case significance assignments are correct according to the RF1 specification. We are aware of a number of incorrect or inconsistent assignments and capitalisations, which have to be addressed in separate projects. Please see the list of identified related issues in the agreed scope statement in the following section.
The case significance assignment needs to be updated to conform to the RF2 specification. An estimated 618,000 descriptions need to be assigned 'Entire term case insensitive'.
This is a data quality issue in the international release.
There are some incorrect assignments that cannot be systematically addressed in this project.
Assigning 900000000000448009 |Entire term case insensitive| to appropriate descriptions (over 618,000).
Identify additional changes:
The other quality issues related to case significance will be addressed in the separate projects listed below. They will require additional manual review and corrections. The batch changes to the descriptions will not fix these existing problems, but they will not make them worse either.
It would be time-consuming to modify the case significance of each description individually. A batch change to a large number of descriptions will overcome this issue and provide consistency.
The 618,931 descriptions identified in the January 2016 release will be updated. Their current case significance assignment is 900000000000020002 |Only initial character case insensitive|. This value will be replaced by 900000000000448009 |Entire term case insensitive| for these terms.
A tab-delimited text file of these descriptions will be generated. It contains the concept id, term id, term and case significance id, as shown in the following table.
ConceptID | TermID | Term | CaseSignificanceID | New_CaseSignificanceID |
104001 | 1309013 | Excision of lesion of patella | 900000000000020002 | 900000000000448009 |
104001 | 557742014 | Excision of lesion of patella (procedure) | 900000000000020002 | 900000000000448009 |
104001 | 1310015 | Local excision of lesion or tissue of patella | 900000000000020002 | 900000000000448009 |
106004 | 1313018 | Posterior carpal region | 900000000000020002 | 900000000000448009 |
106004 | 297649012 | Structure of posterior carpal region | 900000000000020002 | 900000000000448009 |
106004 | 577123019 | Structure of posterior carpal region (body structure) | 900000000000020002 | 900000000000448009 |
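A file with the columns shown above could be produced along the following lines. This is a sketch under stated assumptions, not the technical team's implementation: `write_review_file` is a hypothetical name, the input tuples are assumed to have been selected already by the identification algorithm, and terms are assumed to contain no tab characters.

```python
import io

CI_INITIAL_ONLY = "900000000000020002"  # current value (cI)
CI_ENTIRE_TERM = "900000000000448009"   # replacement value (ci)

HEADER = ["ConceptID", "TermID", "Term",
          "CaseSignificanceID", "New_CaseSignificanceID"]


def write_review_file(descriptions, out):
    """Write one tab-delimited row per description selected for the batch change.

    `descriptions` is an iterable of (concept_id, term_id, term,
    case_significance_id) tuples. Rows whose current value is not cI are
    skipped, because the batch change only replaces cI with ci.
    Returns the number of data rows written.
    """
    out.write("\t".join(HEADER) + "\n")
    written = 0
    for concept_id, term_id, term, case_id in descriptions:
        if case_id != CI_INITIAL_ONLY:
            continue
        out.write("\t".join([concept_id, term_id, term, case_id, CI_ENTIRE_TERM]) + "\n")
        written += 1
    return written


# Two rows taken from the table above; an in-memory buffer stands in for the file.
sample = [
    ("104001", "1309013", "Excision of lesion of patella", CI_INITIAL_ONLY),
    ("106004", "1313018", "Posterior carpal region", CI_INITIAL_ONLY),
]
buffer = io.StringIO()
count = write_review_file(sample, buffer)
```

Keeping both the current and the new value in each row lets reviewers confirm that only descriptions currently set to cI are affected.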
The list of terms has been reviewed by terminology authors and the obvious errors have been fixed. However, the review was intended to identify issues that apply to multiple terms. Changes to individual terms are out of scope for this project.
The technical team will apply the new case significance ids to the descriptions in the final list, which also includes the latest additions.
The three case significance values in the RF2 specification have been implemented in the editing tool for the international edition. The Editorial Guide has been reviewed and updated to reflect the changes.
Consultation with content team, technical team, implementation and education.
Feedback from content team includes:
Feedback from implementation and education:
Feedback from Content manager AG:
The incorrect case significance assignments for the identified list of descriptions will be fixed according to the RF2 specification. The impact on end users is minimal. System developers and implementers will need to update their systems to follow the RF2 specification if their systems use case significance. However, the changes will be minimal and limited to configuration. Software development is unlikely to be required. The algorithm and technical implementation should be designed for easy adoption by national extensions.
The risk of inaction is a continuing quality issue and non-compliance with the standard specification. The potential risk of making the change is a performance impact on front-end editing. Another potential impact is on resources for SCA (Single Concept Authoring) tool development. The batch changes will involve a large number of descriptions, and this kind of change has never been done before. The changes may take a long time to complete, which could affect front-end editing. Therefore, the risk needs to be assessed in the UAT environment before the change is implemented in production. The batch changes to the descriptions in scope of issues 1 to 4 will not fix the existing incorrect assignments. These have to be addressed in separate projects.
Complete | Approved by | Approval Date |
Complete | Business Services Executive | 14/07/2016 |
Complete | Head of Terminology | 14/07/2016 |
? | <Other> |
The issue of case significance in the current data needs to be addressed first. After that, future editing in the SCA tool will follow the RF2 specification. The case significance values are already available in the current SCA editing tool, and some changes have already been made by authors following the RF2 specification. As a result, the inconsistency between new additions and existing content has become more apparent. We need to fix the existing content and also introduce a QA rule to reduce the inconsistency.
There will be no frontend editing in the SCA. The steps are described in content editing section.
Randomly check changes in the SCA tool once the batch is completed.
Query the case significance assignments in the daily build of the RF2 snapshot. This will also identify further changes needed for new additions made before the batch change. The QA rule should be developed according to the algorithm proposed in this document. It will prevent new content from following the RF1 specification.
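One way to run such a query is to count active descriptions per case significance value in the description snapshot file. The sketch below assumes the standard RF2 description file layout (tab-delimited, with `active` and `caseSignificanceId` columns). The function name is hypothetical, and the `effectiveTime` and `active` values in the sample rows are illustrative only, not taken from a release.

```python
import csv
from collections import Counter

CASE_SIGNIFICANCE_SYMBOLS = {
    "900000000000017005": "CS",
    "900000000000448009": "ci",
    "900000000000020002": "cI",
}


def count_case_significance(snapshot_lines):
    """Count active descriptions per case significance symbol.

    `snapshot_lines` is any iterable of lines from an RF2 description
    snapshot, header row first. Unrecognised ids are counted as "unknown".
    """
    # RF2 files do not use quoting, so quote characters are read literally.
    reader = csv.DictReader(snapshot_lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    counts = Counter()
    for row in reader:
        if row["active"] != "1":
            continue
        symbol = CASE_SIGNIFICANCE_SYMBOLS.get(row["caseSignificanceId"], "unknown")
        counts[symbol] += 1
    return counts


# Illustrative rows only; effectiveTime and active values are made up.
SAMPLE_SNAPSHOT = [
    "id\teffectiveTime\tactive\tmoduleId\tconceptId\tlanguageCode\ttypeId\tterm\tcaseSignificanceId",
    "1309013\t20160131\t1\t900000000000207008\t104001\ten\t900000000000013009\tExcision of lesion of patella\t900000000000020002",
    "1313018\t20160131\t1\t900000000000207008\t106004\ten\t900000000000013009\tPosterior carpal region\t900000000000448009",
    "297649012\t20160131\t0\t900000000000207008\t106004\ten\t900000000000013009\tStructure of posterior carpal region\t900000000000020002",
]
counts = count_case_significance(SAMPLE_SNAPSHOT)
```

For a real daily build, the same function can be given an open file handle, for example one opened with `open(path, encoding="utf-8", newline="")`. Comparing the counts before and after the batch change gives a quick check that the expected number of descriptions moved from cI to ci.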
The default assignment has been changed from cI to ci. The runtime QA warnings have been developed and implemented.
July 2017