Search



Page tree

  

Role of the Example Database

A SNOMED CT example database is used throughout this guide to illustrate ways to meet a various requirements for practical access to SNOMED CT.  It is not essential for readers to create or have access to an instance of the example database but for some people access to a practical example of the database will be valuable.

Some readers of the guide may wish to setup and use a working version of this database to explore simply to provide themselves with another way to explore SNOMED CT. Others may use the example database to test out the efficacy and performance of the techniques described in the guide. Those with existing knowledge and experience of SQL may also wish to implement and test alternative approaches while reading this guide. Readers may also prefer to adapt the guidance in this document to another database environment with which they are already more familiar. Most of techniques used in the example database are standard features of SQL and that are supported by many relational database environments. Therefore, it should be fairly straight forward to apply or adapt the SQL examples in the guide to other database environments. On the other hand readers who are familiar with databases with more powerful capabilities than MySQL may see opportunities to use additional features to replace some techniques used in the example database.

Requirements for Creating the Database

  • A computer system running Windows or MacOS or a Unix-based environment like Linux or Ubuntu for which MySQL Server 8.0.x if available.
  • Installation of MySQL Community Edition Server (which is used to build and provide access to the database) and  MySQL Community Edition Workbench (which provides a user interface through which the database can be conveniently explored). Both of these are freely available with standard installation packages for all the most widely used operating system environments include Windows, MacOS and other Unix-based environments such as Linux and Ubuntu.
  • Those using Windows will also need to install a Perl processor, but fortunately the Strawberry Perl environment is also freely available, so there is no additional software to be purchased.
  • At least 10.5 Gb of disk storage space during the build process. Assuming the release package archive file and the expanded release package folder are deleted the storage requirement reduces to 6.5Gb after the build has completed 1 .

Creating the Example Database

Those readers wishing to create their own instance of the example database should refer to Appendix A: Building the SNOMED CT Example Database. That appendix provides instructions on the steps required to configure MySql and then to create, populate and experiment with the example SNOMED CT database. These instructions are supported by installation scripts for Windows and MacOS, which have been tested to work with the standard installations of MySQL 8.0.x Community Server for those environments.

Functionality of the Example Database

Overview

The SNOMED CT example database is designed to provide an effective and flexible way to access SNOMED CT content imported from a SNOMED CT release package. Access is provided through running SQL queries against tables containing the data and predefined database views that allow data to be selected in ways that recognize versioning information and where appropriate combine related data from different tables and views. 

Important Note

The example database in its current form should be considered as a read-only resource. Although an SQL databases enables addition, deletion and updating of data, the design of the example database is not intended to be used for editing the release data in any way. The reasons for this limitation include the conditions of the SNOMED CT license as well as lack of support for the formal processes required for authoring, change management and component versioning. Those interested in tools that support editing of SNOMED CT content and reference sets should refer to information on SNOMED International Software and Tools.

FeatureDescription
Database tables created for data in all full and snapshot release filesSeparate database tables are created for the full and snapshot release of each distinct component and reference set type. Consistent design principles are applied to these tables to match the data structure, data type and function requirements.
Import of all full release filesThe full release makes it possible to use SQL queries to access to the complete history of all SNOMED CT components and reference sets from the first release of the terminology in January 2002 up to the most recent release.
Import of all snapshot release filesThe snapshot release provides rapid access to the current view of the terminology, without the need for queries to explicitly exclude earlier versions.
Examples of computed current snapshot views

As the example database imports the snapshot release files into separate tables, there is no requirement for a computed snapshot views. However, computed current snapshot views derived from some of the full release tables are included in the example database. These views provides practical illustrations of the way to derive a current snapshot from the full release. They also allow comparisons of performance and output between SQL queries run on the snapshot table and the same queries run on the computed snapshot view.

Configurable retrospective snapshot viewsThe example database includes two configurable snapshot views of every full release database table. This allows SQL queries to be refer to and compare the current snapshot and snapshot and one or two earlier dates. A simple configuration procedure is included to allow the retrospective snapshot dates of each view to be changed.
Configurable delta viewsThe example database includes three configurable delta views of every full release database table. This allows SQL queries to be refer only to versions of a component added or changed between two specified dates. A simple configuration procedure is included to allow the delta data ranges (start and end times) of each view to be changed.
Language configuration of viewsThe example database includes a simple configuration procedure to specify the preferred display language or dialect. This is dependent on the languages in the release files, so with the International Edition only the options en-US and en-GB can be used. However, when used with National Editions that include translated descriptions and an appropriate language reference set this feature can be readily configured to support those additional language settings. The language configuration setting can be applied separately to different snapshot views allowing or can be switched as part of an sequence of queries to enable multilingual query results.
Consistent access to all components and reference set membersThe same configurable snapshot and delta views are available for all component types and reference set members and these views follow a consistent naming convention. This means individual components and reference sets can be queried in a consistent manner. It also provides a foundation for composite views that bring together data from the same snapshot view of different related tables. The features below illustrate the practical application of this principle.
Language refset dependent views of descriptions and concepts

Built-in views allow concepts to be displayed using either their fully specified name or preferred synonym in a specified language or dialect. Other built-in language dependent descriptions views include:

  • All the active synonyms of each concept (including or excluding the preferred synonym).
  • All active synonyms associated with active concepts (this view is particularly useful for text searches).
Integrated views of relationships and descriptions

The database also features built-in views that provide access to information about relationships between concepts. These views include synonym and fully specified name variants for each of the following:

  • The id and term of all subtype children of specified concept.
  • The id and term of all supertype parents of a specified concept.
  • Defining relationships as the id and term of the source, type and destination concepts followed by the relationship group number.

All references to id and term refer to a pair of columns in the output of the view. The id is the concept identifier, depending on the view the term is either the fully specified name or the preferred synonym.

Full text term searchesThe database uses a full-text index to allow searches by words within a term independent of the order in which those terms appear. Example queries demonstrate this search facility as well as a simple way to display the closest matches first. Other more technical search approaches using complete, pattern and regular expression matching can also be readily applied.  
Rapid subtype testing using a transitive closure table

The database import process includes a step that builds a transitive closure file and loads this into an indexed database table. This enables rapid testing of whether one concept is a subtype descendant or supertype ancestor of any other concept2 .

Procedure demonstrating selection of concepts based on expression constraints

This illustrates an approach to testing expression constraints. The current version of the procedure is limited to constraints including one focus concept with one or two attribute value constraints2 .

Procedure demonstrating term search limited to a specified hierarchy

This illustrates an approach to searching using full text search with the returned results limited to concepts that are subtypes of a specified concept. For example, there are more than 600 synonyms that contains both "mitral" and "valve". However if this search is limited to subtypes of a concept (e.g. procedure, body structure, observable, physical object etc.) it returns substantially fewer matches3 .

Note

This summary of the functionality the database should be interpreted in the scope and context of the its intended use as a demonstrator and learning tool. It is possible that the database may have other practical applications for some use cases, but it is not regarded as a robust or high performance solution suitable for large scale use. Rather it is designed to confirm the feasibility implementing some of the feature of SNOMED CT and to stimulate others to use and improve upon the ideas and approaches described in the guide.



Footnotes
Ref Notes
1 These storage requirements are for the International Edition Release for 2019-07-31. More storage may be required by other Editions with substantial additional national or local extension content.
2 Only the current snapshot view of the transitive closure table is available. Therefore this feature is not available for retrospective snapshots. [ a b ]
3 Only the current snapshot view of the transitive closure table is available. Therefore subtype testing is not available when searching retrospective snapshots.



Feedback