Note

This page contains some advice on generating multiple keyword index tables. However, please note that the wide availability of open source search tools, and of search capabilities built into databases, makes this advice less relevant than it was when originally issued.

The performance of single keyword searches is highly dependent on the number of candidate descriptions returned by the keyword for subsequent filtering. The extremely high number of matches for some words in common use makes it likely that some searches will be unacceptably slow.

One way to alleviate this problem is to create a table containing a row for all combinations of word pairs in each description. In some database environments that support optimization of multiple key searches, this may offer no benefits. However, in other environments, such a table may substantially speed searches.

A comprehensive word pair table would be very large. Such a table covering the full content of SNOMED CT would contain approximately 1.5 million unique word pairs and 6 million rows. Limiting the unique keys to the first three letters of each word reduces the table size to a more readily optimized set of keys. This requires the final part of the search to be conducted using text comparison (since the keys are incomplete).

...

Although Dualkey indexes are available as part of the Developer Toolkit, it is important to know how this table is generated. SNOMED CT users that generate Extensions should follow the method outlined below to generate new entries in the Dualkey index, based on the descriptions in the Extension.

...

Generating a dual key index

For each description, parse the text of the term:

  • To avoid inappropriate case mismatches, convert all characters to the same case;
  • Extract words by breaking at spaces, punctuation marks, and brackets;
  • For each word of three characters or more that is not in a list of excluded words, extract the first 3 characters, and arrange the word fragments in alphabetical order;
  • Generate the dual keys for this description by concatenating each word fragment with those that come after it in the list;
  • For each dual key, add a row to the word pair tables.
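
The steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name dual_keys and the contents of EXCLUDED_WORDS are hypothetical (this page does not give the actual excluded word list), and duplicate fragments are dropped so that each dual key is generated once per description.

```python
import re
from itertools import combinations

# Hypothetical excluded-word list; the real list is not given on this page.
EXCLUDED_WORDS = {"WITH", "AND", "FOR", "THE"}


def dual_keys(term, excluded=EXCLUDED_WORDS):
    """Return the dual keys for one description term, in alphabetical order."""
    # Convert to a single case, then break at spaces, punctuation and brackets.
    words = re.split(r"[\W_]+", term.upper())
    # Keep words of three characters or more that are not excluded,
    # truncate each to its first three characters, and drop duplicates.
    fragments = sorted(
        {w[:3] for w in words if len(w) >= 3 and w not in excluded}
    )
    # Concatenate each fragment with every fragment that comes after it.
    return [first + second for first, second in combinations(fragments, 2)]


keys = dual_keys("Total replacement of hip with use of methyl methacrylate")
print(keys)
```

A description with n distinct fragments yields n(n-1)/2 dual keys, so the five fragments of this term give ten keys.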

Example: Generation of dual keywords for a sample description

Example Description

Description Identifier: 33592011
Concept Identifier: 19954002 |Reconstruction of hip with use of methyl methacrylate (procedure)|
Term: Total replacement of hip with use of methyl methacrylate

To avoid inappropriate case mismatches, convert all characters to the same case

"TOTAL REPLACEMENT OF HIP WITH USE OF METHYL METHACRYLATE"

Extract words by breaking at spaces, punctuation marks, and brackets

  1. TOTAL;
  2. REPLACEMENT;
  3. OF;
  4. HIP;
  5. WITH;
  6. USE;
  7. OF;
  8. METHYL;
  9. METHACRYLATE.

For each word of three characters or more that is not in a list of excluded words, extract the first 3 characters, and arrange the word fragments in alphabetical order.

  1. HIP;
  2. MET;
  3. REP;
  4. TOT;
  5. USE.

Note

In this example "OF" and "WITH" are excluded as they are in a list of excluded words, while "MET" is duplicated (it comes from both "METHYL" and "METHACRYLATE"), so it is included only once.

Generate the dual keys for this description by concatenating each word fragment with those that come after it in the list

For each dual key, add rows to the word pair tables


 

Dual key    Concept Identifier
HIPMET      19954002 |Reconstruction of hip with use of methyl methacrylate (procedure)|
HIPREP      19954002 |Reconstruction of hip with use of methyl methacrylate (procedure)|
HIPTOT      19954002 |Reconstruction of hip with use of methyl methacrylate (procedure)|
HIPUSE      19954002 |Reconstruction of hip with use of methyl methacrylate (procedure)|
METREP      19954002 |Reconstruction of hip with use of methyl methacrylate (procedure)|
METTOT      19954002 |Reconstruction of hip with use of methyl methacrylate (procedure)|
METUSE      19954002 |Reconstruction of hip with use of methyl methacrylate (procedure)|
REPTOT      19954002 |Reconstruction of hip with use of methyl methacrylate (procedure)|
REPUSE      19954002 |Reconstruction of hip with use of methyl methacrylate (procedure)|
TOTUSE      19954002 |Reconstruction of hip with use of methyl methacrylate (procedure)|

Example Dual Key Index

Dual key    Description Identifier
HIPMET      33592011
HIPREP      33592011
HIPTOT      33592011
HIPUSE      33592011
METREP      33592011
METTOT      33592011
METUSE      33592011
REPTOT      33592011
REPUSE      33592011
TOTUSE      33592011
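
As a rough illustration of how such rows might be stored and looked up, the sketch below loads the ten example rows into an in-memory SQLite table using Python's standard sqlite3 module. The table name dualkey_description, its column names, and the index name are assumptions made for this example; they are not taken from the Developer Toolkit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table and column names, chosen for this sketch only.
conn.execute(
    "CREATE TABLE dualkey_description ("
    "dual_key TEXT NOT NULL, description_id INTEGER NOT NULL)"
)
# An index on the dual key lets exact-match lookups avoid a full table scan.
conn.execute("CREATE INDEX idx_dual_key ON dualkey_description (dual_key)")

example_keys = ["HIPMET", "HIPREP", "HIPTOT", "HIPUSE", "METREP",
                "METTOT", "METUSE", "REPTOT", "REPUSE", "TOTUSE"]
# One row per dual key, each pointing at the example description.
conn.executemany(
    "INSERT INTO dualkey_description (dual_key, description_id) VALUES (?, ?)",
    [(key, 33592011) for key in example_keys],
)

rows = conn.execute(
    "SELECT description_id FROM dualkey_description WHERE dual_key = ?",
    ("HIPREP",),
).fetchall()
print(rows)  # [(33592011,)]
```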


Searching for descriptions using a dual key index

A search on the dual key index can only be carried out if the user enters a search string that contains at least two word fragments, both of which are three characters or more in length. If the search string does not meet this criterion, the single keyword search mechanism must be used.

  • The user-typed search string is converted to consistent case;
  • The string is parsed, breaking at spaces and punctuation characters;
  • For each word of three characters or more, extract the first 3 characters, and arrange the word fragments in alphabetical order;
  • Create a dual key by concatenating the first two 3 letter word fragments;
  • Use this dual key to look up exact matches on the word pair index;
  • Descriptions found by searching on the word pair index are screened, to see if they contain the complete words in the original search string.
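
The search steps above might be sketched as follows. The function names, the dictionary-based index, and the terms lookup are all hypothetical stand-ins for real tables. The sketch also assumes that a trailing "*" in a search word means "any word beginning with", which is inferred from the example on this page rather than stated as a rule.

```python
import re


def query_words(query):
    """Upper-case the query and split it into words, keeping a trailing '*'."""
    return re.findall(r"[^\W_]+\*?", query.upper())


def term_words(term):
    """Upper-case a description term and split it at spaces and punctuation."""
    return re.findall(r"[^\W_]+", term.upper())


def search_key(query):
    """Return the dual key for a query, or None if it has fewer than two fragments."""
    stems = [w.rstrip("*") for w in query_words(query)]
    fragments = sorted({s[:3] for s in stems if len(s) >= 3})
    if len(fragments) < 2:
        return None  # the single keyword search mechanism must be used instead
    return fragments[0] + fragments[1]


def matches(pattern, words):
    """Assumed wildcard rule: 'ABC*' matches any word starting with 'ABC'."""
    if pattern.endswith("*"):
        return any(w.startswith(pattern[:-1]) for w in words)
    return pattern in words


def search(query, index, terms):
    """Look up candidates by dual key, then screen them against the full query."""
    key = search_key(query)
    if key is None:
        return None
    patterns = query_words(query)
    return [
        description_id
        for description_id in index.get(key, [])
        if all(matches(p, term_words(terms[description_id])) for p in patterns)
    ]


# Tiny in-memory stand-ins for the word pair index and the description table.
index = {"HIPREP": [33592011], "HIPTOT": [33592011]}
terms = {33592011: "Total replacement of hip with use of methyl methacrylate"}

print(search("hip replac*", index, terms))   # [33592011]
print(search("hip replant*", index, terms))  # []
print(search("hip", index, terms))           # None
```

The listed search steps do not mention the excluded word list. A search string containing an excluded word would presumably need the same filter that was applied when the index was generated; this sketch omits it.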

Example: Search using word pair index

User searches for "PYRO* 1 OXYGEN*".

The string is parsed, breaking at spaces and punctuation characters.

  1. "PYRO*";
  2. 1;
  3. "OXYGEN*".

For each word of three characters or more, extract the first 3 characters, and arrange the word fragments in alphabetical order.

  1. "OXY";
  2. "PYR".

Create a dual key by concatenating the first two 3 letter word fragments.

  • OXYPYR

Use this dual key to look up exact matches on the word pair index.

Sample results of a search for "PYRO* 1 OXYGEN*"

Dual key    Description Identifier    Description
OXYPYR      1969019                   104951019 |2,5-Dihydroxy-pyridine oxygenase|
OXYPYR      22565018                  22565018 |pyrogallol 1,2-oxygenase|
OXYPYR      104951019                 104951019 |2,5-Dihydroxy-pyridine oxygenase|


  • Descriptions found by searching on the word pair index are screened, to see if they contain the complete words in the original search string:
    • Description 1969019 is eliminated since it does not contain the word "1";
    • Description 104951019 is eliminated since it does not contain the word "1" or any word beginning with the string "pyro".
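
The screening step can be illustrated with a small helper that reports which search words a candidate term fails to satisfy. The function name unmatched_words is hypothetical, and the treatment of a trailing "*" as "any word beginning with" is an assumption based on the wording of the example. Only the two terms that are legible in the sample results are used.

```python
import re


def unmatched_words(query, term):
    """Return the search words that the candidate term does not satisfy."""
    words = re.findall(r"[^\W_]+", term.upper())
    unmatched = []
    for pattern in re.findall(r"[^\W_]+\*?", query.upper()):
        if pattern.endswith("*"):
            # Assumed wildcard rule: match any word with this prefix.
            found = any(w.startswith(pattern[:-1]) for w in words)
        else:
            # Without a wildcard the complete word must be present.
            found = pattern in words
        if not found:
            unmatched.append(pattern)
    return unmatched


query = "PYRO* 1 OXYGEN*"
kept = unmatched_words(query, "pyrogallol 1,2-oxygenase")
dropped = unmatched_words(query, "2,5-Dihydroxy-pyridine oxygenase")
print(kept)     # []
print(dropped)  # ['PYRO*', '1']
```

A candidate is retained only when the returned list is empty, so the first term passes the screen and the second is eliminated for the two reasons given above.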