Notes 4 Intro to Clinical Terminology

Added: April 30, 2008. Presentation Category: Education. All Rights Reserved.
Presentation Transcript

Introduction to Clinical Terminology and Classification, Clinical Decision Support L4: AL Rector; OpenGALEN; TopThing UK; The Medical Informatics Group, U of Manchester; www.cs.man.ac.uk/mig/galen; www.opengalen.org; www.topthing.com; rector@cs.man.ac.uk


The Vision: Best Practice; Best Practice


OpenGALEN: Philosophy: Terminology is software. Terminology is the interface between people and machines. Re-use is the key: patient-centred information. Terminology must have a purpose. Always ask: “What’s it for?” Not art for art’s sake. Terminology supports clinical applications - not vice versa. Applications are for someone to do something for somebody. Keep the ‘Horse before the Cart’. Always ask: “How will we know if it works?” “How will we know if it fails?”


OpenGALEN: Key ideas: Separation of kinds of knowledge: terminology, medical record and information system schemas; concepts, language, coding, indexing, pragmatics; machine level, user level. Knowledge is fractal! There will always be more detail to be added, therefore terminologies must be extensible. Formal logical support: too big and complicated to maintain by hand; extensibility requires rules; software needs logical rigour.


Axes for kinds of Knowledge: Machine level / Human level. Concepts / Language / Coding / Indexing / Pragmatics & User Interface. Terminology / Medical Records / Information systems.


Uses of Terminology: Clinical Epidemiology and quality assurance: reproducibility / comparability; indexing. Software: re-use! Integration and messaging between systems; authoring and configuring systems; data capture and presentation (user interface); indexing information and knowledge (meta-data, the Web).


History: Origins of existing terminologies: Epidemiology: ICD - Farr in 1860s to ICD9 in 1979, international reporting of morbidity/mortality; ICPC - 1980s, clinically validated epidemiology in primary care, now expanded for use in Dutch GP software. Librarianship: MeSH - NLM from around 1900 - Index Medicus & Medline; EMTree - from Elsevier in 1950s - EMBase. Remuneration: ICD9-CM (Clinical Modification) 1980, 10 x larger than ICD; aimed at US insurance reimbursement.


Traditional Systems: Built by people for interpretation by people (coding clerks): most knowledge implicit in rubrics; must understand medicine to use intelligently. Not built for software: on paper for use on paper. Enumerated - top down, all possibilities listed. Serial - single use - single view. Hierarchical thesauri: traditional terminological techniques from librarianship; ‘Broader than’ / ‘Narrower than’ (ISO 1087); no logical foundation. Focused on ‘terms’: language and concepts mixed; synonyms, preferred terms, etc. caused confusion.


History (2): Pathology indexing: SNOMED 1970s to 1990 (SNOMED International); first faceted or combinatorial system; topography, morphology, aetiology, function; plus diseases cross-referenced to ICD9. Specialty systems: mostly similar hierarchical systems; ACR-NEMA/SDM - Radiology; NANDA, ICNP… - Nursing; …


History (3): Early computer systems: Read I (4 digit Read): aimed at saving space on early computers, 1-5 Mbyte / 10,000 patients; hierarchical, modelled on ICD9; detailed signs and symptoms for primary care; purchased by UK government in 1990. Single use; morbidity indexing. Medical Entities Dictionary (MED): Jim Cimino.


History (4): Aspirations for electronic patient records (EPRs): Weed’s Problem Oriented Medical Record; direct entry by health care professionals. Aspirations for decision support: Ted Shortliffe (MYCIN), Clem McDonald (computer-based reminders), Perry Miller (critiquing), ... Aspirations for re-use: patient-centred information. Needed common multi-use, multi-purpose terminology. None worked.


Summary of Changes at end of 1st Generation: From terminologies for people to terminologies for machines. From paper to software. From single use to multiple re-use for patient-centred systems. From entry by coding clerks to direct entry by health care professionals. From pre-defined reporting for statistics to reliable indexing for decision support.


Problems with ‘First Generation’ Enumerated Systems in coping with these changes


Problems (1): Scaling!!! More detail and more specialities required scaling up, but... the combinatorial explosion. Example: Burns: 100 sites x 3 depths → 404 codes; 5 subsites/site x chemical or thermal → 7,272; x 3 extents x 3 durations → 116,352. ‘The Persian chessboard’: 2^64 ≈ 10^19; 10^19 grains of rice ≈ 100 billion tonnes of rice; 10^19 nanoseconds ≈ 300 years. Read II grew from 20,000 to 250,000 terms in ~100 staff-years: still too small to be useful but too big to use.
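
The burn figures on this slide can be reproduced if every axis is given one extra "unspecified" value. That reading is an inference from the numbers, not something the slide states. A minimal Python sketch under that assumption (the function and variable names are illustrative):

```python
from math import prod

def code_count(axis_sizes, allow_unspecified=True):
    """Number of pre-enumerated codes needed to cover every combination.

    Assumption (not stated on the slide): each axis gets one extra
    'unspecified' value, which is what makes 100 x 3 come out as 404.
    """
    extra = 1 if allow_unspecified else 0
    return prod(size + extra for size in axis_sizes)

sites_depths = code_count([100, 3])              # sites x depths
with_subsites = code_count([100, 3, 5, 2])       # + subsites, chemical/thermal
with_extent = code_count([100, 3, 5, 2, 3, 3])   # + extents, durations

# The slide's 'Persian chessboard' figure.
chessboard = 2 ** 64

print(sites_depths, with_subsites, with_extent)
print(f"{chessboard:.1e}")
```

Each new axis multiplies the total, so a handful of small refinements is enough to push an enumerated list past what anyone can author or browse.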


Problems (2): Information implicit in the rubrics: “Hypertension excluding pregnancy”. Computers can’t read! Invisible to software. No explicit information except the hierarchy. Minimal support for software: no opportunity to use software to help. Language and concepts confused: synonyms, preferred terms, homonyms. Only simple look-up and spelling correction.


Problems (3): Mixed organisation: ‘Heart diseases’ in 13 of 19 chapters of ICD (tumours, infections, congenital abnormalities, toxic, …); ‘Steroids’ in five chapters of standard drug classifications (anti-inflammatories, anti-asthmatics, …). Unreliable for indexing or abstractions: how to say something about ‘all heart diseases’? Fixed organisation: single hierarchy - single use. Where to put ‘gout’ - arthritis or metabolic disease? Back and forth in each edition of ICD. No re-use.


Problems 3b: Thesauri rather than Classifications


Problems (4): ‘Semantic identifiers’: codes are really paths - moving a concept meant changing its code. 3 Cardiovascular disorders … 3.4 Disorders of artery … 3.4.2 Disorders of coronary artery … 3.4.2.3 Coronary thrombosis … Easy to process but... reorganisation requires changing codes; codes cannot be permanent.
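
To make the point about path-style codes concrete, here is a small Python sketch. The four codes come from the slide; the helper functions and the target chapter `3.9` are made up for illustration.

```python
# Path-style ('semantic') codes: the code itself spells out the position.
path_codes = {
    "3": "Cardiovascular disorders",
    "3.4": "Disorders of artery",
    "3.4.2": "Disorders of coronary artery",
    "3.4.2.3": "Coronary thrombosis",
}

def parent_of(code):
    """Derive the parent from the code alone: this is the 'easy to process' part."""
    head, _, _ = code.rpartition(".")
    return head or None

def move(codes, code, new_parent, new_suffix):
    """Moving a concept forces a new code, so records holding the old one go stale."""
    new_code = f"{new_parent}.{new_suffix}"
    codes[new_code] = codes.pop(code)
    return new_code

old_code = "3.4.2.3"
old_parent = parent_of(old_code)
new_code = move(path_codes, old_code, "3.9", "1")  # '3.9' is a hypothetical chapter
```

After the move, any patient record still carrying `3.4.2.3` points at nothing, which is why such codes cannot be permanent.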


Problems (5): Maintenance: 20 years from ICD9 to ICD10; ~100 person-years from Read 1 to Read 3; mega francs/guilders/crowns/marks on European coding schemes; thousands of unpaid hours of committee time. Impossible / meaningless decisions take longest: you can search forever for something that is not there. Multiple uses compete - must choose one use: most successful were clear about their purpose - ICD, ICPC, MeSH. Codes change meaning with version changes: old data misleading!


Problems (6): Version-specific artefacts. “Not otherwise specified” (NOS): used to move a general concept ‘down’. “Not elsewhere classified” (NEC): catch-all - nowhere else in coding system, e.g. ‘Tumour not elsewhere classified’; dependent on version. “Other”: catch-all - not listed below, e.g. “Other diseases of the cardiovascular system”; dependent on version. Not used consistently.


Problem (7): Language is slippery: Two hands or Four?


Language/Concepts are slippery: Human cognition makes it look easy; logic fails to capture it. Classification is easy until you try to do it: trying since Aristotle in the West and Ancient Chinese in the East. Words/concepts mean what a community decides they mean: Does a chimpanzee have four hands? Is a prion alive? Is surgery on the ovary a kind of ‘Endocrine surgery’? Easier to agree on the concrete than the abstract. Easy to agree on useful abstractions and generalisations; harder to agree on how to name them.


Problems (8): There is no re-use - there is no standard. The ‘grand challenge’: a common controlled vocabulary for medicine. But re-use requires multiple different views: people’s needs differ / people do and find different things. By profession: doctors and specialties, nurses, physiotherapists, dentists… By situation: inpatient, outpatient, primary care, community… By task: diagnosis, management, prescribing, patient care, public health, quality assurance, management, planning. By country and community: US, UK, France, Germany, Japan, Korea, ...


Summary of Problems 1st Generation Enumerated Systems: Enumerated single hierarchies: list all possibilities in advance; cannot cope with fractal knowledge. Most knowledge implicit: invisible to software. Can’t agree on common concepts and classification: unreliable for indexing. Difficult to use for healthcare professionals: no support for user interface. Can’t build and maintain big classifications: language and concepts don’t translate easily to logic and software.


Cimino’s Desiderata (1): Concept orientation: separate language (terms) and concepts (codes). Concept permanence: never re-use a code (‘retire’ it). Nonsemantic concept identifiers: separate the code from the path. Polyhierarchy: allow one concept to be classified in multiple ways; gout can be both a metabolic disease and an arthritis.
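
The four desiderata on this slide map naturally onto a small data structure. The following Python sketch is one possible reading, not a description of any existing terminology server; the class and method names are invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Concept:
    cid: int                                    # nonsemantic: carries no path or meaning
    terms: list = field(default_factory=list)   # language kept apart from the concept
    parents: set = field(default_factory=set)   # polyhierarchy: any number of parents
    retired: bool = False                       # permanence: retire, never re-use

class Terminology:
    def __init__(self):
        self._concepts = {}
        self._next_id = 1

    def add(self, preferred_term, parents=()):
        concept = Concept(self._next_id, [preferred_term], set(parents))
        self._concepts[concept.cid] = concept
        self._next_id += 1          # ids only ever increase, so none is re-used
        return concept.cid

    def retire(self, cid):
        self._concepts[cid].retired = True

    def ancestors(self, cid):
        """All concepts reachable by following parent links upwards."""
        seen = set()
        stack = list(self._concepts[cid].parents)
        while stack:
            parent = stack.pop()
            if parent not in seen:
                seen.add(parent)
                stack.extend(self._concepts[parent].parents)
        return seen

t = Terminology()
metabolic = t.add("Metabolic disease")
arthritis = t.add("Arthritis")
gout = t.add("Gout", parents=[metabolic, arthritis])
```

Because the identifier says nothing about position, gout can sit under both parents, and reclassifying it later changes only its `parents` set, never its code.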


Cimino’s Desiderata (2): Formal definitions, i.e. ‘be compositional’. Reject ‘Not elsewhere classified’: concept permanence and NEC. Multiple granularities: organ, tissue, cellular, molecular; grades, types, classes of diseases; special clinical criteria. Multiple consistent views: allow different organisations, e.g. functional, anatomical, pathological.


Cimino’s Desiderata (3): Represent context: family history, risk, source of information. Evolve gracefully: allow controlled changes. Recognise redundancy (equivalence): ‘Carcinoma’ + ‘Lung’ ?=? ‘Carcinoma of the lung’. How would we know? How could a machine know?
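
One way a machine could answer the closing question is to expand every pre-coordinated name into its definition and compare the results. The Python sketch below shows only that structural normalisation step. The definitions table and the `hasSite` relation are hypothetical, and a real description logic classifier does considerably more (subsumption, inherited criteria).

```python
# Hypothetical definitions table: a pre-coordinated name and what it means.
DEFINITIONS = {
    "CarcinomaOfLung": ("Carcinoma", {("hasSite", "Lung")}),
}

def normalise(base, criteria=frozenset()):
    """Expand defined names until only primitives remain.

    Returns (primitive base, frozenset of (relation, value) pairs).
    """
    criteria = set(criteria)
    while base in DEFINITIONS:
        base, extra = DEFINITIONS[base]
        criteria |= extra
    return base, frozenset(criteria)

def equivalent(a, b):
    """Two expressions are treated as equivalent if their normal forms match."""
    return normalise(*a) == normalise(*b)

post_coordinated = ("Carcinoma", {("hasSite", "Lung")})
pre_coordinated = ("CarcinomaOfLung",)
print(equivalent(post_coordinated, pre_coordinated))
```

The comparison only works because the relation between ‘Carcinoma’ and ‘Lung’ is written down explicitly; with a bare ‘+’ there would be nothing to normalise.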


Solution Generation 1: Megaterm + Crossmapping = UMLS: Clinical applications; medical records; data entry; decision support.


Solution 1 Cross-mapping & UMLS: Unified Medical Language System (UMLS) from US National Library of Medicine. De facto common registry for vocabularies. Concept Unique Identifiers (CUIs) and Lexical Unique Identifiers (LUIs) are de facto the common nomenclature.


Solution 1 Cross-mapping & UMLS: An invaluable resource, but... No better than the vocabularies which are mapped: limited detail for patient care; unreliable for indexing or abstraction of knowledge; best for relating everything to MeSH for indexing literature. Still limited by combinatorial explosion. Still can’t cope with fractal knowledge. Not extensible - no help in building or extending terminologies; no help in reorganising existing terminologies to re-use for new purposes. Top down. Information still implicit. Minimal help with software: no help with data capture, user interfaces.


Solutions Generations 2-3: Compositional Systems: Beat the combinatorial explosion. Build concepts out of pieces - Lego. Dictionary and grammar rather than phrasebook. But hard.


Solution Generation 1.5: Faceted: Faceted systems: SNOMED International. Inflammation + Lung + Infection + Pneumococcus → Pneumococcal pneumonia. Limit combinatorial explosion, but… Rigid - a limited number of axes / facets / chapters: each facet has the problems of a first generation enumerated system. Much knowledge still implicit: no way to know how identifiers relate; no explicit relations, only ‘+’; no way to recognise redundancy / equivalence. No help with data capture or user interface / no way to recognise nonsense: Carcinoma + Hair + Donkey + Emotional → ???? Still can’t cope with fractal knowledge. Limited extensibility: limited help with building, extending or reorganising. Still top down.
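
The weakness of ‘+’ as the only combinator can be shown in a few lines of Python. The facet names and the assignment of codes to facets below are illustrative assumptions, not SNOMED International's actual axes or content.

```python
# Illustrative facets only; the grouping of codes is an assumption.
FACETS = {
    "morphology": {"Inflammation", "Carcinoma"},
    "topography": {"Lung", "Hair"},
    "aetiology": {"Infection"},
    "organism": {"Pneumococcus", "Donkey"},
    "function": {"Emotional"},
}
ALL_CODES = set().union(*FACETS.values())

def combine(*codes):
    """A faceted 'expression' is just a bag of known codes joined by '+'."""
    unknown = set(codes) - ALL_CODES
    if unknown:
        raise ValueError(f"unknown codes: {sorted(unknown)}")
    return frozenset(codes)

pneumonia = combine("Inflammation", "Lung", "Infection", "Pneumococcus")
nonsense = combine("Carcinoma", "Hair", "Donkey", "Emotional")
```

Both calls succeed: the only check available is that each code exists. Nothing records that the lung is the site or that the pneumococcus is the cause, so there is no basis for rejecting the second combination.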


Generation 2: Enumerated Compositional: Read III with qualifiers. Inflammation: site: lung, cause: pneumococcus → Pneumococcal pneumonia. More semantics but… Limited qualifiers - limited views - limited re-use. Limited help with data capture - user interface difficult. Much information still implicit - limited software support. No way to recognise redundancy / equivalence / errors. Organisation still mixed - indexing better but still unreliable. Limited separation of language and concepts. Still can’t cope with fractal knowledge. Limited extensibility; limited help with building and reorganising terminologies. Top down.


CT Vocabulary: “Reference Terminology” vs “Interface Terminologies”: reference terminology = enumerated hierarchy of formally defined terms; interface terminology = navigation structure for user interface, explicitly excluded from SNOMED-RT. “Terming”, “Coding”, and “Grouping”: terming - finding the lexical string; coding - finding the correct unique code (concept); grouping - putting codes into groupers for epidemiological or other purposes.


Generation 2.5 Pre-coordinated Formal Compositions: SNOMED-RT (SNOMED-CT?): formal logical model for classifying a fixed list of definitions; simple fixed ontology (7 links). GALEN-derived terminologies: UK Drug Ontology; procedure classifications.


Generation 2.5 Pre-coordinated Formal Compositions: More semantics. Limited ability to cope with combinatorial explosion: any one pre-coordinated terminology is of fixed size, but arbitrarily many terminologies might be derived. Limited ability to cope with fractal knowledge. Limited extensibility: extensibility requires access to ‘Workbench’. Bottom up / middle out. More explicit information: logical criteria for correctness / redundancy / equivalence; based on knowledge representation (ontologies) and description logics. Limited support for data capture and user interface.


Generation 3: Post-Coordinated Formal Concept Model with Constraints delivered as Software Services: OpenGALEN Reference Model - PEN&PAD/Clinergy™. Inflammation which hasCause (Infection which hasCause Pneumococcus) → PneumococcalPneumonia → “Pneumococcal Pneumonia”. A dictionary and grammar rather than a phrase book. Software rather than data. A sound logical and ontological foundation.
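
The nested expression on this slide can be mimicked with a small recursive structure plus a table of permitted links. This Python sketch is a toy under stated assumptions: it is not GRAIL syntax or semantics, and the `SANCTIONED` and `NAMES` tables are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Expr:
    base: str
    criteria: tuple = ()     # tuple of (relation, Expr) pairs

    def which(self, relation, value):
        """Return a new expression refined by one more explicit relation."""
        return Expr(self.base, self.criteria + ((relation, value),))

# Hypothetical constraints: which (base, relation, value-base) links make sense.
SANCTIONED = {
    ("Inflammation", "hasCause", "Infection"),
    ("Infection", "hasCause", "Pneumococcus"),
}

def is_sensible(expr):
    """Accept an expression only if every link in it, at any depth, is sanctioned."""
    return all(
        (expr.base, relation, value.base) in SANCTIONED and is_sensible(value)
        for relation, value in expr.criteria
    )

pneumonia = Expr("Inflammation").which(
    "hasCause", Expr("Infection").which("hasCause", Expr("Pneumococcus"))
)
nonsense = Expr("Inflammation").which("hasCause", Expr("Donkey"))

# Hypothetical naming table: composed concept to display string.
NAMES = {pneumonia: "Pneumococcal Pneumonia"}
```

Compositions are built on demand rather than listed in advance, and the constraint table gives the software grounds for refusing nonsense, which is the sense in which the terminology becomes a dictionary and grammar delivered as software.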


Generation 3: Post-Coordinated Formal Concept Models: Copes with combinatorial explosion: indefinitely many compositions possible; lists not pre-enumerated. Copes with fractal knowledge: easily extensible to add more detail. Most information explicit: more comprehensive ontology (50-250 links). Good support for data capture / user interface, but requires additional pragmatic knowledge layer. Separates user view and machine view: intermediate representation vs GRAIL.


Case Study 1: The exploding bicycle: ICD-9 (E826): 8 codes; READ-2 (T30..): 81; READ-3: 87; ICD-10 (V10-19): 587. V31.22: Occupant of three-wheeled motor vehicle injured in collision with pedal cycle, person on outside of vehicle, nontraffic accident, while working for income. W65.40: Drowning and submersion while in bath-tub, street and highway, while engaged in sports activity. X35.44: Victim of volcanic eruption, street and highway, while resting, sleeping, eating or engaging in other vital activities.


Description Logics: A crash course: Thing + feature: pathological + (feature: pathological)


Defusing the exploding bicycle: 500 codes in pieces: 10 things to hit… pedestrian / cycle / motorbike / car / HGV / train / unpowered vehicle / a tree / other. 5 roles for the injured… driving / passenger / cyclist / getting in / other. 5 activities when injured… resting / at work / sporting / at leisure / other. 2 contexts… in traffic / not in traffic. V12.24: Pedal cyclist injured in collision with two- or three-wheeled motor vehicle, unspecified pedal cyclist, nontraffic accident, while resting, sleeping, eating or engaging in other vital activities.
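
The arithmetic behind "500 codes in pieces" can be checked with a short Python sketch. The axis values are those listed on the slide; note that the slide announces 10 things to hit but names only 9, and the sketch leaves the tenth unfilled rather than guessing it. The `describe` wording is illustrative.

```python
from itertools import product
from math import prod

# Values as listed on the slide (only 9 of the announced 10 'hit' values appear).
AXES = {
    "hit": ["pedestrian", "cycle", "motorbike", "car", "HGV", "train",
            "unpowered vehicle", "a tree", "other"],
    "role": ["driving", "passenger", "cyclist", "getting in", "other"],
    "activity": ["resting", "at work", "sporting", "at leisure", "other"],
    "context": ["in traffic", "not in traffic"],
}
# Axis sizes as stated on the slide.
STATED_SIZES = {"hit": 10, "role": 5, "activity": 5, "context": 2}

pieces = sum(STATED_SIZES.values())          # values someone has to maintain
combinations = prod(STATED_SIZES.values())   # codes an enumerated scheme must list

def describe(hit, role, activity, context):
    """Compose a rubric from one value per axis (wording is illustrative)."""
    return f"{role} injured in collision with {hit}, {context}, {activity}"

# Generate every rubric from the listed values instead of storing them.
listed = [describe(*combo) for combo in product(*AXES.values())]
print(pieces, combinations, len(listed))
```

With the stated sizes, 22 maintained values stand in for 500 enumerated codes; adding an eleventh thing to hit costs one new value rather than fifty new codes.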


Goodbye to picking lists…: What you hit; Your Role; Activity; Location; Cycling Accident


Other important links and initiatives: HL7 Vocabulary group: see HL7 web site or join list server. SNOMED-DICOM Microglossary (Radiology). Nursing initiatives - see Nick Hardiker papers. ISO TC215 WG2 / CEN TC251 WG3: see web sites.


Criteria for success: Re-use: a recognised, growing library of common decision support modules. Stop starting from scratch! Integration: 2+ independently developed DSSs integrated with 2+ independently developed EPRs without exponentially increasing effort.


Criteria for success: Authoring: no individual invests in their own terminology; enterprise-wide terminology servers. Indexing. Simplification of systems: a sharp drop in special cases and exceptions; a sharp increase in authors’ productivity.


Criteria for success: User interfaces: real systems in real use with real patients by real clinicians; transparent systems.


OpenGALEN: www.opengalen.org