Principles of Database Design: Principles of Database Design NLM/MBL Medical Informatics
Session Outline: Session Outline Why learn this?
Database Principles and Paradigms
Principles of Relational Database Design
System design and building methods
Exercise: Transforming flat files to tables
Why Learn about Database Design?: Why Learn about Database Design? Vendors will sell you on user interfaces, but the power and flexibility is in the data model
Evaluating and comparing products
Communicating with vendors and IT support staff
Building your own databases
What is a Database?: What is a Database? An organized collection of information
Computer-based representation
Systematic, automated retrieval
Systematic, automated symbol manipulation
Historical Evolution of Databases: Historical Evolution of Databases Dedicated files created & maintained by application software (sequential, random access)
Database Management Systems (DBMSs)
Hierarchical Databases: Hierarchical Databases Lab Results 5/30/96 Serum Na+ Pt=Smith Advantages: efficient storage and I/O, rapid access via predetermined data hierarchies
Disadvantages: difficult to view/retrieve data from other perspectives, hard to modify underlying structure
Information Network Databases: Information Network Databases Advantages: Can model complex many-to-many relationships as well as hierarchies and simple lists
Disadvantages: difficult to predict & control effects of transitive relationships; recursion; I/O intensive, potential to become incomprehensible “Database as Hypertext”
Relational Databases: Relational Databases Advantages: Understandable, permits variety of logical aggregation or “views” of data elements, structure easily modifiable, new elements generally do not “break” existing programs
Disadvantages: I/O intensive, 1 logical record may = many physical records, relational integrity is a constant concern & must be under software control “Rows & Columns with inter-table references” Pt-UI Testname Date 12345 Serum_Na 5/30/96
42353 CBC 5/30/96
47756 ESR 5/30/96
12348 HBsAg 5/30/96
34523 Amylase 5/30/96 Lab_test Pt-UI Lname Fname 12345 Smith Elmer
12346 Jones Barbara
12347 Clark Arthur
12348 Jones Casey
12349 Sample Steve Patient
Object-Oriented Databases: Object-Oriented Databases Multiple data types including text, graphics, sound, signals, etc.
Encapsulation of data & programs
Interprocess messaging: e.g., “Print Yourself” Advantages: applications programs consist of high level commands & functions which do not need to know the underlying data organization; modularity, reusability and portability between systems
Disadvantages: early in commercialization; CPU intensive; few standards for query & object sharing
Fundamental Assertions about Systems Design: Fundamental Assertions about Systems Design The Data Model is the most critical aspect of system design and function
Data Models should reflect real world objects and their relationships to ensure durability
A correct Data Model subserves and outlasts applications, including many not anticipated at system start-up
Object-oriented Systems design:Basic Concepts: Object-oriented Systems design: Basic Concepts The World contains Things e.g., Collies, Terriers, Bloodhounds
We develop abstractions of things called “objects” e.g., dog
We group objects by criteria which represent the abstract object as an empty table Dog Name Breed Favorite Food Birthdate
Basic Concepts, cont’d: Basic Concepts, cont’d Empty tables can be filled in to represent the real world things from which the object was abstracted Dog Name Breed Favorite Food Birthdate Boris St. Bernard Canned Jan 81
Fifi Poodle Dry May 92
Fido Pomeranian Canned Apr 87
Basic Concepts, cont’d: Basic Concepts, cont’d There are Relationships between objects which are attributes of those objects Dog Name License Owner Name Lic. Date Relationship: “OWNS”
Dog Owner OWNS Dogs
Objects: Objects All of the real-world things in the set (the “instances”) have the same characteristics
All instances conform to the same rules
Types of Objects (ie., types of tables): Types of Objects (ie., types of tables) Tangible Things e.g., book
Roles e.g., doctor, patient, supervisor
Incidents (=events, occurences) e.g., ordering of a lab test
Interactions (bind two or more other objects via a transaction) e.g., Purchase relates Buyer to Seller
Specifications (definition tables of tangible things)
Table Notation: Table Notation Graphical Form: Patient_Admissions * Pt_ID
-Date_Adm
-Time_Adm
-Unit
-Room Textual Form:
Patient_Admissions (Pt_ID,
Date_Adm, Time_Adm, Unit,
Room)
Formalisms for Tables: Formalisms for Tables Rule 1: One instance of an object has exactly one value for each attribute (i.e, only one data element at each row-column intersection; no repeating groups, no true “holes” in table)
Rule 2: Attributes must contain no internal structure Name Age-Sex
Smith 38-F
Jones 22-M
Clark 18-M Not OK: If Rules 1 and 2 are obeyed, the data model is in “First Normal Form”
Formalisms for Tables, cont’d: Formalisms for Tables, cont’d Rule 3: Every attribute should represent a characteristic of the entire object, not a characteristic of a limited part of the object Not OK: Attribute of hospital
staff appointment, not
committee Hospital Committee Membership
* Person Name
* Committee Name
-Date committee term expires
OK:
Relationships: Relationships A relationship is the abstraction of a set of associations that hold systematically between different kinds of real world things
Patient OCCUPIES bed
Library CONTAINS books
Specimen IS ASSAYED by Lab Method
Most relationships may be stated in the inverse also:
Library LENDS book
Book IS LENT BY Library
Relationship Types: Relationship Types State Governor One-to-One: has governs
Many-to-Many Author writes Book is written by One-to-Many Dog Owner owns Dog is owned by
Modeling Many-to-Many Relationships: Modeling Many-to-Many Relationships DRUG MANUFACTURER
* manufacturer name
- other attributes
DRUG
*generic name
- other attributes LICENSE
* manufacturer name
* generic name
- date licensed
Overall System Design Process: Overall System Design Process Build the Entity-Relationship diagram for all defined objects (tables), [including an Object Specification Document]
[Create a State Transition Model which describes changes to objects based on events or transactions]
[Create a Data Flow diagram which models the information elements which cause State Transitions] [Recommended for multi-programmer projects]
Exercise: Devise a Relational Model for MEDLINE citations: Exercise: Devise a Relational Model for MEDLINE citations
Slide24: UI - 90134185
AU - Greenes RA ; Shortliffe EH
TI - Medical Informatics. An Emerging academic
discipline and institutional priority
MH - Hospital Information Systems; Career Choice;
Medical Informatics/EDUCATION/*TRENDS
PT - JOURNAL ARTICLE; REVIEW; TUTORIAL
EM - 9005
AB - Information management constitutes a major activity
of the health care profession. Currently a number
of forces are focusing attention on this function...
AD - Department of Radiology, Brigham and Women’s Hosp.,
Boston, MA 02115
SO - JAMA 1990 Feb 23; 263(8):1114-20
Sample MEDLINE citation
The “Bottom Line” in Database Design: The “Bottom Line” in Database Design The Data Model is the most critical aspect of system design and function
Data Models should reflect real world objects and their relationships to ensure durability
A correct Data Model subserves and outlasts applications, including many not anticipated at system start-up
Questions?: Questions?