“Show Me How to Get Past MCQs: Emerging Opportunities in Measurement”
Carol O’Byrne, PEBC
Karen S. Flint and Jaime Walla, AMP
Drs. Frank Hideg, Paul Townsend, & Mark Christensen, NBCE
Alison Cooper, CAPR
Lila Quero-Munoz, Consultant
Presented at the 2004 CLEAR Annual Conference, September 30 – October 2, Kansas City, Missouri
Goals : Gain an overview of performance assessment
Observe and try out electronic & standardized patient simulations
Consider exam development, implementation and administration issues
Consider validity questions & research needs
Create computer-administered & standardized patient simulations with scoring rubrics
Set passing standards
Part 1 - Presentations : Introduction to performance assessment
Purposes and objectives
Models
Issues, successes and challenges
15-minute presentations
Four models, including their unique aspects with two participatory demonstrations
Developmental and ongoing validity issues and research studies
Part 2 - Break-out Sessions : Identify steps in development and implementation of a new performance assessment and develop a new station
Create a new electronic simulation and set passing standards
Create a new standardized patient simulation and scoring rubrics
Participate in a standard setting exercise using the ‘Competence Standard Setting Method’
and all the while, ask the ‘hard questions’
Performance Assessment - WHY? : To assess important problem-solving, critical-thinking, communication, hands-on and other complex skills that:
Impact clients' safety and welfare if not performed adequately and
Are difficult to assess in a multiple choice question format
HOW? : ‘Pot luck’ direct observation (e.g., medical rounds, clerkships and internships)
Semi-structured assessments (e.g., orals and Patient Management Problems)
Objective Structured Clinical Examinations (OSCEs) (combining standardized client interactions with other formats)
Other standardized simulations (e.g., airline pilots' simulators)
Electronic simulations (e.g., real estate, respiratory care, architecture)
Does it really work? : Links in the Chain of Evidence to Support the Validity of Examination Results:
Job Analysis
Test Specifications
Item Writing
Examination Construction
Standard Setting
Test Administration
Scoring
Reporting Test Results
PEBC Qualifying Examination : Based on national competencies
Two parts:
MCE & OSCE
Must pass both to be eligible for pharmacist licensure in Canada
Offered spring and fall in multiple locations
1400+ candidates/year
$1350 CDN
15-station OSCE
12 client interactions (SP or SHP) + 3 non-client stations
7-minute stations
One expert examiner
Checklist to document performance
Holistic ratings to score exam
Standard Setting
Reports – results and feedback
Competencies Assessed by PEBC’s MCE and OSCE
Comparing PEBC’s OSCE (PS04) and MCE (QS04) Scores
Comparing PEBC’s OSCE and MCE scores
Holistic Rating Scales : COMMUNICATION Skills (1)
Rapport
Organization
Verbal and nonverbal expression
Problem-solving OUTCOME (2)
Information processing
Decision making
Follow-up
Overall PERFORMANCE (3)
Comm & Outcome
Thoroughness (checklist)
Accuracy (misinformation)
Risk
Validity – an ascent from Practice Analysis to Test Results : Job/practice analysis
Who/what contexts?
How?
Test specifications & sampling
Which competencies?
Which tasks/scenarios?
Other parameters?
Item writing and review
Who and how?
Scoring
Analytic (checklists) &/or holistic (scales)?
Validity – an ascent from Practice Analysis to Test Results : Detect and minimize unwanted variability, e.g.:
Items/tasks – does the mix matter?
Practice effect – how can we avoid it?
Presentation/administration – what is the impact of different SPs, computers, materials/equipment?
Scores – how do we know how accurate and dependable they are? What can we do to improve accuracy?
Set Defensible Pass-fail Standards
How should we do this when different standard setting methods yield different standards?
How do we know if the standard is appropriate?
Report Results
Are they clear? Interpreted correctly?
Are they defensible?
Validity – an ascent from Practice Analysis to Test Results
Validity – flying high : Evidence
Strong links from job analysis to interpretation of test results
Relates to performance in training & other tests
Reliable, generalizable & dependable
Scores
Pass-fail standards & outcomes
Feasible
Large & small scale programs
Economic, human, physical, technological resources
Ongoing Research
Wild Life : Candidate diversity
Language
Training
Format familiarity, e.g., computer skills
Accommodations
Logistics
Technological requirements
Replications (fatigue, attention span)
Security
“Computer-Based Simulations”
Karen S. Flint, Director, Internal Development & Systems Integration
Applied Measurement Professionals, Inc.
Evolution of Simulation Exam Format : AMP’s parent company, NBRC, provided oral exams from 1961 to 1978
Alternative sought due to:
Limited number of candidates that could be tested each administration
Cost to candidates who had to travel to location
Concern about potential oral examiner bias
Evolution of Simulation Exam Format : Printed simulation exam format introduced in 1978 using latent image technology
Latent image format used by NBRC from 1978 to 1999
NBRC decision to convert all exams to computer-based testing
Proprietary software developed by AMP to administer simulation exams in comparable format via computer – introduced in 2000
Both latent image test booklets & computerized format being used
How Simulation Exams Differ from MCQs : Provide accurate assessment of higher-order thinking related to a content area of interest (testing more than just recall)
Challenge test takers beyond the complexity of MCQs
Simulation problems allow test takers to assess their skills against test content drawn from realistic situations or clinical events
Sample relationship between multiple-choice and simulation scores assessing similar content
Simulation Utility : Continuing competency examinations
Self-assessment/practice examinations
High-stakes examinations
Psychometric characteristics comparable to other assessment methodologies
That is, good reliability and validity
Professions Using This Simulation Format : Advanced-Level Respiratory Therapists
Advanced-Level Dietitians
Lighting Design Professionals
Orthotist/Prosthetist Professionals
Health System Case Management Professionals (beginning 2005)
Real Estate Professionals
Candidate fees range from $200 to $525 for full-length certification/licensure simulation exam
Structure of Simulations : Opening Scenario
Information Gathering (IG) Sections
Decision Making (DM) Sections
Single or multiple DM
All choices are weighted (+3 to –3)
Passing scores relate to judgment of content experts on ‘minimal competence’
Simulation Development (Graphic depiction of path through a simulation problem)
IG Section Details : IG section
A section in which test takers choose information that will best help them understand a presenting problem or situation
Facilitative options may receive scores of +3, +2, or +1
Uninformative, wasteful, unnecessarily invasive, or potentially illegal options may receive scores of –1, –2, or –3
Test takers who select undesirable options accumulate negative section points
IG Section Details : IG Section Minimum Pass Level (MPL)
Among all options with positive scores in a section, some should be designated as REQUIRED for minimally competent practice
The sum of points for all REQUIRED options in a section equals MPL
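The IG rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the option names and weights are hypothetical (not taken from any actual exam), and the section score is assumed to be the plain sum of the weights of the selected options.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    """One selectable option in a section (names and weights are hypothetical)."""
    name: str
    weight: int            # +3..+1 facilitative, -1..-3 undesirable
    required: bool = False  # flagged REQUIRED for minimally competent practice


def section_score(options, selected):
    """Sum the weights of the options the test taker selected."""
    chosen = set(selected)
    return sum(o.weight for o in options if o.name in chosen)


def section_mpl(options):
    """MPL = sum of points for all REQUIRED options in the section."""
    return sum(o.weight for o in options if o.required)


# Hypothetical IG section
ig_options = [
    Option("review history", 3, required=True),
    Option("check vital signs", 2, required=True),
    Option("ask about diet", 1),
    Option("order invasive test", -3),
]

score = section_score(
    ig_options, ["review history", "check vital signs", "order invasive test"]
)
mpl = section_mpl(ig_options)
print(score, mpl)
```

In this made-up section the test taker picks both REQUIRED options but also an undesirable one, so the negative points pull the section score below the section MPL.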
DM Section Details : DM section
A section of typically 4-6 options in which the test taker must make a decision about how to handle the presenting situation
Facilitative options may receive scores of +3, +2, or +1
Harmful or potentially illegal options may receive scores of –1, –2, or –3
Test takers who select undesirable options accumulate negative section points and are directed to select another option
DM Section Details : DM Section Minimum Pass Level (MPL)
May contain two correct choices, but one must be designated as REQUIRED for minimally competent practice
The REQUIRED option point value in the section equals MPL
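A rough Python sketch of a single-choice DM section follows. It assumes, beyond what the slides state, that a negative-weight choice sends the test taker back to choose again and that the first positive-weight choice ends the section; the option names, weights and REQUIRED designation are hypothetical.

```python
def dm_section_score(weights, choices):
    """Accumulate points over successive choices in one DM section.

    Assumption: a negative-weight choice directs the test taker to select
    another option, and the first positive-weight choice ends the section.
    """
    total = 0
    for choice in choices:
        total += weights[choice]
        if weights[choice] > 0:
            break
    return total


# Hypothetical DM section: option name -> weight
dm_weights = {"treat now": 3, "refer": 2, "wait and see": -1, "discharge": -3}
dm_required = "refer"                 # hypothetical REQUIRED option
dm_mpl = dm_weights[dm_required]      # section MPL = REQUIRED option's points

first_try = dm_section_score(dm_weights, ["treat now"])
after_error = dm_section_score(dm_weights, ["discharge", "treat now"])
print(first_try, after_error, dm_mpl)
```

Under these assumptions, a test taker who first picks a harmful option and then the best option ends the section below the MPL, because the negative points are retained.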
Minimum Passing Level : DM MPL
The sum of all DM section MPLs
IG MPL
The sum of all IG section MPLs
Overall Simulation Problem MPL
Candidates must achieve MPL in both Information Gathering and Decision Making
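The two-part pass rule can be illustrated as follows. This sketch assumes that "achieve MPL" means the summed score is at least the summed MPL, and that IG and DM totals are compared separately; the function name and the numbers are hypothetical.

```python
def passes_both(ig_scores, ig_mpls, dm_scores, dm_mpls):
    """Return True only if the IG total and the DM total each reach their MPL.

    IG MPL is the sum of all IG section MPLs; DM MPL is the sum of all DM
    section MPLs. A surplus in one area does not offset a deficit in the other.
    """
    ig_ok = sum(ig_scores) >= sum(ig_mpls)
    dm_ok = sum(dm_scores) >= sum(dm_mpls)
    return ig_ok and dm_ok


# Hypothetical section scores and MPLs for one candidate
balanced = passes_both(ig_scores=[5, 4], ig_mpls=[5, 3], dm_scores=[3, 2], dm_mpls=[2, 2])
lopsided = passes_both(ig_scores=[9, 6], ig_mpls=[5, 3], dm_scores=[0, 2], dm_mpls=[2, 2])
print(balanced, lopsided)
```

The second candidate has a high Information Gathering total but still fails, which is the practical effect of requiring MPL in both areas rather than on a combined score.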
Simulation Exam Development : 8 to 10 simulation problems per examination
Each problem assesses a different situation typically encountered on the job
Let’s Attempt A Computerized Simulation Problem!!!
Karen S. Flint, Director, Internal Development & Systems Integration
Applied Measurement Professionals, Inc.
8310 Nieman Road
Lenexa, KS 66214
913.541.0400
(Fax – 913.541.0156)
KFlint@goAMP.com
www.goAMP.com
“Practical Testing”
Dr. Frank Hideg, DC
Dr. Mark Christensen, PhD
Dr. Paul Townsend, DC
NBCE History : The National Board of Chiropractic Examiners was founded in 1963
The first NBCE exams were administered in 1965
Prior to 1965, chiropractors were required to take chiropractic state boards and medical state basic science boards for licensure
NBCE Battery of Pre-licensure Examinations : Part I – Basic Sciences Examinations
Part II – Clinical Sciences Examinations
Part III – Written Clinical Competency
Part IV – Practical Examination for Licensure
Hierarchy of Clinical Skills (pyramid figure, top to bottom) : DO: Practice; SHOW HOW: Part IV; KNOW HOW: Part III; KNOWLEDGE: Parts I & II
NBCE Practical Examination : Content Areas
Diagnostic Imaging
Chiropractic Technique
Chiropractic Case Management
Content Weighting : Chiropractic Case Management (CAM) 67%, Chiropractic Technique (TEC) 17%, Diagnostic Imaging (DIM) 16%
Diagnostic Imaging : 10 four-minute stations
Candidate identifies radiological signs on plain film x-rays
Candidate determines most likely diagnoses
Candidate makes most appropriate initial case management decisions
Chiropractic Technique : 5 five-minute stations
Candidate demonstrates two adjusting techniques per station
Cervical spine
Thoracic spine
Lumbar spine
Sacroiliac articulations
Extremity articulations
Chiropractic Case Management : 10 five-minute patient encounter stations
10 linked post-encounter probe (PEP) stations
Candidate performs focused case histories
Candidate performs focused physical examinations
Candidate evaluates patient clinical database
Candidate makes differential diagnoses
Candidate makes initial case management decisions
Key Features of NBCE Practical Examination : Use of standardized patients
Use of OSCE format and protocols
Case History Stations : Successful candidates use an organized approach while obtaining case history information
Successful candidates communicate effectively with patients
Successful candidates respect patient dignity
Successful candidates elicit adequate historical information
Perform a Focused Case History
Post-Encounter Probe Station
Part IV Candidate Numbers
Part IV State Acceptance
Candidate Qualifications : Candidates must pass all basic science and clinical science examinations before applying
Candidates must be within 6 months of graduation from an accredited chiropractic college
$1,075 examination fee
Contact Information : National Board of Chiropractic Examiners
901 54th Avenue
Greeley, CO 80634
970-356-9100, 970-356-1095
ptownsend@nbce.org
www.nbce.org
“Station Development”
Alison Cooper, Manager of Examination Operations
Canadian Alliance of Physiotherapy Regulators
First Principles : If it’s worth testing, it’s worth testing well
it is possible to test anything badly
this is more expensive
Some things are not worth testing
trivia
infrequently used skills
Overview : Write
Review
Dry run
Approve
Write : Focus of station
SP portrayal - general
Checklist & scoring
Instructions to candidate
Details of SP instructions
Review everything
References
Focus of Station : Each station must have a clear focus
establish the focus in one sentence
take time to get this right
you can’t write a good station without a clear focus
Example: Perform passive range of motion of the arm for a client who has had a stroke.
SP Portrayal - General : Consider SP movement, behaviour
a picture in your head
use real situations to guide you
Not detailed yet
Example: Client is 55 years old, is disoriented, and has no movement in the left arm or leg.
Checklist & Scoring : What is important to capture
Consider the level of the candidates
Group items logically
Assign scores to items
Scoring scales
Checklist Example : (each item is followed by its score)
Explains purpose of interaction: 1
Corrects client’s position: 2
Performs passive ROM of scapula: 1
Performs passive ROM of shoulder: 1
Performs passive ROM of elbow: 1
Performs passive ROM of wrist: 1
Performs passive ROM of hand & fingers: 1
Performs passive ROM of thumb: 1
Uses proper body mechanics: 3
Uses proper handling: 3
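As a small illustration of how such a checklist might be totalled, the sketch below uses the items and scores from the example. The all-or-nothing credit per item and the function name are assumptions; an actual station could use scoring scales instead.

```python
# Checklist items and scores as listed in the example above
CHECKLIST = [
    ("Explains purpose of interaction", 1),
    ("Corrects client's position", 2),
    ("Performs passive ROM of scapula", 1),
    ("Performs passive ROM of shoulder", 1),
    ("Performs passive ROM of elbow", 1),
    ("Performs passive ROM of wrist", 1),
    ("Performs passive ROM of hand & fingers", 1),
    ("Performs passive ROM of thumb", 1),
    ("Uses proper body mechanics", 3),
    ("Uses proper handling", 3),
]


def station_score(checklist, observed):
    """Sum the scores of the items the examiner marked as performed.

    Assumption: each item earns full credit or none.
    """
    done = set(observed)
    return sum(points for item, points in checklist if item in done)


max_score = sum(points for _, points in CHECKLIST)

# Hypothetical candidate who omitted the thumb and used poor body mechanics
observed = [
    item for item, _ in CHECKLIST
    if item not in ("Performs passive ROM of thumb", "Uses proper body mechanics")
]
print(station_score(CHECKLIST, observed), "out of", max_score)
```

Note how the two 3-point items (body mechanics and handling) account for 6 of the 15 available points, so the weighting signals what the station writers consider most important.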
Instructions to Candidate : Information the candidate needs
age and sex of client
pertinent information and assumptions
The task for the candidate
exactly what they are to do and not do
Example : Eric Martin
55 years old
This client had a right middle cerebral artery haemorrhage two (2) weeks ago, resulting in a left-sided hemiplegia.
The client presents with confusion and left-sided flaccidity. His cardiovascular status is stable.
Perform passive range of motion on the client’s left upper extremity.
Perform only one (1) repetition of each movement.
Assume that you have the client’s consent.
Details of SP Instructions : History, onset, changes
Initial position, movements, demeanor, must say/ask
anticipate strong AND weak candidates
Cover the checklist and candidate instructions
SP prompts
SP Instructions... : Use plain language
Include
what to wear/not wear
features of the SP (height, scars)
Diagrams are often helpful
Example : Presenting complaint
Initial position, general mobility, affect
Comments you must make
Medical, social history
Medications
Activities and areas affected
Sensation
Pain
Muscle quality
Responses to candidate
Emotions
Check Everything : Go back and check
does it make sense?
is there still a clear focus?
is anything missing?
Edit/revise as needed
add notes to examiner for clarification
Check for plain language
References : Use references you expect candidates to know
Umphred, 2nd edition, page 681
Next Steps : Review by others
Dry run
Approve for use
Thank you : Canadian Alliance of Physiotherapy Regulators
1243 Islington Ave., Suite 501
Toronto, ON, Canada M8X 1Y9
(W)416-234-8800, (F)416-234-8820
acooper@alliancept.org
www.alliancept.org
“OSCE Research: The Key to a Successful Implementation”
Lila J. Quero Muñoz, PhD, Consultant
Prior to the OSCE: CPBC and PEBC : Need for assessing communication, counseling, and interpersonal skills to provide pharmaceutical care to patients
PEBC MC examination was not assessing the full scope of pharmacy practice as profiled by NAPRA (National Association of Pharmacy Regulatory Authorities of Canada)
Generalizability: Data Analyses : Psychometrically, OSCEs are complex phenomena, producing scores with potential errors from multiple sources, including:
Examiners (pharmacists and non-pharmacists)
Cases (context, complexity, # of stations)
Scoring methods (global vs. checklists)
Standard setting
Differential grading practices
Research Question # 1 : How many examiners are required to obtain consistent and dependable candidate scores?
Results #1-1998 : 1 examiner per case yielded consistency similar to 2 (G=.82, .81, D=.81, .79), indicating that examiners agreed highly on their scores
Examiners contributed little to the scoring errors of candidates’ performance
1 vs. 2 Global -1999
Research Question # 2 : How many cases are required to maintain consistency, validity and generalizability of scores?
Adequate and representative sampling of professional practice is necessary to capture a candidate’s abilities.
Multiple observations of abilities yield more consistent and content valid inferences.
Logistical constraints restrict the number of cases that are timely and economically feasible to administer within one OSCE examination.
Results # 2-1998 : Compared with 5 or 10 cases, 15 cases dramatically reduced the candidate score error due to sampling variability of the cases and improved the consistency of scores from G=.60 to .81
15 cases reduced the case-by-rater interaction variance, an indication that raters agreed on their scores across cases
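To make the effect of case sampling concrete, here is a minimal sketch of the textbook generalizability coefficient for a simple candidates-by-cases design, G = var_person / (var_person + var_residual / n_cases). It is not the analysis reported above, which also involved raters; the variance components are hypothetical values chosen only so that 5 cases give G = .60.

```python
def g_coefficient(var_person, var_residual, n_cases):
    """Relative G coefficient for a one-facet (candidates x cases) design.

    var_person:   variance due to true differences among candidates
    var_residual: candidate-by-case interaction plus residual error variance
    n_cases:      number of cases averaged over in the exam
    """
    return var_person / (var_person + var_residual / n_cases)


# Hypothetical variance components (not estimates from the PEBC data)
VAR_PERSON = 3.0
VAR_RESIDUAL = 10.0

for n in (5, 10, 15):
    print(n, "cases: G =", round(g_coefficient(VAR_PERSON, VAR_RESIDUAL, n), 2))
```

With these illustrative components, G rises from .60 at 5 cases to about .82 at 15, showing how averaging over more cases shrinks the error term even when no individual case becomes more reliable.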
Results # 2-1998 : Candidates’ scores varied mostly due to their differential performance across cases.
Sampling of the cases might affect the candidates’ performance on an OSCE.
We suggest, however, that differential performance across cases might be due to candidates’ differing levels of skill across the pharmacy competencies assessed
Profile of Sources of Errors in %-1998
Research Question # 3 : How do different scoring methods such as checklists or global grading affect candidates’ scores?
Results # 3-1998 : Low correlations between checklist and global scores suggest the two methods might not be interchangeable
If used in isolation they would yield different end results, particularly for borderline candidates
Global grading yields higher mean scores than checklist grading (p values .81 and .59)
Global vs. Checklist-1999
Research Question # 4 : What is the validity and defensibility of standard-setting procedures and pass/fail decisions?
Results # 4-1998 : SMEs agreed highly on the minimum standard necessary for safe pharmacy practice for the borderline qualified pharmacists
On different occasions, SMEs had similar standards for entry-to-practice for the core cases
Standards varied little between 26 & 20 cases and were consistent enough with 15 cases (G=.74, .74, .71)
Results # 4-2003
Research Question # 5 : Are there differential grading practices among Canadian provinces?
Are candidates’ pass/fail decisions affected by provincial differences in scoring practices?
Results # 5-Videos 2003 : Variability in scores between sites is due mostly to true score variance
Differences between exam sites are in magnitude of scores but not in pass/fail status
Differences between assessors are mostly in magnitude of scores but not in pass/fail status
Pass/fail decisions did not vary between sites and assessors
There is more variance between assessors than between exam sites
Results # 5-2003
Results # 5-2003
Results # 5-2003
Conclusions 1998-2004 : Development of cases should follow templates, guidelines and a detailed blueprint
Selection of cases must follow a detailed blueprint to mirror OSCE forms between exam administrations to control for differences in cases such as complexity and content
Conclusions 1998-2004 : Multiple sources of errors in OSCEs force us to do more extensive and non-traditional research than for MC exams
OSCEs require continuous vigilance to assess the impacts of the many sources of errors
OSCE research must be planned and implemented beyond exam administrations
Conclusions 1998-2004 : OSCE infrastructure must support both design research and exam administration research
Successful implementation and continuous improvement of OSCEs go hand in hand with research
More collaborative efforts among OSCE users are needed to build on each other’s success and avoid pitfalls
Conclusions 1998-2004 : Although OSCE research is costly, it is a deterrent to litigation and wasted exam administration resources
Similar conclusions may apply to other performance assessments
Carol O’Byrne, BSP, OSCE Manager
John Pugsley, PharmD, Registrar, PEBC
obyrnec@pebc.ca
416-979-2431, 1-416-260-5013 Fax
Lila J. Quero-Muñoz, PhD, Consultant
787-431-9288, 1-888-663-6796 Fax
lila@insidetraveling.com