towards the construction of item banks for moodle-based

Presentation Description

Pacific Rim Objective Measurement Symposium (PROMS) 2010 KL

Presentation Transcript

Towards the construction of item banks for Moodle-based in-house computer adaptive English tests:

Tetsuo KIMURA & Keizo NAGAOKA
Graduate School of Human Science, Waseda University
PROMS 2010 KL

Outline:

Background of the Study
Types of Items Used in the Study
Misfit Analysis: Kimura (2009a)
Fixed Moodle-based Placement Test Validity: Kimura (2009b)
Misfit Analysis Revision to Salvage Items
Anchoring & Equating Items
Blueprint of Item Bank for CAT

Background of the study:

High diversity in the English ability of incoming students
Spread of the learning management system (Moodle) to schools
Need to stream students into levels
Computer-based test at hand
Placement test on Moodle
Construction of item banks on Moodle to share among schools
In-house computer adaptive tests on Moodle

Background of the study:

“a self-created placement test using open source software could, over several years of development, prove equal or superior to generic commercial products in reliability for closed population placement testing” (Hinkelman & Grose, 2004, p. 974).

Types of items used in the study:

Three types of multiple-choice questions:
Vocabulary and grammar (Vgm)
Listening comprehension with dialogue (Dlg)
Listening comprehension with monologue (Mlg)

All the items were adopted from the Eiken Test, Grade Pre-1 to Grade 3, with the permission of the Society for Testing English Proficiency (STEP).

Slide 6:

Vocabulary & grammar (Vgm): STEP Grade 3, 2008 Summer

Slide 7:

Listening with dialogue (Dlg): STEP Grade 3, 2008 Summer

Slide 8:

Listening with monologue (Mlg): STEP Grade 3, 2008 Summer

Misfit analysis 1: Kimura (2009a):

Before misfit elimination:

       Person  Item  Pre-1st  2nd  Pre-2nd  3rd
Vgm    222     80    25       20   20       15
Dlg    157     47    12       15   10       10
Mlg    119     35    ---      15   10       10

After misfit elimination:

       Person  Item  Pre-1st  2nd  Pre-2nd  3rd
Vgm    193     32    2        10   13       7
Dlg    142     13    0        7    2        4
Mlg    112     19    ---      7    5        7

Misfit Item: P.BIS < 0.25 or t-value (ZSTD) > 1.96
Misfit Person: Z_L-value < -1.96

Fixed Moodle-based Placement Test Validity:

r = .76 (n = 55)
r = .90 (n = 13)

Placement Test (Kimura, 2009b) items: Vgm 32, Dlg 13, Mlg 19

Misfit Analysis Revision to Salvage Items (1):

Original criteria (Kimura, 2009a):
Misfit Item: P.BIS < 0.25 or t-value (ZSTD) > 1.96
Misfit Person: Z_L-value < -1.96

Revised criteria:
Misfit Item & Person: MSQ > 1.3 and t-value (ZSTD) > 1.96
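The two item rules on this slide can be compared side by side with a minimal sketch. The `FitStats` container and the sample values are hypothetical and only illustrate how an item with low discrimination but acceptable fit would be treated under each rule.

```python
from dataclasses import dataclass


@dataclass
class FitStats:
    """Fit statistics for one item (hypothetical container for illustration)."""
    pbis: float  # point-biserial correlation (P.BIS)
    msq: float   # mean-square fit statistic (MSQ)
    zstd: float  # standardized fit statistic, the t-value (ZSTD)


def misfit_original(s: FitStats) -> bool:
    # Kimura (2009a) item rule: P.BIS < 0.25 OR ZSTD > 1.96
    return s.pbis < 0.25 or s.zstd > 1.96


def misfit_revised(s: FitStats) -> bool:
    # Revision (1) rule: MSQ > 1.3 AND ZSTD > 1.96
    return s.msq > 1.3 and s.zstd > 1.96


# Illustrative values only: low point-biserial, but fit statistics in range.
item = FitStats(pbis=0.20, msq=1.05, zstd=0.8)
print(misfit_original(item), misfit_revised(item))
```

Under these assumed values the original rule flags the item and the revised rule keeps it, which is the kind of item the revision is meant to salvage.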

Misfit Analysis Revision to Salvage Items (1):

Before misfit elimination:

       Person  Item  Pre-1st  2nd  Pre-2nd  3rd
Vgm    222     80    25       20   20       15
Dlg    157     47    12       15   10       10
Mlg    119     35    ---      15   10       10

After misfit elimination:

       Person  Item  Pre-1st  2nd  Pre-2nd  3rd
Vgm    205     71    18       19   20       14
Dlg    147     46    11       15   10       10
Mlg    116     35    ---      15   10       10

Misfit Item & Person: MSQ > 1.3 and t-value (ZSTD) > 1.96

Misfit Analysis Revision to Salvage Items (2):

Cut off responses with low probability (< 12%) of success (lucky guess): B_n - D_j < -2, CUTLO=-2
Cut off responses with high probability (> 95%) of success (careless miss): B_n - D_j > 3, CUTHI=3
THEN
Misfit Item & Person: MSQ > 1.3 and t-value (ZSTD) > 1.96
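The 12% and 95% figures follow from the Rasch model, where the probability of success depends only on the difference between person ability B_n and item difficulty D_j in logits. The sketch below is illustrative: the function names are hypothetical, and the keep/cut boundaries simply mirror the inequalities on the slide.

```python
import math

CUTLO = -2.0  # cut responses where B_n - D_j < CUTLO
CUTHI = 3.0   # cut responses where B_n - D_j > CUTHI


def rasch_p(b: float, d: float) -> float:
    """Rasch probability of success for ability b and difficulty d (logits)."""
    return 1.0 / (1.0 + math.exp(-(b - d)))


def keep_response(b: float, d: float) -> bool:
    """True if the response is neither a likely lucky guess nor a likely careless miss."""
    return CUTLO <= b - d <= CUTHI


print(round(rasch_p(-2.0, 0.0), 3))  # probability at B_n - D_j = -2
print(round(rasch_p(3.0, 0.0), 3))   # probability at B_n - D_j = 3
```

At a difference of -2 logits the success probability is about 0.119, and at +3 logits it is about 0.953, matching the rounded thresholds quoted above.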

Misfit Analysis Revision to Salvage Items (2):

Before misfit elimination:

       Person  Item  Pre-1st  2nd  Pre-2nd  3rd
Vgm    222     80    25       20   20       15
Dlg    157     47    12       15   10       10
Mlg    119     35    ---      15   10       10

After misfit elimination:

       Person  Item  Pre-1st  2nd  Pre-2nd  3rd
Vgm    207     79    24       20   20       15
Dlg    157     47    12       15   10       10
Mlg    112     35    ---      15   10       10

Misfit Item & Person: MSQ > 1.3 and t-value (ZSTD) > 1.96, with CUTLO=-2 and CUTHI=3

Misfit Analysis Revision to Salvage Items:

Item loss in the three analyses:

            Kimura (2009a)  Revision (1)  Revision (2)
Vgm (80)    47              9             1
Dlg (47)    34              1             0
Mlg (35)    16              0             0

Misfit Analysis Revision to Salvage Items:

Item difficulty correlation between the three analyses (32 items in Vgm):

                  Revision (1)  Revision (2)
Kimura (2009a)    0.998         0.996
Revision (1)                    0.997

Misfit Analysis Revision to Salvage Items:

Item difficulty distributions across the three analyses (32 items in Vgm):

       Kimura 2009a  Revised 1  Revised 2
AVG    -0.48         -0.23      -0.36
SD     0.94          0.92       0.91
MAX    1.61          1.91       1.75
MIN    -2.15         -1.80      -1.82

Misfit Analysis Revision to Salvage Items:

Item difficulty correlation between the two analyses (71 items in Vgm):

       Revised 1  Revised 2
AVG    -0.24      -0.35
SD     1.38       1.36
MAX    2.73       2.77
MIN    -3.28      -2.84

Misfit Analysis Revision to Salvage Items:

Kimura (2009a)
  Misfit Item: P.BIS < 0.25 or t-value (ZSTD) > 1.96
  Misfit Person: Z_L-value < -1.96
Revision (1)
  Misfit Item & Person: MSQ > 1.3 and t-value (ZSTD) > 1.96
Revision (2)
  Misfit Item & Person: CUTLO=-2, CUTHI=3, then MSQ > 1.3 and t-value (ZSTD) > 1.96

Which analysis is the best?

Anchoring & Equating Items:

Number of items for the new tests:

                        Test A  Test B  Test C  Test D
Calibrated items  Vgm   16      16      16      16
                  Dlg   7       6       6       7
                  Mlg   9       10      10      9
New items         Vgm   16      16      16      16
                  Dlg   9       10      10      9
                  Mlg   7       6       6       7
Total             Vgm   32      32      32      32
                  Dlg   16      16      16      16
                  Mlg   16      16      16      16
Total                   64      64      64      64
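The totals in this table can be checked with a few lines of arithmetic. The sketch below uses only the counts from the slide; the variable names are illustrative.

```python
# Item counts per test (A, B, C, D), taken from the table on the slide.
calibrated = {"Vgm": [16, 16, 16, 16], "Dlg": [7, 6, 6, 7], "Mlg": [9, 10, 10, 9]}
new = {"Vgm": [16, 16, 16, 16], "Dlg": [9, 10, 10, 9], "Mlg": [7, 6, 6, 7]}
tests = ["A", "B", "C", "D"]

# Per-type totals: calibrated plus new items in each test.
totals = {
    kind: [c + n for c, n in zip(calibrated[kind], new[kind])]
    for kind in calibrated
}

# Overall test length: sum over the three item types.
per_test = [sum(totals[kind][i] for kind in totals) for i in range(len(tests))]

for name, length in zip(tests, per_test):
    print(f"Test {name}: {length} items")
```

Each test comes out at 32 Vgm, 16 Dlg and 16 Mlg items, 64 in total, so the four forms are the same length even though the split between calibrated and new listening items varies.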

Anchoring & Equating Items:

[Diagram: persons (rows) by items (columns) for Tests A, B, C and D. Each test consists of its own block of new items plus blocks of anchor items.]

Anchoring & Equating Items:

[Figures: Vgm (93 items), N=1024; Mlg (44 items), N=1067]

Anchoring & Equating Items:

[Figure: Dlg (47 items), N=1090]

Something wrong?
Too many low extreme D/B
Some anchored (fixed) items were eliminated as a result of misfit analysis
DIF in persons? But same groups as the other two tests
Unstable anchors?

WARNING: DATA ARE AMBIGUOUSLY CONNECTED INTO 3 SUBSETS. MEASURES MAY NOT BE COMPARABLE ACROSS SUBSETS
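One way to picture what a subset warning of this kind means is to ask whether the test forms are still linked through shared items once some anchors have been dropped. The sketch below is a simplified design-level check using union-find, not a reproduction of the analysis software's own subset detection, which also takes the response data into account. The item IDs are hypothetical.

```python
def count_subsets(tests: dict[str, set[str]]) -> int:
    """Count groups of tests linked, directly or indirectly, by shared items."""
    parent = {name: name for name in tests}

    def find(x: str) -> str:
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    owner: dict[str, str] = {}  # item id -> first test seen containing it
    for name, items in tests.items():
        for item in items:
            if item in owner:
                parent[find(name)] = find(owner[item])  # merge the two groups
            else:
                owner[item] = name
    return len({find(name) for name in tests})


# Hypothetical forms: x1..x3 are anchors, the rest are new items.
linked = {"A": {"a1", "x1"}, "B": {"x1", "b1", "x2"},
          "C": {"x2", "c1", "x3"}, "D": {"x3", "d1"}}
broken = {"A": {"a1", "x1"}, "B": {"x1", "b1"},
          "C": {"c1", "x3"}, "D": {"x3", "d1"}}  # anchor x2 eliminated

print(count_subsets(linked), count_subsets(broken))
```

In the hypothetical `broken` design, removing the single anchor shared by forms B and C splits the four forms into two groups whose measures can no longer be placed on a common scale through shared items alone.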

Anchoring & Equating Items:

Before misfit elimination:

       Person  Item  Fixed  New
Vgm    1070    96    32     64
Dlg    1145    51    13     38
Mlg    1119    45    19     26

After misfit elimination:

       Person  Item  Fixed  New  I-loss    P-loss
Vgm    1024    93    32     61   3 (3.1%)  46 (4.3%)
Dlg    1090    47    9      38   4 (7.8%)  55 (4.8%)
Mlg    1067    44    19     25   1 (2.2%)  52 (4.6%)

Misfit Item & Person: MSQ > 1.3 and t-value (ZSTD) > 1.96, with CUTLO=-2 and CUTHI=3

Percentages shown with the table (Vgm, Dlg, Mlg): 34.4%, 19.1%, 45.5%
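The I-loss and P-loss columns are the number of items and persons removed, expressed as a share of the count before elimination. A short sketch with the slide's figures (variable and function names are illustrative):

```python
# (persons, items) before and after misfit elimination, from the slide.
before = {"Vgm": (1070, 96), "Dlg": (1145, 51), "Mlg": (1119, 45)}
after = {"Vgm": (1024, 93), "Dlg": (1090, 47), "Mlg": (1067, 44)}


def loss(n_before: int, n_after: int) -> tuple[int, float]:
    """Return the count lost and the loss as a percentage of the starting count."""
    lost = n_before - n_after
    return lost, round(100 * lost / n_before, 1)


results = {}
for kind in before:
    p_loss = loss(before[kind][0], after[kind][0])
    i_loss = loss(before[kind][1], after[kind][1])
    results[kind] = {"I-loss": i_loss, "P-loss": p_loss}
    print(kind, results[kind])
```

The computed values agree with the table: item loss is largest for Dlg (4 of 51 items), while person loss stays between 4% and 5% for all three item types.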

Misfit analysis 1: Kimura (2009a):

Elimination of misfit and change of reliability and mean
[Figure panels: Vgm, Dlg, Mlg]

Anchoring & Equating Items:

[Figure: Dlg (47 items), N=1090]

Something wrong?
Too many low extreme D/B
Some anchored (fixed) items were eliminated as a result of misfit analysis
DIF in persons? But same groups as the other two tests
Unstable anchors?

WARNING: DATA ARE AMBIGUOUSLY CONNECTED INTO 3 SUBSETS. MEASURES MAY NOT BE COMPARABLE ACROSS SUBSETS

Recalibrate item difficulty and change anchors

Anchoring & Equating Items:

[Figure: Dlg (47 items), N=1090]

Something wrong?
Too many low extreme D/B
Some anchored (fixed) items were eliminated as a result of misfit analysis
DIF in persons? But same groups as the other two tests
Unstable anchors?

I got many useful suggestions from the floor. Among them, Dr. Jim Sick's suggestion of miscoding was right. I went back over my data analysis and found some mistakes. Here is the revised graph. It looks much better, though it still has some gaps.

Coding mistake!
[Revised figure: Dlg (49 items), N=1091]
I can move on!

Blueprint of Item Bank for CAT:

Blueprint of Item Bank for CAT

Blueprint of Item Bank for CAT:

P=1340, I=113

Blueprint of Item Bank for CAT:

Descriptor          Level 1  Level 2  Level 3  Level 4  Level 5
Can-Do Statement

Slide 35:

Thank you

Questions, comments & suggestions?
Tetsuo KIMURA (tkimura@akane.waseda.jp)

-------------------

Hinkelman, D., & Grose, T. (2004). Placement testing and audio quiz-making with open source software. Proceedings of CLaSIC 2004, 972-981.
Kimura, T. (2009a). Construction of a Moodle-based placement test and possibility of a Moodle-based computer adaptive test. ARELE 20, 161-169.
Kimura, T. (2009b). Construction and evaluation of an in-house English placement test from a Neural Test Theory perspective. KATE Bulletin 20, 23-34. (Written in Japanese)