Moodle UCAT: a CAT module for Moodle based on the Rasch model

Presentation Description

Presented at the 5th International Rasch Measurement Conference at the University of Western Australia, January 2012

Presentation Transcript

Moodle UCAT: a computer-adaptive test module for Moodle based on the Rasch model:

Tetsuo Kimura (Waseda University / Niigata Seiryo University)
Akio Ohnishi (VERSION2)
Keizo Nagaoka (Waseda University)
The 5th International Conference on Probabilistic Models for Measurement, University of Western Australia

Motto:

CAT for Everyone & Happy CAT

Outline:

Background & Previous Studies
CAT & UCAT
Moodle UCAT Module
Demonstration of Moodle UCAT Module
Case Study

Background & Previous Studies:

LMSs began to spread to educational institutions.
In-house CBT/CAT systems have been implemented: Hinkelman & Grose (2004); Akiyama (2008); Kimura (2009); Koyama & Akiyama (2010).
Item banks for CAT have been constructed: Kimura & Nagaoka (2010a); Kimura & Nagaoka (2010b); Kimura & Nagaoka (2011b).

Dichotomous CAT Test Administration:

[Figure: dichotomous CAT test administration]

CAT is greedy! :

Prerequisite for implementing CAT: a large number of calibrated items
Pretesting items
Analyzing items
Eliminating misfitting items
More pretesting
Equating

CAT likes a large pool:

Prerequisites for implementing CAT:
Well-balanced content
Wide variety of difficulty
Sufficient unidimensionality
The bigger the better

Major problems in CAT:

Pretesting new items for the item bank
Adding new test items to the item bank
Recalibrating the bank
UCAT addressed these problems.

UCAT: CAT with Item Bank Recalibration (Linacre, 1987):

The difficulty level of new items can be guessed intelligently without degrading the resulting ability estimates. The degradation of measures by poor item calibration is further diminished by the self-correcting nature of CAT. Poor calibration of a few items is not deleterious to Rasch measurement (Wright and Douglas, 1975; Yao, 1991).

UCAT: CAT with Item Bank Recalibration (Linacre, 1987):

Existing items can be recalibrated with minimal impact on previous test-taker measures. This is especially important when the item difficulty calibrations are derived from non-CAT sources, or when there is concern that part of the item bank has become public knowledge.

UCAT: CAT with Item Bank Recalibration (Linacre, 1987):

The CAT test developer or the CAT administrator can choose to have the difficulties of the items in the bank recalibrated at any point based on the responses of those to whom the items have been administered so far. As part of the recalibration procedure, all test-takers are remeasured based on their original responses and the revised item difficulties.

UCAT: CAT with Item Bank Recalibration (Linacre, 1987):

The final revised item calibrations are computed in such a way as to maintain unchanged the mean of the ability estimates of those who have already taken the test. This minimizes the effect of the recalibration on any previously reported test results.
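
One way to keep the mean ability estimate unchanged after a recalibration can be sketched as follows. This is an illustration only: the slides do not say how UCAT computes the adjustment, and the function name and sample numbers are hypothetical. The sketch relies on the fact that the Rasch model depends only on the difference between ability and difficulty, so the same constant can be added to every ability and every difficulty without changing any response probability.

```python
from statistics import mean

def anchor_to_previous_mean(old_abilities, new_abilities, new_difficulties):
    """Shift revised estimates so the mean ability matches the old mean.

    Hypothetical helper: adds one constant to all remeasured abilities and
    all revised difficulties, which leaves every B - D difference intact.
    """
    shift = mean(old_abilities) - mean(new_abilities)
    abilities = [b + shift for b in new_abilities]
    difficulties = [d + shift for d in new_difficulties]
    return abilities, difficulties

# Made-up numbers, in logits.
old_b = [-0.5, 0.0, 1.1]        # abilities reported before recalibration
new_b = [-0.2, 0.4, 1.3]        # abilities remeasured with revised difficulties
new_d = [-1.0, 0.3, 0.9, 1.6]   # revised item difficulties
adj_b, adj_d = anchor_to_previous_mean(old_b, new_b, new_d)
print(round(mean(adj_b), 3), round(mean(old_b), 3))
```

Individual measures can still move; only their mean is held fixed.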

Moodle UCAT Module beta ver.:

Development Status

CAT setting window
  Ending conditions
  Logit to unit conversion (Unit = Logit × 10 + 100)
  Logit bias
CAT administration window
  Set item difficulty individually or category by category
  Set students' ability individually or as a whole
  Administer CAT and provide results individually
  Retrieve CAT processes and results
  Recalibration of item difficulty and ability estimates (under development)
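
The conversion Unit = Logit × 10 + 100 given on this slide can be written as a pair of small helpers. The function names are illustrative and are not taken from the module's source.

```python
def logit_to_unit(logit):
    """Convert a Rasch measure in logits to the reporting unit."""
    return logit * 10 + 100

def unit_to_logit(unit):
    """Inverse conversion, from reporting unit back to logits."""
    return (unit - 100) / 10

print(logit_to_unit(0.0))   # 0 logits sits at the centre of the unit scale
print(logit_to_unit(1.5))
print(unit_to_logit(85))
```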

CAT Algorithm: Initial Ability Estimation:

UCAT:
  Lower Limit (LL) = AVG(D) - (0.5 + 0.5*RND)
  Upper Limit (UL) = LL + 1
  B_0 = AVG(D) - 0.5*RND
  AVG(D): average item difficulty
  RND: random value between 0 and 1
  B_0: initial ability

Moodle UCAT:
  Same as UCAT, or assign each student's initial ability in the CAT administration window, based on other test results or an intelligent guess, either one by one or for all students at once.
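
The UCAT starting values on this slide can be computed as in the sketch below. The function name and the optional `rng` parameter are illustrative assumptions; only the three formulas come from the slide.

```python
import random

def initial_estimates(difficulties, rng=random):
    """Return (LL, UL, B_0) from the slide's formulas.

    difficulties: item difficulties in logits.
    rng: any object with a random() method returning a value in [0, 1).
    """
    avg_d = sum(difficulties) / len(difficulties)
    rnd = rng.random()
    lower = avg_d - (0.5 + 0.5 * rnd)   # LL = AVG(D) - (0.5 + 0.5*RND)
    upper = lower + 1.0                 # UL = LL + 1
    b0 = avg_d - 0.5 * rnd              # B_0 = AVG(D) - 0.5*RND
    return lower, upper, b0

lower, upper, b0 = initial_estimates([-1.0, 0.0, 1.0], random.Random(0))
print(lower, upper, b0)
```

With these formulas B_0 always falls at the midpoint of LL and UL, so the first item is drawn from a one-logit window centred on the starting ability.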

CAT Algorithm: Ability (B) Estimation:

UCAT / Moodle UCAT
[Equation not preserved in the transcript.]
Symbols defined on the slide: the number of successes; the probability of success of a student of ability B_m on the i-th administered item of difficulty D_i.
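
The formula on this slide did not survive in the transcript. As a rough illustration, a standard Newton-Raphson step for the Rasch ability estimate uses the same quantities the slide names (the number of successes and the success probabilities at the current ability B_m). It is an assumption that this matches the slide; the function names are hypothetical.

```python
import math

def rasch_p(ability, difficulty):
    """Rasch probability of success for a given ability and item difficulty."""
    return 1.0 / (1.0 + math.exp(difficulty - ability))

def update_ability(b_m, successes, difficulties):
    """One Newton-Raphson step: B_(m+1) = B_m + (R - sum P) / sum P(1 - P).

    successes: number of correct answers so far (R).
    difficulties: difficulties of the items administered so far.
    """
    probs = [rasch_p(b_m, d) for d in difficulties]
    expected = sum(probs)
    variance = sum(p * (1.0 - p) for p in probs)
    return b_m + (successes - expected) / variance

# Four items at difficulty 0, three answered correctly, starting from B = 0.
print(update_ability(0.0, 3, [0.0, 0.0, 0.0, 0.0]))
```

A single step is shown; repeating it until the change is negligible gives the maximum-likelihood estimate, which does not exist when all answers are correct or all are wrong.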

CAT Algorithm: Standard Error (SE) Estimation:

UCAT / Moodle UCAT
[Equation not preserved in the transcript.]
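
The standard error formula is also missing from the transcript. The usual Rasch expression, the reciprocal square root of the test information, is sketched here on the assumption that it is what the slide showed; the function names are hypothetical.

```python
import math

def rasch_p(ability, difficulty):
    """Rasch probability of success for a given ability and item difficulty."""
    return 1.0 / (1.0 + math.exp(difficulty - ability))

def standard_error(ability, difficulties):
    """SE = 1 / sqrt(sum of P(1 - P)) over the administered items."""
    info = 0.0
    for d in difficulties:
        p = rasch_p(ability, d)
        info += p * (1.0 - p)
    return 1.0 / math.sqrt(info)

# Sixteen perfectly targeted items (difficulty equal to ability).
print(standard_error(0.0, [0.0] * 16))
```

Each perfectly targeted item contributes 0.25 to the information, so sixteen such items give an SE of 0.5 logits; items further from the ability contribute less and the SE is larger.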

CAT Algorithm: Item Selection :

UCAT / Moodle UCAT
The next item is selected at random from the items whose difficulty lies between LL and UL.
LL: ability estimate if the next (m-th) answer is wrong
UL: ability estimate if the next (m-th) answer is correct
If no item is found between LL and UL, the closest item is used.
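
The selection rule on this slide can be sketched as below. The data layout (a dictionary from item id to difficulty) and the reading of "closest" as the smallest distance to the LL-UL interval are assumptions made for illustration.

```python
import random

def select_next_item(unused, lower, upper, rng=random):
    """Pick the next item id from `unused` (item id -> difficulty in logits).

    Chooses at random among items whose difficulty lies in [lower, upper];
    if there is none, falls back to the item nearest to that interval.
    """
    in_range = [item for item, d in unused.items() if lower <= d <= upper]
    if in_range:
        return rng.choice(in_range)

    def distance(item):
        d = unused[item]
        return max(lower - d, d - upper, 0.0)

    return min(unused, key=distance)

bank = {"q1": -2.0, "q2": 0.2, "q3": 1.5}
print(select_next_item(bank, 0.0, 1.0))                       # only q2 is in range
print(select_next_item({"q1": -2.0, "q3": 1.5}, 0.0, 1.0))    # nearest item is used
```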

CAT Algorithm: Ending Condition:

UCAT / Moodle UCAT
Prescribed number of items
Prescribed SE
Both number of items and SE
All items
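
The four ending conditions can be combined in one check, sketched below. The slide does not say how "both number of items and SE" is evaluated; this sketch assumes the test stops at whichever limit is reached first. The function and parameter names are hypothetical.

```python
def should_stop(n_given, se, bank_size, max_items=None, target_se=None):
    """Return True when the CAT should end.

    n_given: number of items administered so far.
    se: current standard error in logits, or None if not yet available.
    bank_size: number of items in the bank ("all items" condition).
    max_items / target_se: optional limits; pass both for the combined rule
    (assumed here to stop at whichever is reached first).
    """
    if n_given >= bank_size:
        return True
    hit_count = max_items is not None and n_given >= max_items
    hit_se = target_se is not None and se is not None and se <= target_se
    return hit_count or hit_se

print(should_stop(16, 0.53, 258, max_items=16))
print(should_stop(10, 0.60, 258, max_items=16, target_se=0.5))
```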

CAT Algorithm: Item Selection (logit bias) :

Moodle UCAT
LL and UL can be adjusted by entering a logit value in the Logit bias box in the CAT setting window.
A positive logit value decreases the chance of a correct answer.
A negative logit value increases the chance of a correct answer.
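
A minimal sketch of the adjustment follows, assuming the bias is simply added to both limits of the selection window (the slide does not show the module's code). The function names are illustrative.

```python
import math

def apply_logit_bias(lower, upper, bias):
    """Shift the item selection window by `bias` logits."""
    return lower + bias, upper + bias

def p_success(ability, difficulty):
    """Rasch probability of a correct answer."""
    return 1.0 / (1.0 + math.exp(difficulty - ability))

ability = 0.0
lower, upper = apply_logit_bias(-0.5, 0.5, -1.0)   # window moves to easier items
centre = (lower + upper) / 2
print(lower, upper, round(p_success(ability, centre), 2))
```

With a bias of -1.0 the window is centred one logit below the student's ability, so an item at the centre is answered correctly about 73% of the time instead of 50%.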

Students' reaction to CAT:

"It was so difficult."
"More difficult than the entrance exam."
"I could not answer with confidence."
"I dislike English more than before."
"I may not be able to reach the passing grade."
"Worst score I've ever got."
Students shown: Hana, Taro, Yoko
Kimura & Nagaoka (2011a)

Students' reaction to CAT: Why? (1)

Tests in middle and high schools were not very difficult for the students who went on to enter university: the score distribution is negatively skewed, with a high mode, median, and mean.
Kimura & Nagaoka (2011a)

Students' reaction to CAT: Why? (2)

Nationwide entrance exams are designed to have an average score of 60, with a normal distribution.
Kimura & Nagaoka (2011a)

Students' reaction to CAT: Why? (3)

Every student tends to get about 50% correct in CAT, because such items maximize test information for each student.
Levels shown: false beginner, low intermediate, advanced intermediate
Hana, Taro, Yoko: 50% each
Kimura & Nagaoka (2011a)

Students' reaction to CAT: Why? (4)

[Figure: percent correct on school exams and entrance exams (labels: 40-60%, 60, 60-80%, 80-100%) compared with 50% on CAT for Hana, Taro, and Yoko]
Kimura & Nagaoka (2011a)

You can make CAT happier with logit bias:

Logit bias    B_m - D_i
 0.0          0
-0.4          0.4
-1.0          1.0

Item difficulty & P of success:

B - D   P of success     B - D   P of success
 4.0    98%              -4.0     2%
 3.0    95%              -3.0     5%
 2.2    90%              -2.2    10%
 2.0    88%              -2.0    12%
 1.4    80%              -1.4    20%
 1.1    75%              -1.1    25%
 1.0    73%              -1.0    27%
 0.8    69%              -0.8    31%
 0.5    62%              -0.5    38%
 0.4    60%              -0.4    40%
 0.2    55%              -0.2    45%
 0.1    52%              -0.1    48%
 0.0    50%               0.0    50%
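
The percentages in this table follow from the Rasch model, where the probability of success depends only on B - D. A short sketch that regenerates them (the function name is illustrative):

```python
import math

def p_success(b_minus_d):
    """Rasch probability of success for a given ability-difficulty gap in logits."""
    return 1.0 / (1.0 + math.exp(-b_minus_d))

for gap in [4.0, 3.0, 2.2, 2.0, 1.4, 1.1, 1.0, 0.8, 0.5, 0.4, 0.2, 0.1, 0.0]:
    above = round(100 * p_success(gap))
    below = round(100 * p_success(-gap))
    print(f"{gap:4.1f}  {above:3d}%   {-gap:5.1f}  {below:3d}%")
```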

Tradeoff between happiness & test length:

Minimum number of CAT items administered, by targeted probability of success (P) and S.E. (logits)

Targeting P   S.E. 0.5   0.4   0.3   0.2   0.1
0.5           16         25    45    100   499
0.6           17         27    47    105   517
0.7           20         30    53    120   477
0.8           25         40    70    157   625
0.9           45         70    125   278   1112

Slide annotations: 5%, 20%, 60%
Linacre (2006)
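
The pattern in this table can be approximated by assuming every item is answered with the targeted probability P, so each contributes P(1 - P) to the test information and the SE after n items is 1 / sqrt(n P(1 - P)). This derivation is an assumption, not something stated on the slide, and the function name is hypothetical.

```python
import math

def min_items(p, se):
    """Smallest n with 1 / sqrt(n * p * (1 - p)) <= se."""
    exact = 1.0 / (se ** 2 * p * (1.0 - p))
    # Round away floating-point noise before taking the ceiling.
    return math.ceil(round(exact, 9))

for p in [0.5, 0.6, 0.7, 0.8, 0.9]:
    print(p, [min_items(p, se) for se in [0.5, 0.4, 0.3, 0.2, 0.1]])
```

Under this assumption the sketch matches most entries of the table. It gives 400, 417, and 124 where the transcript shows 499, 517, and 125; the transcript values are left as they appear.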

Moodle UCAT Module beta ver.:

Demonstration
Creating CAT
Setting CAT
Previewing CAT

CAT setting window:

[Screenshots: CAT setting window (4 slides)]

CAT administration window:

[Screenshots: CAT administration window (1 slide), users (2 slides), Qs (2 slides)]

Case Study:

Moodle UCAT: Basic English Quiz
Participants: 59 Japanese freshmen in an engineering department
Items: 258 fill-in-the-blank grammar and vocabulary questions taken from the STEP Eiken Test, grades 3rd through Pre 1st, in 2007 and 2008
Pretest: 12 testlets, equated through anchor items, were administered to groups of Japanese EFL students (each testlet had 20 to 32 items, including 6 to 16 anchor items; each group had about 200 to 300 students)

Case Study:

Status of the item bank

Eiken grade   N     AVG     SD
Pre 1st       73    1.57    0.84
2nd           69    0.52    0.81
Pre 2nd       67   -0.47    0.91
3rd           49   -1.41    0.80
Total        258    0.19    1.37

Case Study:

CAT Quiz Part 1: item difficulty is set grade by grade by intelligent guess
CAT Quiz Part 2: item difficulty is set item by item based on the results of the pretests

Eiken grade   Difficulty (in units)
Pre 1st       115
2nd           105
Pre 2nd        95
3rd            85

CAT conditions
Initial ability estimate: 0.0 logits (100 units)
Ending condition: number of items (16 items)
Logit bias: 0

Case Study:

Research Questions
Are the results of CAT Quiz Part 1 and Part 2 consistent?
Do item difficulties guessed intelligently degrade the resulting ability estimates?

Case Study:

Results of CAT Part 1 & Part 2

Estimated ability (in units)
       Part 1   Part 2
AVG    108.1    106.6
SD       7.6      8.8
MAX    125      119
MIN     87       90

Standard error (in units)
       Part 1   Part 2
AVG    5.32     5.39
SD     0.24     0.33
MAX    5.87     6.38
MIN    5.05     5.05

[Figure: comparison of estimated ability (in units)]

Case Study:

Conclusions & Limitation
Item difficulties guessed intelligently will NOT degrade the resulting ability estimates.
Limitation: the sample size is very small.

REFERENCES:

Akiyama, M. (2008). Trial version of adaptive test module using Moodle. Proceedings of JART 2008.
Hinkelman, D., & Grose, T. (2004). Placement testing and audio quiz-making with open source software. Proceedings of CLaSIC 2004, 972-981.
Kimura, T. (2009). Construction of a Moodle-based placement test and possibility of a Moodle-based computer adaptive test. ARELE 20, 161-169.
Kimura, T., & Nagaoka, K. (2010a). Towards the construction of item banks for Moodle-based in-house computer adaptive English tests. Pacific Rim Objective Measurement Symposium 2010, Kuala Lumpur.
Kimura, T., & Nagaoka, K. (2010b). Toward the construction of Moodle-based in-house computer adaptive test 1: Improvement of item banks. JSET 26, 343-344.
Kimura, T., & Nagaoka, K. (2011a). Psychological aspects of CAT: How test-takers feel about CAT. IACAT Conference 2011, Pacific Grove, CA.
Kimura, T., & Nagaoka, K. (2011b). Toward the construction of Moodle-based in-house computer adaptive test 2: Consolidation of item banks. JSET 27.
Koyama, Y., & Akiyama, M. (2009). Developing a computer adaptive ESP placement test using Moodle. eLEARN 2009, 940-945.
Linacre, J.M. (1987). UCAT: A BASIC computer-adaptive testing program. MESA Psychometric Laboratory. (ERIC ED 280 895).
Linacre, J.M. (2006). Computer adaptive tests, standard errors and stopping rules. Rasch Measurement Transactions 20:2, p. 1062.
Wright, B.D., & Douglas, G. (1975). Best test design and self-tailored testing. MESA Memorandum No. 19. Department of Education, University of Chicago.
Yao, T. (1991). CAT with a poorly calibrated item bank. Rasch Measurement Transactions 5:2, p. 141.

Thank you for listening.:

Tetsuo Kimura (tetsuo.kmr@gmail.com)
Acknowledgements: A part of the present study was supported by a Grant-in-Aid for Scientific Research for 2010-2012 (No. 22520590) from the Japan Society for the Promotion of Science.

Case Study:

CAT Quiz Part 1: item difficulty is set grade by grade by intelligent guess
CAT Quiz Part 2: item difficulty is set item by item based on the results of the pretests
CAT Quiz Part 3: item difficulty is set as in Part 2, but with a -1.0 logit bias for item selection

Eiken grade   Difficulty (in units)
Pre 1st       115
2nd           105
Pre 2nd        95
3rd            85

Case Study:

Research Questions
Are the results of CAT Quiz Part 1 and Part 2 consistent?
Do item difficulties guessed intelligently degrade the resulting ability estimates?
Are the results of CAT Quiz Part 2 and Part 3 consistent?
Does an item selection rule that increases the probability of success degrade the resulting ability estimates?

Case Study:

Results of CAT Part 2 & Part 3

Estimated ability (in units)
       Part 2   Part 3
AVG    106.9    107.3
SD       9.3      9.7
MAX    130      130
MIN     90       87

Standard error (in units)
       Part 2   Part 3
AVG    5.27     5.27
SD     0.19     0.19
MAX    5.80     5.80
MIN    5.05     5.05

[Figure: comparison of estimated ability (in units); r = .86, N = 27]

Case Study:

Conclusions
Item difficulties guessed intelligently will NOT degrade the resulting ability estimates.
An item selection rule that increases the probability of success will NOT degrade the resulting ability estimates?
Limitation: the sample size is very small.
