Title page for ETD etd-04072006-115219


Type of Document Dissertation
Author Turhan, Ahmet
Author's Email Address ahmet.turhan@gmail.com
URN etd-04072006-115219
Title Multilevel 2PL Item Response Model Vertical Equating with the Presence of Differential Item Functioning
Degree Doctor of Philosophy
Department Educational Psychology and Learning Systems, Department of
Advisory Committee
  Advisor Name        Title
  Akihito Kamata      Committee Chair
  Albert Oosterhof    Committee Member
  Colleen Kelley      Committee Member
  Richard Tate        Committee Member
Keywords
  • FCAT
  • Developmental Scale
  • Differential Item Functioning
  • Vertical Equating
  • Multilevel Item Response Model
  • GLLAMM
Date of Defense 2006-01-20
Availability unrestricted
Abstract
Recent developments in multilevel modeling have made it possible to model the relationships between item properties and examinee properties within the multilevel and structural equation modeling framework. In this study, the performance of the multilevel two-parameter logistic item response model (2PL IRM) was investigated for estimating item difficulty and discrimination parameters and for equating a test across grade levels in the presence of differential item functioning (DIF), using both real and simulated data. A statewide data set designed for vertical scaling was used, covering three adjacent grade levels. The data were collected for the Florida Comprehensive Assessment Test in 2001. In addition, simulated data comparable to large-scale assessment data from two grade levels were analyzed under controlled conditions with different numbers of DIF items.
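
For reference, the two-parameter logistic model named above is usually written as follows. The notation is an assumption for illustration ($\theta_j$ for the ability of examinee $j$, $a_i$ and $b_i$ for the discrimination and difficulty of item $i$); the dissertation may use a different but equivalent multilevel parameterization.

```latex
P(X_{ij} = 1 \mid \theta_j)
  = \frac{\exp\{a_i(\theta_j - b_i)\}}{1 + \exp\{a_i(\theta_j - b_i)\}}
```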

The performance of 2PL IRMs that modeled DIF and included an examinee-level variable was compared with traditional IRT for the development of a vertical scale. The 2PL IRM without any DIF parameter produced the same item difficulty and discrimination parameters as traditional IRT. Furthermore, it generated the same scale scores as traditional IRT. Including an examinee-level variable (grade level) in the 2PL IRM produced a better vertical scale than 2PL IRT. Modeling nonuniform DIF for some of the anchor items, in addition to the examinee-level variable, resulted in the same scale as the previous model; however, modeling uniform DIF for some of the anchor items distorted the vertical scale.

A small simulation study was designed to investigate the effects of DIF items on vertical equating when some of the anchor items exhibited uniform DIF, nonuniform DIF, or both. Distortion of the scale increased as the number of nonuniform DIF items in the anchor set increased. When items in the anchor set exhibited both types of DIF at the same time, the scale distortion was larger than with either type alone. There was one conflicting result: when only uniform DIF items were present, increasing their number in the anchor set decreased the scale distortion. However, this could have been the result of random error due to the limited size of the simulation.
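
The distinction between uniform and nonuniform DIF can be illustrated with a minimal sketch. It assumes the common convention that uniform DIF shifts an item's difficulty for one group while nonuniform DIF changes its discrimination; the function names, group labels, and parameter values are hypothetical and are not taken from the dissertation.

```python
import math

def p_correct(theta, a, b, group=0, dif_b=0.0, dif_a=0.0):
    """2PL probability of a correct response.

    group: 0 = reference, 1 = focal (hypothetical labels).
    dif_b: difficulty shift for the focal group (uniform DIF).
    dif_a: discrimination shift for the focal group (nonuniform DIF).
    """
    a_eff = a + group * dif_a
    b_eff = b + group * dif_b
    return 1.0 / (1.0 + math.exp(-a_eff * (theta - b_eff)))

def gap(theta, **dif):
    """Reference-minus-focal probability difference at ability theta."""
    ref = p_correct(theta, 1.0, 0.0, group=0)
    focal = p_correct(theta, 1.0, 0.0, group=1, **dif)
    return ref - focal

thetas = [-2.0, 0.0, 2.0]
# Uniform DIF: the gap keeps the same sign across the ability range.
uniform_gaps = [gap(t, dif_b=0.5) for t in thetas]
# Nonuniform DIF: the two curves cross, so the gap changes sign.
nonuniform_gaps = [gap(t, dif_a=0.5) for t in thetas]
```

With these illustrative values the item is harder for the focal group at every ability level under uniform DIF, whereas under nonuniform DIF the direction of the group difference depends on ability.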

The multilevel IRM had one drawback when applied to large-scale assessment data: the computation time needed to complete the calibration process was far beyond what is practical for a comprehensive state testing program. However, the multilevel IRM potentially provides more flexibility for investigating the dimensions that affect success. Directions for future research and limitations are also discussed.

Files
  Filename: DISSERTATION_(Ahmet_Turhan)_Final_Submission.pdf
  Size: 2.06 Mb
  Approximate Download Time (Hours:Minutes:Seconds)
    28.8 Modem            00:09:31
    56K Modem             00:04:54
    ISDN (64 Kb)          00:04:17
    ISDN (128 Kb)         00:02:08
    Higher-speed Access   00:00:10
