Cizek GJ, O'Day DM: Further investigations of nonfunctioning options in multiple-choice test items. Educ Psychol Meas. 1994, 54 (4): 861-872. 10.1177/0013164494054004002.
Downing SM: Assessment of knowledge with written test forms. International handbook of research in medical education. Edited by: Norman GR, Van der Vleuten C, Newble DI. 2002, Dordrecht: Kluwer Academic Publishers, II: 647-672.
McCoubrie P: Improving the fairness of multiple-choice questions: a literature review. Med Teach. 2004, 26 (8): 709-712. 10.1080/01421590400013495.
Schuwirth LWT, van der Vleuten CPM: Different written assessment methods: what can be said about their strengths and weaknesses? Med Educ. 2004, 38 (9): 974-979. 10.1111/j.1365-2929.2004.01916.x.
Farley JK: The multiple-choice test: writing the questions. Nurse Educ. 1989, 14 (6): 10-12. 10.1097/00006223-198911000-00003.
Haladyna TM, Downing SM: Validity of a taxonomy of multiple-choice item-writing rules. Appl Meas Educ. 1989, 2 (1): 51-78. 10.1207/s15324818ame0201_4.
Haladyna TM, Downing SM: How many options is enough for a multiple-choice test item? Educ Psychol Meas. 1993, 53 (4): 999-1010. 10.1177/0013164493053004013.
Bruno JE, Dirkzwager A: Determining the optimal number of alternatives to a multiple-choice test item: An information theoretic perspective. Educ Psychol Meas. 1995, 55 (6): 959-966. 10.1177/0013164495055006004.
Lord FM: Optimal number of choices per item – A comparison of four approaches. J Educ Meas. 1977, 14: 33-38. 10.1111/j.1745-3984.1977.tb00026.x.
Tversky A: On the optimal number of alternatives at a choice point. J Math Psychol. 1964, 1 (2): 386-391. 10.1016/0022-2496(64)90010-0.
Aamodt MG, McShane T: A meta-analytic investigation of the effect of various test item characteristics on test scores. Public Pers Manage. 1992, 21 (2): 151-160.
Crehan KD, Haladyna TM, Brewer BW: Use of an inclusive option and the optimal number of options for multiple-choice items. Educ Psychol Meas. 1993, 53 (1): 241-247. 10.1177/0013164493053001027.
Shizuka T, Takeuchi O, Yashima T, Yoshizawa K: A comparison of three- and four-option English tests for university entrance selection purposes in Japan. Lang Test. 2006, 23 (1): 35-57.
Trevisan MS, Sax G, Michael WB: The effects of the number of options per item and student ability on test validity and reliability. Educ Psychol Meas. 1991, 51 (4): 829-837. 10.1177/001316449105100404.
Trevisan MS, Sax G, Michael WB: Estimating the optimum number of options per item using an incremental option paradigm. Educ Psychol Meas. 1994, 54 (1): 86-91. 10.1177/0013164494054001008.
Rogers WT, Harley D: An empirical comparison of three- and four-choice items and tests: susceptibility to testwiseness and internal consistency reliability. Educ Psychol Meas. 1999, 59 (2): 234-247. 10.1177/00131649921969820.
Tarrant M, Ware J: Impact of item-writing flaws in multiple-choice questions on student achievement in high-stakes nursing assessments. Med Educ. 2008, 42 (2): 198-206.
Tarrant M, Knierim A, Hayes SK, Ware J: The frequency of item writing flaws in multiple-choice questions used in high stakes nursing assessments. Nurse Educ Today. 2006, 26 (8): 662-671. 10.1016/j.nedt.2006.07.006.
Taylor AK: Violating conventional wisdom in multiple choice test construction. Coll Stud J. 2005, 39 (1).
Osterlind SJ: Constructing test items: Multiple-choice, constructed-response, performance, and other formats. 1998, Boston: Kluwer Academic Publishers, 2nd edition.
Ebel RL, Frisbie DA: Essentials of educational measurement. 1991, Englewood Cliffs, N.J.: Prentice Hall, 5th edition.
Precht D, Hazlett C, Yip S, Nicholls J: International Database for Enhanced Assessments and Learning (IDEAL-HK): Item analysis users' guide. 2003, Hong Kong: IDEAL-HK
StataCorp: Stata Statistical Software: Release 9.2. 2005, College Station, TX: StataCorp LP
Rodriguez MC: Three options are optimal for multiple-choice items: A meta-analysis of 80 years of research. Educ Meas Issues Pract. 2005, 24 (2): 3-13. 10.1111/j.1745-3992.2005.00006.x.
Frary RB: More multiple-choice item writing do's and don'ts. Pract Assess Res Eval. 1995, 4 (11).
Haladyna TM, Downing SM: A taxonomy of multiple-choice item-writing rules. Appl Meas Educ. 1989, 2 (1): 37-50. 10.1207/s15324818ame0201_3.
Case SM, Swanson DB: Constructing written test questions for the basic and clinical sciences. 2001, Philadelphia, PA: National Board of Medical Examiners, 3rd edition.
Wallach PM, Crespo LM, Holtzman KZ, Galbraith RM, Swanson DB: Use of a committee review process to improve the quality of course examinations. Adv Health Sci Educ. 2006, 11 (1): 61-68. 10.1007/s10459-004-7515-8.
Haladyna TM, Downing SM, Rodriguez MC: A review of multiple-choice item-writing guidelines for classroom assessment. Appl Meas Educ. 2002, 15 (3): 309-334. 10.1207/S15324818AME1503_5.
Masters JC, Hulsmeyer BS, Pike ME, Leichty K, Miller MT, Verst AL: Assessment of multiple-choice questions in selected test banks accompanying text books used in nursing education. J Nurs Educ. 2001, 40 (1): 25-32.
Swanson DB, Holtzman KZ, Clauser BE, Sawhill AJ: Psychometric characteristics and response times for one-best-answer questions in relation to number and source of options. Acad Med. 2005, 80 (10 Suppl): S93-96. 10.1097/00001888-200510001-00025.
Haladyna TM: Developing and validating multiple-choice test items. 2004, Mahwah, NJ: Lawrence Erlbaum, 3rd edition.
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1472-6920/9/40/prepub
| | Test A | Test B | Test C | Test D | Test E | Test F | Test G | Total |
|---|---|---|---|---|---|---|---|---|
| No. of items | 96 | 72 | 86 | 50 | 50 | 60 | 100 | 514 |
| No. of examinees | 146 | 74 | 74 | 73 | 73 | 73 | 75 | 588 |
| Mean test score, % (SD) | 67.7 (9.87) | 55.5 (8.52) | 69.2 (10.44) | 72.0 (10.82) | 62.6 (11.28) | 67.8 (10.02) | 65.6 (11.29) | -- |
| Range of test scores (%) | 38–89 | 33–71 | 38–90 | 46–94 | 34–88 | 35–88 | 34–89 | -- |
| KR-20 reliability | .81 | .71 | .82 | .71 | .72 | .70 | .87 | -- |

SD = standard deviation; KR-20 = Kuder-Richardson 20
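
For readers unfamiliar with the KR-20 coefficient reported in the table, the following is a minimal sketch of how it can be computed from dichotomously scored (0/1) item responses. It is an illustration only, not the authors' procedure: the function name `kr20`, the toy response matrix, and the use of population variance for the total scores are all assumptions that the source does not specify.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson 20 for a matrix of 0/1 item scores.

    `responses` is a list of examinee rows, each a list of item scores
    (1 = correct, 0 = incorrect). Hypothetical helper, for illustration.
    """
    n_examinees = len(responses)
    n_items = len(responses[0])
    if n_examinees < 2 or n_items < 2:
        raise ValueError("need at least two examinees and two items")

    # Proportion of examinees answering each item correctly (p_i).
    p = [sum(row[i] for row in responses) / n_examinees for i in range(n_items)]

    # Sum of item variances, p_i * (1 - p_i).
    item_var_sum = sum(pi * (1 - pi) for pi in p)

    # Variance of total scores (population variance: an assumption here).
    total_var = pvariance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total scores have zero variance")

    return (n_items / (n_items - 1)) * (1 - item_var_sum / total_var)


# Toy data: 4 examinees x 3 items (made up, not taken from the study).
toy = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(kr20(toy))  # 0.75 for this toy matrix
```

Some software uses the sample (n - 1) variance of the total scores instead, which gives slightly different values for small groups.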