REFERENCES

Alderson, C. J. (1991). Innovation in language testing: Can the microcomputer help? Language Testing Update, Special Report No. 1.

American Psychological Association. (1985). Standards for educational and psychological testing. Washington, DC: American Psychological Association.

American Psychological Association. (1986). Guidelines for computer-based tests and interpretations. Washington, DC: American Psychological Association.

Andrich, D. (1988). Rasch models for measurement. Newbury Park, CA: Sage.

Baker, F. B. (1989). Computer technology in test construction and processing. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 409-428). London: Collier Macmillan.

Bejar, I., & Braun, H. (1994). On the synergy between assessment and instruction: Early lessons from computer-based simulations. Machine-Mediated Learning, 4, 5-25.

Bennett, R. E., & Rock, D. A. (1995). Generalizability, validity, and examinee perceptions of a computer-delivered formulating-hypotheses test. Journal of Educational Measurement, 32, 19-36.

Bergstrom, B. A., Lunz, M. E., & Gershon, R. C. (1992). Altering the difficulty level in computer adaptive tests. Applied Measurement in Education, 5, 137-149.

BestTest [Computer software]. (1990). Chicago, IL: WiseWare.

Bock, R. D., & Mislevy, R. J. (1982). Adaptive EAP estimation of ability in a microcomputer environment. Applied Psychological Measurement, 6, 431-444.

Boekkooi-Timminga, E. (1990). A cluster-based method for test construction. Applied Psychological Measurement, 14, 341-354.

Brown, J. D. (1992a). Technology and language education in the twenty-first century: Media, message, and method. Language Laboratory, 29, 1-22.

Brown, J. D. (1992b). Using computers in language testing. Cross Currents, 19, 92-99.

Brown, J. D. (1996). Testing in language programs. Upper Saddle River, NJ: Prentice Hall.

Bunderson, C. V., Inouye, D. K., & Olsen, J. B. (1989). The four generations of computerized educational measurement. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 367-407). London: Collier Macmillan.

Burstein, J., Frase, L., Ginther, A., & Grant, L. (1996). Technologies for language assessment. Annual Review of Applied Linguistics, 16, 240-260.

Canale, M. (1986). The promise and threat of computerized adaptive assessment of reading comprehension. In C. W. Stansfield (Ed.), Technology and language testing (pp. 29-45). Washington, DC: TESOL.

Corbel, C. (1993). Computer-enhanced language assessment. In G. Brindley (Ed.), Research Report Series 2. Sydney: National Centre for English Language Teaching and Research, Macquarie University.

Corel Paradox 7.0 [Computer software]. (1996). Ottawa, Ontario, Canada: Corel Corporation.

Dodd, B. G., De Ayala, R. J., & Koch, W. R. (1995). Computerized adaptive testing with polytomous items. Applied Psychological Measurement, 19, 5-22.

Dorans, N. J. (1990). Scaling and equating. In H. Wainer, N. J. Dorans, R. Flaugher, B. F. Green, R. J. Mislevy, L. Steinberg, & D. Thissen (Eds.), Computerized adaptive testing: A primer (pp. 137-160). Hillsdale, NJ: Lawrence Erlbaum.

Drasgow, F., Levine, M. V., & McLaughlin, M. E. (1987). Detecting inappropriate test scores with optimal and practical appropriateness indices. Applied Psychological Measurement, 11, 59-80.

Du, Y., Lewis, C., & Pashley, P. J. (1993). Computerized mastery testing using fuzzy set decision theory. Applied Measurement in Education, 6, 181-193.

Dunkel, P. (1991). The effectiveness research on computer-assisted instruction and computer-assisted language learning. In P. Dunkel (Ed.), Computer-assisted language learning and testing: Research issues and practice (pp. 5-36). New York: Newbury House.

Educational Testing Service. (1996). TOEFL: Announcing computer-based testing. Princeton, NJ: Educational Testing Service.

Eignor, D., Taylor, C., Kirsch, I., & Jamieson, J. (1997). Development of a scale for assessing the level of computer familiarity of TOEFL examinees. Unpublished manuscript. Princeton, NJ: Educational Testing Service.

Eignor, D. R., Way, W. D., Stocking, M. L., & Steffen, M. (1993). Case studies in computer adaptive test design through simulation (Research Report #93-56). Princeton, NJ: Educational Testing Service.

Flaugher, R. (1990). Item pools. In H. Wainer, N. J. Dorans, R. Flaugher, B. F. Green, R. J. Mislevy, L. Steinberg, & D. Thissen (Eds.), Computerized adaptive testing: A primer (pp. 41-63). Hillsdale, NJ: Lawrence Erlbaum.

Green, B. F. (1988). Construct validity of computer-based tests. In H. Wainer & H. I. Braun (Eds.), Test validity (pp. 77-86). Hillsdale, NJ: Lawrence Erlbaum.

Green, B. F. (1990). System design and operations. In H. Wainer, N. J. Dorans, R. Flaugher, B. F. Green, R. J. Mislevy, L. Steinberg, & D. Thissen (Eds.), Computerized adaptive testing: A primer (pp. 23-40). Hillsdale, NJ: Lawrence Erlbaum.

Green, B. F., Bock, R. D., Humphreys, L. G., Linn, R. L., & Reckase, M. D. (1984). Technical guidelines for assessing computerized adaptive tests. Journal of Educational Measurement, 21, 347-360.

Hambleton, R. K., & Swaminathan, H. (1985). Item response theory: Principles and applications. Boston: Kluwer-Nijhoff.

Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Item response theory. Newbury Park, CA: Sage.

Henning, G. (1986). Item banking via dBase II: The UCLA ESL Proficiency Examination experience. In C. W. Stansfield (Ed.), Technology and language testing (pp. 69-77). Washington, DC: TESOL.

Henning, G. (1987). A guide to language testing: Development, evaluation, research. New York: Newbury House.

Henning, G. (1991). Validating an item bank in a computer-assisted or computer-adaptive test. In P. Dunkel (Ed.), Computer-assisted language learning and testing: Research issues and practice (pp. 209-222). New York: Newbury House.

Henning, G., Johnson, P. J., Boutin, A. J., & Rice, H. R. (1994). Automated assembly of pre-equated language proficiency tests. Language Testing, 11, 14-28.

Hicks, M. (1989). The TOEFL computerized placement test: Adaptive conventional measurement. TOEFL Research Report No. 31. Princeton, NJ: Educational Testing Service.

Jaeger, R. M. (1989). Certification of student competence. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 485-514). London: Collier Macmillan.

Jamieson, J., Campbell, J., Norfleet, L., & Berbisada, N. (1993). Reliability of a computerized scoring routine for an open-ended task. System, 21, 305-322.

de Jong, J. H. A. L. (1986). Item selection from pretests in mixed ability groups. In C. W. Stansfield (Ed.), Technology and language testing (pp. 91-108). Washington, DC: TESOL.

Kaya-Carton, E., Carton, A. S., & Dandonoli, P. (1991). Developing a computer-adaptive test of French reading proficiency. In P. Dunkel (Ed.), Computer-assisted language learning and testing: Research issues and practice (pp. 259-284). New York: Newbury House.

Kingsbury, G. G., & Weiss, D. J. (1983). A comparison of IRT-based adaptive mastery testing and a sequential mastery testing procedure. In D. J. Weiss (Ed.), New horizons in testing: Latent trait test theory and computerized adaptive testing (pp. 257-283). New York: Academic Press.

Kirsch, I., Jamieson, J., Taylor, C., & Eignor, D. (1997). Computer familiarity among TOEFL examinees. Unpublished manuscript. Princeton, NJ: Educational Testing Service.

Larson, J. W., & Madsen, H. S. (1985). Computerized adaptive language testing: Moving beyond computer-assisted testing. CALICO Journal, 2, 32-36, 43.

Laurier, M. (1991). What we can do with computerized adaptive testing...and what we cannot do! In S. Anivan (Ed.), Current developments in language testing (pp. 244-255). Singapore: Regional Language Centre.

Laurier, M. (1996). Using the information curve to assess language CAT efficiency. In A. Cumming & R. Berwick (Eds.), Validation in language testing (pp. 111-123). Clevedon, UK: Multilingual Matters.

Lewis, C., & Sheehan, K. (1990). Using Bayesian decision theory to design a computerized adaptive mastery test. Applied Psychological Measurement, 14, 367-386.

Lissitz, B. (1997). The New York standardized testing legislation. National Council on Measurement in Education Quarterly Newsletter, 5(2), 2.

Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Lawrence Erlbaum.

Lunz, M. E., & Bergstrom, B. A. (1994). An empirical study of computerized adaptive test administration conditions. Journal of Educational Measurement, 31, 251-263.

Lunz, M. E., Bergstrom, B. A., & Wright, B. D. (1992). The effect of review on student ability and test efficiency for computer adaptive tests. Applied Psychological Measurement, 16, 33-40.

Madsen, H. S. (1991). Computer-adaptive testing of listening and reading comprehension. In P. Dunkel (Ed.), Computer-assisted language learning and testing: Research issues and practice (pp. 237-257). New York: Newbury House.

Madsen, H. S., & Larson, J. W. (1986). Computerized Rasch analysis of item bias in ESL tests. In C. W. Stansfield (Ed.), Technology and language testing (pp. 47-67). Washington, DC: TESOL.

Mazzeo, J., Druesne, B., Raffeld, P. C., Checketts, K. T., & Muhlstein, A. (1991). Comparability of computer and paper-and-pencil scores for two CLEP general examinations (Report #91-5). New York: College Entrance Examination Board.

McBride, J. R., & Martin, J. T. (1983). Reliability and validity of adaptive ability tests in a military setting. In D. J. Weiss (Ed.), New horizons in testing: Latent trait test theory and computerized adaptive testing (pp. 223-236). New York: Academic Press.

MicroCAT Testing System [Computer software]. (1984). St. Paul, MN: Assessment Systems.

Microsoft Access 2.0 [Computer software]. (1996). Redmond, WA: Microsoft Corporation.

Mislevy, R. J. (1993). Foundations of a new test theory. In N. Frederiksen, R. J. Mislevy, & I. I. Bejar (Eds.), Test theory for a new generation of tests (pp. 19-39). Hillsdale, NJ: Lawrence Erlbaum Associates.

Mislevy, R. J. (1994). Evidence and inference in educational assessment. Psychometrika, 59, 439-483.

Mislevy, R. J., & Bock, R. D. (1986). PC-BILOG: Item analysis and test scoring with binary logistic models. Mooresville, IN: Scientific Software.

Neu, J., & Scarcella, R. (1991). Word processing in the ESL writing classroom: A survey of student attitudes. In P. Dunkel (Ed.), Computer-assisted language learning and testing: Research issues and practice (pp. 169-187). New York: Newbury House.

O'Neill, K., Folk, V., & Li, M.-Y. (1993). Report on the pretest calibration study for the computer-based academic skills assessments of The Praxis Series: Professional assessments for beginning teachers. Princeton, NJ: Educational Testing Service.

PARGrade 3.0 [Computer software]. (1990). Costa Mesa, CA: Economics Research.

PARScore 3.0 [Computer software]. (1990). Costa Mesa, CA: Economics Research.

PARTest 3.0 [Computer software]. (1990). Costa Mesa, CA: Economics Research.

Phinney, M. (1991). Computer-assisted writing and writing apprehension in ESL students. In P. Dunkel (Ed.), Computer-assisted language learning and testing: Research issues and practice (pp. 189-204). New York: Newbury House.

Powers, D. E., Fowles, M. E., Farnum, M., & Ramsey, P. (1994). Will they think less of handwritten essays if others wordprocess theirs? Effects on essay scores of intermingling handwritten and word-processed essays. Journal of Educational Measurement, 31, 220-233.

Reckase, M. D. (1983). A procedure for decision making using tailored testing. In D. J. Weiss (Ed.), New horizons in testing: Latent trait test theory and computerized adaptive testing (pp. 237-255). New York: Academic Press.

Reid, J. (1986). Using the writer's workbench in composition teaching and testing. In C. W. Stansfield (Ed.), Technology and language testing (pp. 167-188). Washington, DC: TESOL.

Schaeffer, G., Steffen, M., & Golub-Smith, M. (1993). Introduction of a computer adaptive GRE General Test (Research Report #93-57). Princeton, NJ: Educational Testing Service.

Sheehan, K., & Lewis, C. (1992). Computerized mastery testing with nonequivalent testlets. Applied Psychological Measurement, 16, 65-76.

Steinberg, L., Thissen, D., & Wainer, H. (1990). Validity. In H. Wainer, N. J. Dorans, R. Flaugher, B. F. Green, R. J. Mislevy, L. Steinberg, & D. Thissen (Eds.), Computerized adaptive testing: A primer (pp. 187-231). Hillsdale, NJ: Lawrence Erlbaum.

Stenson, H. (1988). Testat: A supplementary module for SYSTAT (version 2.0). Chicago, IL: SYSTAT.

Stevenson, J., & Gross, S. (1991). Use of a computerized adaptive testing model for ESOL/bilingual entry/exit decision making. In P. Dunkel (Ed.), Computer-assisted language learning and testing: Research issues and practice (pp. 223-235). New York: Newbury House.

Stocking, M. L. (1987). Two simulated feasibility studies in computerized adaptive testing. Applied Psychology: An International Review, 35, 263-277.

Stocking, M. L. (1992). Controlling item exposure rates in a realistic adaptive testing paradigm (Research Report #93-2). Princeton, NJ: Educational Testing Service.

Stocking, M. L. (1994). An alternative method for scoring adaptive tests (Research Report #94-48). Princeton, NJ: Educational Testing Service.

Stocking, M. L., & Lewis, C. (1995a). Controlling item exposure conditional on ability in computerized adaptive tests (Research Report #95-24). Princeton, NJ: Educational Testing Service.

Stocking, M. L., & Lewis, C. (1995b). A new method of controlling item exposure in computerized adaptive tests (Research Report #95-25). Princeton, NJ: Educational Testing Service.

Stocking, M. L., & Swanson, L. (1993). A method for severely constrained item selection in adaptive testing. Applied Psychological Measurement, 17, 277-292.

Suen, H. K. (1990). Principles of test theories. Hillsdale, NJ: Lawrence Erlbaum Associates.

Swanson, L., & Stocking, M. L. (1993). A model and heuristic for solving very large item selection problems. Applied Psychological Measurement, 17, 151-166.

Taylor, C., Jamieson, J., Eignor, D., & Kirsch, I. (1997). Measuring the effects of computer familiarity on computer-based language tasks. Unpublished manuscript. Princeton, NJ: Educational Testing Service.

Testmaster [Computer software]. (1988). Zurich, Switzerland: Eurocentres.

Thissen, D. (1990). Reliability and measurement precision. In H. Wainer, N. J. Dorans, R. Flaugher, B. F. Green, R. J. Mislevy, L. Steinberg, & D. Thissen (Eds.), Computerized adaptive testing: A primer (pp. 161-186). Hillsdale, NJ: Lawrence Erlbaum.

Thissen, D., & Mislevy, R. J. (1990). Testing algorithms. In H. Wainer, N. J. Dorans, R. Flaugher, B. F. Green, R. J. Mislevy, L. Steinberg, & D. Thissen (Eds.), Computerized adaptive testing: A primer (pp. 103-134). Hillsdale, NJ: Lawrence Erlbaum.

Tung, P. (1986). Computerized adaptive testing: Implications for language test developers. In C. W. Stansfield (Ed.), Technology and language testing (pp. 11-28). Washington, DC: TESOL.

Wainer, H. (1992). Some practical considerations when converting a linearly administered test to an adaptive format (Research Report #92-13). Princeton, NJ: Educational Testing Service.

Wainer, H., Dorans, N. J., Green, B. F., Mislevy, R. J., Steinberg, L., & Thissen, D. (1990). Future challenges. In H. Wainer, N. J. Dorans, R. Flaugher, B. F. Green, R. J. Mislevy, L. Steinberg, & D. Thissen (Eds.), Computerized adaptive testing: A primer (pp. 233-272). Hillsdale, NJ: Lawrence Erlbaum.

Wainer, H., & Kiely, G. L. (1987). Item clusters and computerized adaptive testing: A case for testlets. Journal of Educational Measurement, 24, 185-201.

Wainer, H., & Mislevy, R. J. (1990). Item response theory, item calibration and proficiency estimation. In H. Wainer, N. J. Dorans, R. Flaugher, B. F. Green, R. J. Mislevy, L. Steinberg, & D. Thissen (Eds.), Computerized adaptive testing: A primer (pp. 65-102). Hillsdale, NJ: Lawrence Erlbaum.

Wald, A. (1947). Sequential analysis. New York: Wiley.

Ward, W. C. (1988). The College Board Computerized Placement Tests: An application of computerized adaptive testing. Machine-Mediated Learning, 2, 217-282.

Weiss, D. J. (1983). New horizons in testing: Latent trait test theory and computerized adaptive testing. New York: Academic Press.

Wright, B. D., Linacre, J. M., & Schulz, M. (1990). BIGSTEPS: General-purpose Rasch analysis program (version 2.00). Chicago, IL: Mesa Press.
