Browsing by Subject "Educational tests and measurements"
Now showing 1 - 20 of 21
Item A study of the relationships found among age, grade, intelligence, school marks and traits of school children of the Plainview public schools for the school year 1928-1929 (Texas Tech University, 1929-05) Ballengee, Eugene Marvin
Not available
Item A study of the use of standardized achievement and mental ability test results in selected Texas school districts (Texas Tech University, 1965-08) Gray, C. G
Not available
Item Analysis of tests used in evaluating children with delayed speech (Texas Tech University, 1966-05) O'Neal, Janice
Not available
Item Connotations of performance level categories used in high stakes testing (2004) Burt, Winona Madelain; Stapleton, Laura M.
Item Correlates of academic achievement for Mexican American students (Texas Tech University, 1969-08) Smith, George Worth
Not available
Item Developing statistical inquiry: prospective secondary mathematics and science teachers' investigations of equity and fairness through analysis of accountability data (2004) Makar, Katie M.; Confrey, Jere.; Marshall, Jill Ann.
Concerns about equity in the ways that schools are using the data from the results of their students’ state-mandated exams (Confrey & Makar, in press) prompted this mixed-method study, based on the model of Design Research (Cobb et al., 2003). The study was conducted to provide insight into the ways that understanding of the statistical concepts of variation and distribution, developed in the context of learning about equity and assessment, can allow prospective teachers to broaden their understanding of equity and gain experience with conducting an inquiry of an ill-structured problem through the use of data generated by high-stakes tests to investigate equity and fairness in the accountability system. The study took place in an innovative one-semester course for preservice teachers designed to support and develop understanding of equity and fairness in accountability through data-based statistical inquiry (Confrey, Makar, and Kazak, 2004).
The prospective teachers’ investigations were conducted using Fathom Dynamic Statistics (Finzer, 2001), learning software built for novice data analysts that emphasizes visualization and building inferential thinking through highlighting relationships between multiple variable displays. Semistructured investigations during the course led up to a three-week self-designed inquiry project in which the prospective teachers used data to investigate an area of interest to them about equity in accountability, communicating their findings both orally and as a written paper. Results from the study provide insight into prospective teachers’ experiences of conducting inquiry of ill-structured problems and their struggle with articulating beliefs of equity. The study also reports how statistical concepts documented in structured settings showed that the subjects developed rich conceptions of variation and distribution, but that the application of these concepts as evidence in their inquiry of an ill-structured problem was more challenging for them. No correlation was found between the level of statistical evidence in the structured and open-ended inquiry settings; however, there was a significant correlation between prospective teachers’ degree of engagement with their topic of inquiry and the depth of statistical evidence they used, particularly for minority students.
Implications and suggestions for improving the preparation of teachers in the areas of statistical reasoning, inquiry, equity, and interpreting assessment data are provided.
Item Development of a vocational evaluation battery for mentally retarded persons. (Texas Tech University, 1975-05) Permenter, Nancy Ann
Not available
Item Does team-based testing promote individual learning? (2011-05) Walker, Joshua David; Robinson, Daniel H.; Schallert, Diane; Svinicki, Marilla; Borich, Gary; Muir-Broaddus, Jacqueline
Team-based testing gives students a chance to earn additional points on individual unit tests by immediately re-taking the test as a team competing against other teams. This instructional approach has enjoyed widening implementation and impressive anecdotal support, but there remains a dearth of empirical studies evaluating its prescribed processes and promoted outcomes. Although the posited effectiveness and appeal of team-based testing seem consistent with the benefits of test-enhanced learning and collaborative learning in general, several limitations are readily apparent. Namely, the current format of the individual and team readiness assurance tests is expressly multiple-choice. Though there are some advantages of this type of question (e.g., ease of administering and grading), the long-term cognitive disadvantage relative to short-answer questions is well documented. Furthermore, it is not clear whether the proposed gain in learning through this format is attributable to the group effect (be it social or cognitive) or simply to repeated exposure to the test items. Therefore, this study measured the effects of initial test question Format (short-answer vs. multiple-choice), Mode (individual vs. group), and Exposure (once vs. twice) on four delayed measures of learning: Old multiple-choice items (ones students had initially been tested over), Old short-answer items, New multiple-choice items, and New short-answer items.
Two weeks after watching a video-recorded lecture, 208 college students took a thirty-item test comprising both the old and new items in multiple-choice and short-answer formats. Results revealed that 1) taking an initial test twice is better than once when the delayed test has old short-answer items or new multiple-choice items, 2) taking an initial short-answer test is better than multiple choice when the delayed test has either old multiple-choice, old short-answer, or new multiple-choice items, and 3) taking an initial team test is no different than taking an individual test when it comes to long-term learning. Particularly noteworthy from these results is how a) the effects of short-answer tests and taking tests twice are not present within Team conditions, and b) taking a multiple-choice test twice is as effective as taking a short-answer test once. Implications are discussed in light of learning theory and instructional practice.
Item Educational Mobility in Nutrition and Dietetics Through Equivalency Testing (Texas Tech University, 1972-08) Leong, Elizabeth Ngar-Seong
Not available
Item The effects of high-stakes testing on central office organizational culture: changes in one school district (2007) Champion, Bret Alan, 1969-; Olivárez, Rubén; Ovando, Martha N., 1954-
The purpose of this study was to determine what impact high-stakes testing had on one school district's central office organizational culture, and how changes affected district-wide practices, central office administrators and campus principals. Three research questions guided the study: 1) What changes in the central office organizational culture occurred due to the increased implementation of and pressure from high-stakes testing? 2) How have the changes in the central office culture affected district administrators and campus leaders? 3) How have changes in central office organizational culture affected district-wide practices?
This study utilized a qualitative methodology and a case study approach, focusing on one Texas school district. Three types of data collection methods were used: focus groups, interviews, and document review. The data were coded and analyzed using the constant comparison method in order for themes and propositions to surface. This resulted in a rich description of the case and provided answers to the three research questions. The findings of the study revealed that high-stakes testing has affected the central office organizational culture, as well as campus and district administrators, in four distinct ways: It has instilled fear of failure and fear of losing one's job; it has invoked frustration, both because of the narrow focus of the test and the demands of outside stakeholders; it has inhibited freedom, particularly in goal-setting; and it has improved focus by ensuring the use of research-based teaching practices and detailed student achievement data analysis. These changes have led to six alterations in district-wide practices: more precise student data analysis, reactive and targeted intervention for particular grade levels and students, increased discussion about testing throughout the district, improved curriculum alignment in classrooms, research-based professional development, and district support staff members becoming aware of testing demands. The findings contribute to literature in the field by investigating the connection between two areas of research, high-stakes testing and school district central office organizational culture. 
The study generated information to assist practitioners as they work to maintain or improve school district organizational culture while implementing high-stakes testing or other high-impact, mandated changes.
Item Examiner effects on the testing of Mexican-American bilingual children in the early elementary grades. (Texas Tech University, 1976-12) Morales, Edward S
Not available
Item An investigation of the optimal test design for multi-stage test using the generalized partial credit model (2010-12) Chen, Ling-Yin; Dodd, Barbara Glenzing; Borich, Gary D.; Whittaker, Tiffany A.; Davis, Laurie
Although the design of multistage testing (MST) has received increasing attention, previous studies mostly focused on comparing the psychometric properties of MST with those of CAT and paper-and-pencil (P&P) tests. Few studies have systematically examined the number of items in the routing test, the number of subtests in a stage, or the number of stages in a test design to achieve accurate measurement in MST. Given that none of the studies have identified an ideal MST test design using polytomously-scored items, the current study conducted a simulation to investigate the optimal design for MST using the generalized partial credit model (GPCM). Eight different test designs were examined on ability estimation across two routing test lengths (short and long) and two total test lengths (short and long). The item pool and generated item responses were based on items calibrated from a national test consisting of 273 partial credit items. Across all test designs, the maximum information routing method was employed and maximum likelihood estimation was used for ability estimation. Ten samples of 1,000 simulees were used to assess each test design. The performance of each test design was evaluated in terms of the precision of ability estimates, item exposure rate, item pool utilization, and item overlap. The study found that all test designs produced very similar results.
Although there were some variations among the eight test structures in the ability estimates, results indicate that the performance overall of these eight test structures in achieving measurement precision did not substantially deviate from one another with regard to total test length and routing test length. However, results from the present study suggest that routing test length does have a significant effect on the number of non-convergent cases in MST tests. Short routing tests tended to result in more non-convergent cases, and the presence of fewer stage tests yielded more of such cases than structures with more stages. Overall, unlike previous findings, the results of the present study indicate that the MST test structure is less likely to be a factor impacting ability estimation when polytomously-scored items are used, based on GPCM.
Item Selection of factors for prediction of scholastic success in introductory food and nutrition courses (Texas Tech University, 1964-05) Foree, Sherrell Bell
Not available
Item Self-regulation in L2 oral narrative tasks performed by adult Korean users of English (2001-08) Kim, Young-Woo; Garza, Thomas J.; Schallert, Diane L.
When second language (L2) users experience difficulty in performing a task in English, they often engage in efforts to overcome their difficulties through strategic behaviors aimed at achieving the goals of the task. If those efforts take the form of, or are accompanied by, verbal expressions, these verbal expressions are often referred to as private speech, and their function described as self-regulatory, by second language researchers taking a Vygotskyan perspective. In this study, these claims were inspected and re-defined by linking a Vygotskyan perspective on self-regulation with a metacognitive perspective. Eight Korean graduate students enrolled in a U.S. university participated in this study.
They were videotaped as they performed two narrative tasks, one using a series of pictures that had no words and a second, a recall task in which they watched a movie clip and retold the story they had seen. They were also interviewed as they watched their narrative performance. During the interview, they provided their thoughts on using English and on engaging in self-regulatory behaviors. Their utterances and gestures in the narrative tasks were recorded, transcribed, and analyzed. The interviews were recorded, partially transcribed, and analyzed. Results and discussion included the finding of support for previous studies that L2 users’ private speech functions as a self-regulatory process and plays an important role in the process. There were also findings that revealed limitations in explaining L2 users’ self-regulatory behaviors from a simple Vygotskyan conception of private speech. Several theoretical concepts from a more general metacognitive perspective, including aspects that refer to contextualization and frame, were effective in explaining the social context in which L2 self-regulatory behaviors occur. Theoretical and practical implications of the results of this study and possible future research topics are also addressed.
Item Strategies for controlling testlet exposure rates in computerized adaptive testing systems (2003-05) Boyd, Aimee Michelle; Dodd, Barbara Glenzing
Exposure control procedures in computerized adaptive testing (CAT) systems protect item pools from being compromised; however, this impacts measurement precision. Previous research indicates that exposure control procedures perform differently for dichotomously scored versus polytomously scored CAT systems. For dichotomously scored CATs, conditional selection procedures are often the optimal choice, while randomization procedures perform best for polytomously scored CATs.
CAT systems modeled with testlet response theory have not been examined to determine optimal exposure control procedures. This dissertation examined various exposure control procedures in testlet-based CAT systems using the three-parameter logistic testlet response theory model and the partial credit model. The exposure control procedures were the randomesque procedure, the modified within .10 logits procedure, two levels of the progressive restricted procedure, and two levels of the Sympson-Hetter procedure. Each of these was compared to a baseline no exposure control procedure, maximum information. The testlets were reading passages with six to ten multiple-choice items. The CAT systems consisted of maximum information testlet selection contingent on an exposure control procedure and content balancing for passage type and the number of items per passage; expected a posteriori ability estimation; and a fixed length stopping rule of seven testlets totaling fifty multiple-choice items. Measurement precision and exposure rates were examined to evaluate the effectiveness of the exposure control procedures for each measurement model. The exposure control procedures yielded similar results for measurement precision within the models. The exposure rates distinguished which exposure control procedures were most effective. The Sympson-Hetter conditions, which are conditional procedures, maintained the pre-specified maximum exposure rate, but performed very poorly in terms of pool utilization. The randomization procedures, randomesque and modified within .10 logits, yielded low maximum exposure rates, but used only about 70% of the testlet pool. Surprisingly, the progressive restricted procedure, which is a combination of both a conditional and randomization procedure, yielded the best results in its ability to maintain and control the maximum exposure rate and it used the entire testlet pool.
The progressive restricted conditions were the optimal procedures for both the partial credit CAT systems and the three-parameter logistic testlet response theory CAT systems.
Item Teaching in a high stakes testing environment: one teacher's practices and perspectives (Texas Tech University, 2002-08) Graves, Ingrid Smoker
The study was situated in an elementary school located in an urban setting in the southwestern part of the United States. The school serves a low SES minority neighborhood of predominantly Hispanic students who have traditionally not scored well on standardized wide-scale achievement tests. Often teachers in this setting engage their students in test preparations which reflect a skill-based approach to literacy teaching as opposed to a focus on higher level thinking and a transactional approach to literacy instruction. The teacher who is the focus of this study embraces a learner-centered view of instruction. She aims to promote critical thinking among her students through the reading of good literature, engagement in activities that build on students' knowledge base, and promotion of learner constructed knowledge. Her expectations for student accomplishment do not endorse a deficit view of children's intellect and potential.
The results of the study suggest that teachers who facilitate students' knowledge development rather than engaging in test preparations, along with principals who support teachers in the "literacy of thoughtfulness," can help minority students score well on standardized achievement tests.
Item The effect of familiarity with the examiner on WISC-R verbal, performance, and full scale scores (Texas Tech University, 1980-05) Irons, Donna A
Not available
Item The effects of mental imagery activities on early concept acquisition (Texas Tech University, 1978-08) Oxford, Patricia Ann Hoy
Not available
Item The Minnesota spatial relations test in relation to a hand-skill test, interest inventory areas, an intelligence test, and certain environmental data (Texas Tech University, 1954-08) Hoey, Robert James
Not available
Item Validity and reliability estimates of the interpersonal relationship scale (Texas Tech University, 1976-08) Kratzke, Jeanette Kay
The study reports on the refinement of the Interpersonal Relationship Scale (IRS), a new semantic differential measure for assessing interpersonal relationship adjustment. The 120-item scale was designed for use in marital and other dyadic relationships. Despite widespread criticism of the concept of adjustment, the study proceeds from the position that a new measure which is theoretically grounded, relevant, valid, and reliable is necessary since marital and other relationships continue to be researched. In this study, predictive, content, concurrent, and construct validity as well as high scale reliability are reported utilizing a sample of 94 counseling and non-counseling subjects. It is concluded that the IRS is instrumental in assessing interpersonal relationship adjustment, but some methodological issues still remain unresolved.