Assessing for Quality

By Richard L. Starcher

Educational assessment is a hot topic in the United States these days. Politicians and accrediting councils, as well as educators, are fueling the debate. This attention to assessment is both good and bad. It is good that so many people genuinely care about quality in education. However, the conversation is also generating a great deal of negative energy in the form of confusion, frustration and even anger, largely because not all participants are adequately informed.

Of course, assessment is nothing new. Teachers have always assessed, be it in a modern school or in a traditional African village. What is "new" is an intensified and generalized interest in producing evidence of quality in America's schools and colleges.

I recently received an email forwarded by a colleague that brought to the fore the same concern in the African context. A chief academic officer was writing to his academic department heads about engaging in assessment to produce evidence of academic quality. The message complained of a trend among faculty members to replace final exams with other forms of assessment. It then lauded the value of final examinations for achieving an overview of the subject and for evaluating and maintaining academic quality through the use of external moderators. The communication concluded by mandating final examinations in all courses.

In the spirit of cordial, "academic" debate and with a view to improving the quality of theological education in Africa, I'd like to suggest an alternative view of assessment that may or may not include "traditional" final examinations and certainly would not impose them as the only sanctioned assessment tool. I pray my views will generate thoughtful discussion on assessment "for learning" (not merely "of learning"), as well as on how best to foster transformational learning in African theological colleges.

Current Thinking on Assessment

What is most striking in the above communication is the absence of any direct reference to what students actually learn. Recent literature consistently and inextricably links assessment to learning outcomes (Wiggins, 1998; Walvoord & Anderson, 1998; Dick, Carey & Carey, 2001; Huot, 2002; Wiggins & McTighe, 2005; Wormeli, 2006). Further, much of the literature casts considerable doubt upon the efficacy of the traditional final exam in providing an accurate measure of learning. Williams (2006, p. 107), for example, bluntly states, "The closed book, invigilated final examination has become an anachronism. Most significantly, it is an assessment instrument that does not assess deep conceptual understanding and process skills. Indeed, the anecdotal evidence one often hears from students is that 'cramming' the night before amounts to 'data dumping' on the day, with little knowledge retention thereafter." Hence, what is intended as a means of assuring quality (i.e., mandated, invigilated, externally moderated final examinations) actually undermines true excellence in higher education.

A Working Definition of Assessment

The literature contains multiple definitions of assessment. My discussion of the topic is limited to student assessment (as opposed to institutional, program or teacher assessment). I use the term interchangeably with its twin "evaluation." I see it as related to but distinct from testing or grading. Interestingly, Walvoord and Anderson (1998, p. 1) use the term grading to describe almost exactly what I mean by assessment. "When we (the authors) speak of grading, we are not referring to a process of merely bestowing isolated artifacts or final course marks.... [It] includes tailoring the test or assignment to the learning goals of the course, establishing criteria and standards, helping students acquire the skills and knowledge they need, assessing student learning over time, shaping student motivation, feeding back results so students can learn from their mistakes, communicating about students' learning to the students and to other audiences, and using results to plan future teaching methods. When we talk about grading, we have student learning most in mind."

The Place of Assessment

Effective assessment must be considered in the context of instructional design. Unfortunately, many (if not most) instructors start designing instruction (be it a lecture, a course, or even a program of study) with the content to be covered. If they are teaching a standardized syllabus they might ask themselves, "How do I go about covering everything in this syllabus in just 12 short weeks?" Even if they have the liberty to write their own syllabus they might start by asking, "What content should I put in my syllabus?" However, this approach almost inevitably leads to an overemphasis on content coverage at the expense of truly transformative learning. Further, assessment often becomes little more than an afterthought. (By the way, the same pitfalls await those designing entire programs of study, but instead of starting with a syllabus they start with a list of courses to be taught.)

Grant Wiggins and Jay McTighe, noted American experts on assessment and instructional design from whom I will borrow heavily, propose an alternative approach. It involves a three-step process called "backward" design: 1) identify desired results, 2) determine acceptable evidence, and 3) plan learning experiences. It is backward "from the perspective of much of habit and tradition in our field. A major change from common practice occurs as designers must begin to think about assessment before deciding what and how they will teach" (Wiggins & McTighe, 2005, p. 19).

An Understanding of Understanding

I will return later to the three steps in the backward design process, but I first want to introduce another important concept; namely, "understanding." Understanding is not the same thing as knowledge. All good teachers intuitively grasp the difference. Good teachers are never satisfied with students' ability to recite facts. They want them to do something with that acquired knowledge. Wiggins and McTighe (2005, p. 37) define an understanding as "a mental construct, an abstraction made by the human mind to make sense of many distinct pieces of knowledge."

This definition recalls Benjamin Bloom's foundational work on higher levels of cognition in his 1956 Taxonomy of Educational Objectives, wherein he proposed six levels of critical thinking: 1) knowledge, 2) comprehension, 3) application, 4) analysis, 5) synthesis and 6) evaluation. In addition to his work in the cognitive domain, Bloom (Krathwohl, Bloom & Masia, 1964) proposed a five-level taxonomy of educational objectives for the affective domain: 1) receiving, 2) responding, 3) valuing, 4) organization, and 5) characterization by a value or value complex. Anita Harrow (1972) proposed a six-level expansion of Bloom's taxonomy into the psychomotor domain: 1) reflex motions, 2) basic-fundamental movements, 3) perceptual abilities, 4) physical abilities, 5) skilled movements, and 6) non-discursive communication.

An explication of these taxonomies exceeds the scope of the topic at hand. I cite them here primarily to note that school examinations most commonly target only the lowest levels of the cognitive domain and largely ignore the affective and psychomotor domains altogether. However, when we reflect on the educational objectives of our theological colleges in Africa, we see clearly that they far exceed knowledge, comprehension and even application. In fact, our most important educational objectives are not cognitive at all. Instead, they target beliefs, values and behavior.

Assessment and Educational Objectives

The first step in Wiggins and McTighe's instructional design process is "identifying desired results." Instructors ask themselves, "What should students know, understand, do and value at the end of the instructional unit" (be it a lesson, a course or an entire program of study)? Following Dick, Carey and Carey (2001), I prefer thinking in terms of four domains (instead of the classic three mentioned above), namely: 1) verbal information (i.e., recitation of facts), 2) intellectual skills (e.g., forming concepts, applying rules and solving problems), 3) psychomotor skills (generally coordinated mental and physical activity), and 4) attitudes (e.g., right choices or decisions). Of course, not every lesson will seek to achieve all four types of objectives, but it is worthwhile to consider the importance of all four in the planning process.

Even at this stage of the instructional design process it is imperative to "think like an assessor" (Wiggins & McTighe, 1998, p. 12) because educational objectives must be measurable to be operational. However, let me quickly add that I'm not using the term "measurable" here in a strictly quantitative sense. Rather, I mean the instructor must have some way of knowing whether or not the objective has been achieved. For example, one educational objective in a course on homiletics might be, "Students will demonstrate the ability to utilize stories to illustrate a biblical or theological truth." This is only a valid course objective if the instructor conceivably can know whether or not students have achieved it.

When setting educational objectives and planning assessment activities it is helpful to establish curricular priorities. If we stop to think about it, we quickly realize not everything we present in class is of equal importance. Wiggins and McTighe (2005, p. 71) frame this distinction in terms of: 1) that which is "worth being familiar with," 2) that which is "important to know and do," and 3) "big ideas and core tasks".

For example, in a course on biblical hermeneutics I may want students to "be familiar with" the history of biblical interpretation from pre-Christian times through the modern era, but I won't really be terribly disappointed if they can't recall the seven rules of Hillel five years after the end of the course. What's "important to know" is the significance of the historical, cultural and literary contexts for arriving at a valid interpretation of a biblical text. I will be disappointed if they've forgotten this lesson even twenty years after the end of the course. However, at the end of the day, the "core task" is for students to demonstrate mastery of the hermeneutical principles they have learned by actually interpreting a biblical text. Further, the "big idea" is for them to leave the course so committed to the integrity of the word of God that they will never deliberately twist its meaning for their own purposes.

This prioritization of educational objectives has significant implications for assessment because teachers tend to test for what is easy to test rather than for what is truly important. We need to get beyond "thinking of assessment as a means of generating grades" (Wiggins & McTighe, 2005, p. 148) and view it as evidence of the achievement of our educational outcomes. "In effective assessments, we see a match between the type or format of the assessment and the needed evidence of achieving the desired results. If the goal is for students to learn basic facts and skills, then paper-and-pencil tests and quizzes generally provide adequate and efficient measures. However, when the goal is deep understanding, we rely on more complex performances to determine whether or not our goal has been reached" (ibid., p. 170).

Assessment Types

Many of our educational institutions recognize two basic types of assessment: continuous and final. However, Wiggins and McTighe's four-fold typology is more helpful in determining acceptable evidence of the achievement of educational objectives. It includes: 1) informal checks for understanding, 2) quiz and test items, 3) academic prompts, and 4) performance tasks.

  1. Informal checks

    It is important not to think of assessment wholly in terms of written assignments or tests we can grade. Good teachers assess constantly by various means, including "questioning, observations, examining student work and think-alouds" (Wiggins & McTighe, 2005, p. 153). These methods are particularly helpful in assessing students' attitudes and beliefs. While many teachers naturally practice informal checking, all teachers should make it an intentional assessment strategy. Such observations coupled with one-on-one debriefing sessions can be a very effective means of helping students grow in the areas most commonly neglected in formal theological education.
     
  2. Quiz and test items

    This common type of assessment consists of "simple, content-focused items that: 1) assess for factual information, concepts and discrete skill, 2) use selected response (e.g., multiple choice, true-false, matching) or short answer formats, 3) are convergent, typically having a single best answer ... and 4) are typically secure (i.e., items are not known in advance)" (ibid., p. 153). It is most useful for assessing that which is worth being familiar with. This is the form of assessment for which students most commonly cram the night before, forgetting the things they memorized as they exit the exam venue.
     
  3. Academic prompts

    Academic prompts facilitate assessment of higher levels of cognition. They commonly involve "open-ended questions or problems that require students to think critically, not just recall knowledge, and to prepare a specific academic response . . . Such questions or problems: 1) require constructed responses to specific prompts under school and exam conditions, 2) are 'open,' with no single best answer . . . , 3) involve analysis, synthesis, and evaluation, 4) typically require an explanation or defense of the answer given and the methods used.... 5) may or may not be secure, 6) involve questions typically only asked of students in school" (Wiggins & McTighe, 2005, p. 153).

    While academic prompts can be used under traditional exam conditions (i.e., essay exams), they also can take the form of interactive reading assignments, research papers, book reviews, observation reports, oral presentations and other relatively complex cognitive tasks. Because the "prompts" (or questions) have no single correct response and require analysis, synthesis and evaluation, "exam security" need not be an issue. Nevertheless, they remain "artificial" in that they do not assess real life performance. If used under traditional exam conditions they represent an improvement over so-called "objective" tests but still remain subject to some of the same pitfalls (e.g., cramming, data dumping and minimal retention). Further, they tend to target almost exclusively the cognitive domain.

    I should insert a word here about supposedly objective exam questions (e.g., multiple-choice). Some instructors and administrators favor them as a more accurate measure of learning because they easily produce a numeric score. However, as Walvoord and Anderson note, they are far from objective. "The selection of items, the phrasing of questions, the level of difficulty – all these judgments are made by the teacher according to circumstances" (1998, p. 11). Further, they favor students with certain learning styles while penalizing others. Good judgment rather than objectivity should be our goal. "Your job is to render informed and professional judgment to the best of your ability. You will want to establish the clearest and most thoughtful criteria and standards that your professional training can supply" (ibid.).

  4. Performance tasks

    "Understanding is revealed in performance. Understanding is revealed as transferability of core ideas, knowledge, and skill on challenging tasks in a variety of contexts. Thus, assessment for understanding must be grounded in authentic performance-based tasks" (Wiggins & McTighe, 2005, p. 153). In an "ideal" learning environment, performance tasks would be actual activities carried out in the real world (e.g., preaching a sermon in a local church). However, in a school setting instructors often aim for replication or simulation. Performance tasks can take various forms (e.g., projects, portfolios, simulations) but they share certain common traits.

    1. They replicate "the ways in which a person's knowledge and abilities are tested in real-world situations" (Wiggins & McTighe, 2005, p. 154).
    2. They replicate "key challenging situations" in which field workers are "truly 'tested' in the workplace, in civic life, and in personal life" (ibid.).
    3. They require students to use "knowledge and skills wisely and effectively to address challenges or solve problems that are relatively unstructured" (ibid.).
    4. They allow instructors to assess "the student's ability to efficiently and effectively use a repertoire of knowledge and skill to negotiate a complex multistage task" (ibid.).

Let me illustrate the performance task approach with a "real" classroom situation. For a number of years I have taught a course entitled Principles of Teaching. One of the course objectives is for students to be able to employ effectively the theories, principles and practices discussed and demonstrated in class (learning theory, creative teaching methods, etc.). Hence, the course's final assessment piece is a series of simulated teaching sessions in which small teams of students teach their classmates. Those not teaching are required to evaluate the lesson against the theories, principles and practices discussed and demonstrated in class. The assessment replicates a challenging, real-world situation and requires students to utilize the truly important understandings gained through the course as they use a repertoire of knowledge and skill to negotiate a complex, multistage task.

Now, the exercise described above is not the only assessment instrument employed in this course. I also use reading reports to check student understanding of basic principles. (Quizzes might serve the same purpose.) I use journals to assess growth in understanding of and appreciation for important course themes. I use observation reports to assess higher levels of cognitive achievement (analysis, synthesis and evaluation). I use self-evaluation reports to assess students' commitment to practicing principles of good teaching. Nevertheless, the course "final" is a performance task because it best assesses the achievement of essential course objectives. Replacing this task with a traditional paper-and-pencil examination would devalue truly important understandings, unduly elevate trivial ones, and contradict the very principles of effective teaching and learning the course seeks to communicate.

Assessment and Grading

As Walvoord and Anderson (1998, p. 9) observe, "grading is deeply embedded in higher education." It is the coin of the educational realm. Students labor intensely for the reward of good grades, often losing sight of learning in the process. While teachers might bemoan this situation, they likely will not have the capacity to change it. The wiser course might be to harness the power of grades by aligning them to important educational objectives. Hence, the weighting of assignments and other evaluative activities should reflect the importance of the "understandings" assessed. In other words, a final assessment piece worth 40 to 60 percent of a student's grade should do more than test (short-term) memorization of discrete facts. It should demonstrate mastery of essential cognitive, affective and psychomotor skills.

Conclusion

Assessment is important. It should provide evidence of quality in higher theological education. However, it is crucial to frame quality in terms of learning objectives and to use assessment tools befitting those objectives. Theological education's objectives include the retention of facts, intellectual skills and coordinated mental and physical activity, as well as beliefs, values, choices and decisions. Hence, our theological colleges need to utilize a variety of assessment tools, favoring those apt to assess essential, enduring understandings.

It is disheartening at times to note an abiding attachment in African higher education to practices inherited from the colonizers that the colonizers themselves have since abandoned. Africa has a rich educational heritage that predates the colonial era. Traditional African educators knew to assess learning by means of informal checks, observations and performance tasks. While invigilated, externally moderated, pencil-and-paper examinations may be an appropriate assessment instrument for some learning objectives in an institutionalized educational setting, they certainly are not the best summative assessment tool in all cases. Both traditional African wisdom and contemporary research favor performance-based assessment and other alternative means of evaluating student learning.


REFERENCES

Bloom, B. S. (Ed.). (1956). Taxonomy of educational objectives: Classification of educational goals. Book 1: Cognitive domain. New York: Longman.

Dick, W., Carey, L., & Carey, J. O. (2001). The systematic design of instruction (5th ed.). New York: Longman.

Harrow, A. J. (1972). A taxonomy of the psychomotor domain: A guide for developing behavioral objectives. New York: D. McKay Co.

Huot, B. (2002). Toward a new discourse of assessment for the college writing classroom. College English, 65(2), 163-180. Retrieved 31 October 2006, from www.jstor.org.

Krathwohl, D. R., Bloom, B. S., & Masia, B. B. (1964). Taxonomy of educational objectives. Book 2: Affective domain. New York: Longman.

Walvoord, B., & Anderson, V. J. (1998). Effective grading: A tool for learning and assessment. San Francisco: Jossey-Bass.

Wiggins, G. (1998). Educative assessment: Designing assessment to inform and improve student performance. San Francisco: Jossey-Bass.

Wiggins, G., & McTighe, J. (1998). Understanding by design. Alexandria, VA: ASCD.

Wiggins, G., & McTighe, J. (2005). Understanding by design (2nd ed.). Alexandria, VA: ASCD.

Williams, J. B. (2006). The place of the closed book, invigilated final examination in a knowledge economy. Educational Media International, 43(2), 107-119.

Wormeli, R. (2006). Accountability: Teaching through assessment and feedback, not grading. American Secondary Education, 34(3), 14-27.


FURTHER READING ONLINE