If you love your methods, let them go


When designing assessment, it’s problematic to focus too much on technology or to hold on too tightly to particular methods.

Assessment methods such as viva voce, pen and paper exams, practical exams, and authentic assessment* are currently being touted as more secure ways of testing student knowledge in the context of widely-available generative AI technologies (GenAI). There is, arguably, a place for each of these methods in our repertoire, but whatever methods are chosen need to make sense in relation to the learning goals for our particular students, at their particular levels in their particular discipline, and also in relation to the broader picture of what matters to learners, educators, employers and wider society. There is no point having a secure assessment that is valid for the wrong purposes, or that doesn’t align with the kind of education we care about. Invigilated pen and paper exams might, in many cases, be more secure than an essay, but they assess different kinds of knowledge. Criticisms of exams as focusing on simple and abstract knowledge are still relevant, even if we have lost confidence in essays.


Widely-available GenAI changes the nature of the problems we must negotiate in assessment, but it isn’t the core of most assessment problems. Other considerations of context (e.g., the needs pertaining to discipline and level of study, or to remote or on-campus modalities), purposes (what is education for) and values (what matters to us and our students) are at least as important. Note, too, that each time someone thinks of a way to AI-proof assessment, a new AI-based technology is released that can overcome it (e.g., newer versions of ChatGPT that can “do reflection” or produce convincing accounts of process; paraphrasing tools to beat Turnitin; plugins and apps to subvert online and, now, oral exams). Education is a series of fixes to problems created by previous fixes (see Dron, 2023).

This reflects an important problem with putting generative AI at the centre of our design gaze. If we make choices about assessment methods on the basis of which ones will be secure in the context of GenAI, we are likely to distort the purposes of assessment without creating stability. For example, critiquing AI outputs is rapidly gaining in importance, but it is certainly not the only, or even the most important, thing. Making a large proportion of assessments about this will skew what we value. It is also subject to the problem of fixing fixes: AI tools can critique themselves. The saving grace, here, is that new ways of skewing assessment may be no worse than the current skewing towards propositional knowledge and rote memory.

There is no point having a secure assessment that is valid for the wrong purposes, or that doesn’t align with the kind of education we care about.

Assessment methods are usually designed in relation to tasks and forms of knowledge that we have decided are important (even if that decision is based on practicality) in a particular disciplinary or interdisciplinary context. Perhaps, the availability of GenAI technologies is contributing to a shift in what kinds of knowledge and tasks are important, but this shift is neither simple nor easily predictable. Firstly, the fact that something can be done by technology does not mean it should no longer be done, or learned about, by humans. Secondly, it is not feasible to determine what tasks and knowledge will be relevant in the future, in part because the balance of relevant forms of knowledge changes over time and in relation to the contexts in which people operate.

On one hand, we should not change methods just to secure assessment against GenAI. Methods of assessment have been designed and refined over years, in relation to explicitly valued forms of knowledge. On the other hand, the kinds of knowledge that are valued are now being questioned, prompted by the wide availability of GenAI (or, as I would argue, the wide availability of GenAI is making it harder to ignore questions about knowledge that we should have been asking for many years now). We sometimes cling to methods because they are embedded in our traditions, cultures and systems, rather than because they actually work well for us. Collectively, we have gone to great lengths to protect essays, exams, lectures and other established ways of doing teaching and assessment. Perhaps, the threat of generative artificial intelligence is mostly a threat to familiar methods that are deeply embedded in our policies and practices, even if implicitly.

Perhaps, to counter the threat of GenAI, or, indeed, to be able to see it as anything other than a threat, involves being open to inventing new methods, and reshaping practices and policies. This is challenging; different methods imply different negotiations of standards and shared understandings. No wonder a change of methods is not undertaken lightly. But inventing may not mean creating entirely new methods – it may be that elements of familiar methods can be repurposed. Methods have always needed to be pieced together according to specific problems and situations. Summative assessment design is difficult, in part, because many factors need to be taken into account, including, but not limited to, standards, security, gatekeeping, alignment with the aims of the course, expectations of different stakeholders, cost, practical considerations, and implications for learning. It’s also difficult because each method is limited in terms of the kinds of learning it can evaluate. No method comes anywhere near being perfect for most educational aims. Therefore, holding too tightly to any method does not signify rigour. Indeed, it may signify a lack of attunement to context or purpose.

Perhaps, the threat of generative artificial intelligence is mostly a threat to familiar methods that are deeply embedded in our policies and practices, even if implicitly.

In my entangled pedagogy paper (Fawns, 2022), I wrote that “… the greater problem [than focusing too much on technology] may be where teachers themselves start with a method before sufficiently considering their own or their students’ purposes, values and contexts. Choices about technology, tasks, social configurations and resources are then restricted by what is possible within an already-constrained conception of teaching.” Rather than worrying that GenAI is ruining our methods (e.g. essays or exams), perhaps, we can see widely-available GenAI as part of the context that should be considered, alongside purposes and values, when deciding on, or designing, methods of teaching and assessment.

Being open to creative possibilities that are enabled by technologies seems, to me, to be a very good thing, as long as we are able to reconcile them with the kinds of tasks, knowledge, social and material configurations, and ethical considerations that we value. This, in turn, will be helped by making explicit what we value and why, and negotiating this with stakeholders. Neither prioritising nor ignoring GenAI, or any other technology, is the answer; what is required is a holistic, complex view within which we zoom in and out on assessment design. This complexity requires multiple kinds of expertise, which means that we cannot design as individuals. We will need to break out of silos and collaborate. As ever, I am left to conclude that more conversation is needed, but it is conversation in which: methods are held loosely; technology is in sight but not at the centre of our design gaze (or holding what colleagues and I, in Fawns et al. 2023, have called an ‘on yet around’ focus); we are actively interrogating and appreciating relevant contextual elements; and there is space to articulate and negotiate what is important to us and what we are trying to achieve. Simple.

In a future post, I will discuss why I think that programmatic and process assessments are promising avenues to pursue. However, I want to note that neither of these is a method or format; both are ways of organising what is assessed to create a richer picture. (* Authentic assessment is also not a method.) Methods that reflect programmatic and process principles still need to be designed, in relation to purposes, contexts and values. These principles reflect a move that has long been needed in education: away from content knowledge towards a focus on helping students negotiate the complexity of learning across complex contexts. Within this, perhaps, students should learn to critique, as a matter of course, not just the outputs of AI, but anything they produce, as well as the methods and practices through which they are produced. Perhaps, we educators and assessors should do the same.

Further reading and resources