How much is too much?

Drawing the line on AI-assistance.

Jan 17, 2023

This letter discusses a bound on AI assistance to academic work.1

In a recent internal video memo, Susan McCahan, Vice Provost at the University of Toronto,2 raised a number of questions about the impact of new AI-tools. I am picking up on two central ones:

How much assistance from an AI system is too much? What constitutes academic misconduct?3

We’ll discuss the question of AI assistance here. The question of AI co-authorship is related and we’ll focus on that in the next post; academic integrity and academic misconduct will be covered after that.

Let’s clarify first what we are talking about: we’ll use the term submitted work for any work – text, video, audio, performance, or really anything – that a student submits for assessment; assessment here means an evaluation by anyone – lecturers, peers, review boards and committees, even the students themselves – with reference to the educational objectives.

As McCahan points out, the question is not exactly new: homework help from paid tutors is a reality, albeit not at the scale that we will see once ChatGPT comes to class in everyone’s back pocket. You might think we are far from that point – but we surely won’t disagree on the need to be prepared. All the more, since this gives us an opportunity to examine some of the foundations of the academy. I hope that you are quickly convinced that that is necessary since:

Assistance from an AI system is too much when it interferes with the educational objectives, or the assessment of a submitted work.

Giving meaning to this statement needs clarity on both: objectives and assessment.

Educational objectives

One way in which AI tools can become problematic is when they block the precious opportunity of beginning an enquiry with one’s own first-thought. Of course, this is not only a problem of AI-tools: Albert Einstein once remarked that anyone “who reads too much […] falls into lazy habits of thinking”,4 and this is especially true of premature reading, reading before one’s own mind has had a chance to engage with the problem. In fact, I think this awareness is such an essential skill to hone, as we are going to rely more and more on the suggestively customized, synthesized answers to our queries, that I will introduce a convention to these newsletters: pause - think. The icon below will alert you that some list, hypothesis or other conceptual framework is going to follow, and you might want to engage with the question before you encounter the answer: pause, think, perhaps note down your answers and only then have a look – because if you don’t, it may become hard to question the premises that went into a particular approach.

Framing education in terms of specific objectives has a long history.5 To structure our argument, we need to define some categories that we can refer to. Let us list the different dimensions of educational objectives:

Educational objectives: dimensions and key concepts –

Knowledge: this is the domain of remembering facts and concepts, their context and relationships, and how different perspectives affect them;
Skills: this is the domain of abilities, including physical abilities, habits like documentation and time-management, and creativity;
Thinking: this is the domain of reasoning that includes abstraction, categorization and symbolic manipulation; comparison, with its related domains of evaluation and judgement, and structuring and planning; logic, and the integration of these in problem-solving and decision-making;
Appreciation: this is the domain of aesthetics, the ability to engage with the arts, the ability to bring ideas into harmony, the ability to recognize and appreciate quality, and the joy of a reflected life;
Values: this is the domain of ethics, motivation, courage, and integrity;
Socialization: this is the domain of relationality, including communication, collaboration, empathy, respect, and understanding of diverse cultural norms and customs.

Pursuing Education

The kind of holistic education we pursue at an institute of higher education covers all of these dimensions to some degree, either explicitly or implicitly. Many of these require effort and practice, and taking shortcuts may prevent achievement. Mind you, we are not talking (only) of performance – performance could indeed be outsourced to a tool, but we are talking about personal growth, or its incarnation as learning. Of course, we will have different methods to learn with AI assistance, and of course this may ultimately lead to education at a higher level for the same investment of effort and time.6 But we need to be explicit about what exactly we mean when we claim that AI interferes with our objectives. A starting list could be conjugated through the dimensions we listed above:

Knowledge acquisition needs structure and engagement. In order to become available to thought, the facts of knowledge must be connected to other facts, and their retrieval must be practiced. The crucial part is the formation of associations: understanding is an aesthetic phenomenon of harmonizing associations. Above, I mentioned the importance of priming a framework of understanding with a first-thought, but consumption of synthesized solutions, rather than struggling with their construction, removes a crucial moment of practice, prevents the formation of associations, and prevents the engagement that arises, for example, when we encounter ambiguities and resolve them on our own – or accept them. This is emphatically not to say that the AI-tools are useless for knowledge discovery and learning; on the contrary: the amazing capabilities of summarization help us to focus our efforts, and the algorithm’s patience in clarifying required background knowledge in the role of a personal tutor can significantly speed up comprehension. But we need to emphasize that these ought to be supportive roles, and that the availability of answers cannot substitute for learning to answer. “Too much assistance” comes too early, and substitutes for engagement.
Skills require practice. Here too, there is enormous potential for personalized instruction: bespoke training plans, and practice problems; though the technical solutions are not yet quite ready, this is now an engineering problem which will be solved either by specialized interfaces to the existing AI that can capture performance, identify weaknesses, and suggest practice regimes, or by more generalized AI that will be able to perform in such a supporting role as part of its generic abilities. Soon. “Too much assistance” prevents practice, by substituting product for process.
Thinking requires tools. In a generic sense we can think of strengthening thinking ability to involve acquiring and honing patterns of mental manipulation: look for generalizations, consider implications and entailments, compare features and contrast their values, and more. The goal is often intuition, rather than precision – here, the analogy to the pocket calculator is simplistic but illustrative: we would not compute a square root by hand, but we should be troubled if we were not able to estimate it. Such intuition is crucial for critical thinking: the refusal to accept pronouncements on the basis of authority, and to require being persuaded by evidence instead. “Too much assistance” makes results available that are not thoroughly understood and evaluated like we would evaluate our own thoughts.
Appreciation requires effort. The subtleties of a work of art are not apparent before we spend some time with it. Understanding quality will not come without having seen a lot, heard a lot, read a lot. The joy of insight is gained, not given. “Too much assistance” removes effort, denies opportunities, and thereby diminishes a sense of achievement.
Values require authenticity. Taking a determined ethical position requires connecting the matter at hand to a personal sense of right and wrong, proper and improper. There are no universal formulas, every situation is different, and we ourselves change over time. But if we are unable to find positions that we may properly call our own, and then substitute ideology where value pluralism is required, our inauthentic behaviour becomes a lived lie. Letting the AI decide on our behalf is such a trap. “Too much assistance” produces value judgements that are not based on our own convictions, and are therefore meaningless.
Socialization requires exposure. To the degree that we are social beings, our very being is constructed via socialization. Substituting a non-human “mind” is bound to have an impact. That said – we simply cannot know yet whether this is bad, or neutral, or even good. It would be a mistake to take a statement like “AI diminishes our social dimension of being” and make it a premise for an argument.7 Nevertheless, as long as we feel that the various aspects of socialization are valid educational objectives in and of themselves, then “too much assistance” is assistance that moves learning into a relational vacuum.

Mind you, we could expand this list a lot, though we might commit the same sin that we are accusing the AI of: enumerating aspects of a problem domain without regard to the inner coherence of our enumeration. I think the key take-aways are already apparent: “Too much assistance” is that which interferes with educational goals because it diminishes engagement and practice. “Too much assistance” is a substitution of a product for a student’s own, meaningful process. “Too much assistance” fails to become an expression of the self.

The construction of learning objectives against this background requires insight and imagination.8

But do students even want such an education?

Well … yes – but. The fact that we even need to have this discussion points to dislocations that shape our reality. Ideally, we would expect students who are investing very significantly into their education to fight tooth and nail to get the most out of their investment, i.e. to reject for themselves and in their own best interest any use of AI that is “too much”. In reality, however, many are concerned precisely with a metric (good grades) and not the content (education); they are locked into a transactional model (work for recognition) which perpetuates a teacher / learner dichotomy, and they are disillusioned about how educational achievements translate into a better life. A certain academic structure – of which we are collectively a part – becomes a culture. AI only illuminates this, though this time there is no easy fix, like changing exam parameters, or engineering modes of assessment.9 And that brings us to the other side of the coin: assessment.

Assessing achievement

If we have discussed above why and where to draw the line, we must ask next how to recognize when it has been crossed. Can we still assess achievement if we cannot distinguish the student’s contribution?

Those of us who realize that ChatGPT can in general pass our courses have probably also identified the root cause by now. We assess achievement through proxy measures: the recall of isolated facts substitutes for knowledge, the eloquence of expression substitutes for sophistication of understanding, and the volume of topical writing substitutes for depth.10 We have important reasons to do that, but all three measures are failing us completely – or rather: they can be achieved by a readily available algorithm, for free. They have lost their value.11

It is not that we don’t know that we ought to be assessing quality instead. But quality cannot be quantified, nor can it be decomposed. Assessing quality is hard – and generally does not scale. However, a commitment to scale education for society is a cornerstone of the post-industrial university; the commodification of assessment plays a central role, in teaching, as well as in certifying the outcomes as a basis for career-forming qualifications. This cannot simply be removed.

Two strategies summarize the immediate reactions we have seen: we can make assignments AI-proof, or we can attempt to distinguish AI-contributions from human contributions, and only assess the latter.

Approaches to AI-proof assignments include: to assess only in supervised environments, to make assignments highly specific to in-class discourse, to base assignments on the interpretation or generation of images, and to request verifiable references.12 Some of these will be vulnerable to soon-to-be-expected improvements in the tools. But a more fundamental problem is that such assessments become increasingly disconnected from the conditions under which students will work outside the academy.

As for approaches to distinguish AI-contributions, the more I know about this topic the more skeptical I am. We read suggestions to collect a sample of “authentic” student writing for stylistic comparison, or we see reports of tools that are supposed to be able to identify AI-writing13 – or other quick-fix approaches to structural problems. I believe: the only robust approach to identify AI-contributions is to ask students to voluntarily disclose them. And students will only disclose them when there are no negative consequences, or when we even incentivize disclosure.

But is there not a third way? Are we not again focussing on the wrong thing, when we try to detect the hand rather than evaluating the contents? How about simply assuming all submissions have their GPU ghostwriters and following the idea instead? If a student can make a high-quality idea their own, whether it was written by the AI or taught in class, is this not what we call learning?

A new alliance

Thus, an approach that at first may sound radical is to take the decision “how much is too much” out of the institution and put it into the hands of the student. This is not just because we ought to be sceptical about our ability to recognize, let alone assess, any boundaries. It is, above all, because what is “too much” is different for different people, and at different times, and in different contexts. If we nurture the sensibility of each student about the value that self-improvement has to them, learning becomes an expression of the self, and not merely positioning oneself in an escalating competition.

It is not that we don’t understand what needs to be done; we have just not applied it to our own teaching, because there are technical and generational barriers involved, and we are lacking creative models and incentives. Students spend millions of hours each day with their gaming consoles, not with learning – because we are not making education equally exciting. Online Twitch streams have tens of thousands of viewers while we are concerned about diminishing class attendance – because our content is less engaging. TikTok trends reach hundreds of millions while we worry about course enrolment – because we neglect the social rewards of collective achievement. Influencers change the discourse of a whole generation while our values are fading into irrelevance – because we have not learned to harness the power of role-models. We will need to work on all of that.

Of course, motivation will not replace assessment. The academy’s stewardship includes upholding standards of excellence and integrity. But such stewardship cannot substitute for enabling students to make their own decisions responsibly on their own. What we need to do, however, is to assess in a way that correlates with the joy of learning and the quality of achievement, and to do this in an alliance with our students.14

Policy

In the end, we need to translate this long exploration into policy. The devil has a way of getting into the details – but if we can’t follow up with policy, and policy with teeth if need be, it is all just opinion.

What could a policy for AI-contributions look like?

Here is an example:15

The use of AI tools in the preparation of submitted work is permitted without restrictions and will not lead to deduction of marks, if it is properly documented.
For the purpose of assessment, an academic submission in its proper sense is that part of a work that surpasses the capabilities of algorithmic writing. Performing at this level is a challenge, but as faculty we commit to work alongside you to help you succeed; this is not an issue that concerns any individual course, it is a challenge for your future.
You must be able to demonstrate your understanding and mastery of all submitted work – no matter what the relative contributions of AI-tools and your own ideas are. This will be rigorously assessed.
As well, you take full responsibility that your submitted work is factually correct and sourced according to our standards. Be aware of the risks of “fabrication” and other academic misconduct.

TLDR

Too long, did not read? “Too much AI contribution” is any part that diminishes our students’ achievement – and this needs to be evaluated across all dimensions of our learning objectives. Assessing submitted work in this context cannot be done without the students’ contribution. A new sense of alliance with students is needed, one that emphasizes how our assessment aims to protect their future. Concrete policies need to express the goals, concrete exemplars need to provide role-models of change.

If we are advocating the joy of learning, let that start with us.

Feedback, comments, and experience are welcome at sentient.syllabus@gmail.com .

Sentient Syllabus is a public good collaborative. To receive new posts you can enter your email for a free subscription. If you find the material useful, please share the post on social media, or quote it in your own writing. If you want to do more, paid subscriptions are available. They have no additional privileges, but they help cover the costs.

Cite: Steipe, Boris (2023) “How much is too much?”. Sentient Syllabus 2023-01-11 https://sentientsyllabus.substack.com/p/how-much-is-too-much .

Dr. McCahan’s portfolio includes Vice Provost of Innovations in Undergraduate Education, and Vice Provost of Academic Programs.

McCahan, Susan (2023). “Brief AI tool demo January 2023”. Video memo for University of Toronto Faculty, 2023-01-05.

In contrast to so many other Einstein quotes, this one is actually true: “Reading after a certain age diverts the mind too much from its creative pursuits. Any man who reads too much and uses his own brain too little falls into lazy habits of thinking, just as the man who spends too much time in the theater is tempted to be content with living vicariously instead of living his own life.” Einstein, Albert (1929) Interview conducted by George S. Viereck “What Life means to Einstein”. The Saturday Evening Post 1929-10-26 p. 113.

What I attempt here is a synthesis of elements, and the details will certainly merit more discussion. Wilhelm von Humboldt’s idea of placing self-improvement at the centre of “Bildung” is timeless, and to name just a few complementary directions, John Dewey’s thoughts about art, experience, and political competence are as fundamental as Paulo Freire’s goals of empowering critique through pedagogy. Thinking of mechanisms of how education works – perhaps between the poles of Lev Vygotsky’s social learning and Jean Piaget’s constructivist thoughts – provides important perspectives on process; modern theories of mind – to name Howard Gardner’s emphasis on multiple dimensions of intelligence as only one example – have done a lot to promote a differentiated view of the subtleties.

Although – we have to tread extremely cautiously here, due to the licence the algorithms (still) take with facts. Cf. ChatGPT's Achilles Heel.

Neurodiverse students in particular may have a decidedly different opinion on that point and may in fact prefer AI interactions to those with human educators. A contribution on the message-board platform Reddit described the experience of a CS student with ADHD, who found studying with the help of ChatGPT extremely helpful. While it is not possible to verify the claim itself, the comment was picked up on Twitter, retweeted hundreds of times within a day, and liked over three thousand times, demonstrating that the issue is hitting a nerve.

In the Sentient Syllabus Project, we are collecting concepts in the Learning Objectives document.

It really can’t be repeated often enough: the fundamental change is not that “cheating” in knowledge work has become more refined and accessible, it is that the methods are readily available to our students’ potential employers, to perform such knowledge work without pay.

Scholarship discusses these issues as “construct underrepresentation” (measures assess only part of a desired outcome), and “construct irrelevance” (measures correlate with factors that are unrelated to the desired outcome, such as family income, or the ability to cheat) – and they are at least as old as the Chinese civil examination system, established in the Sui dynasty around 606 CE. Cf. SUEN Hoi K, and YU Lan (2006) “Chronic Consequences of High-Stakes Testing? Lessons from the Chinese Civil Service Exam”. Comparative Education Review 50(1):46–65. https://doi.org/10.1086/498328 .

These proxy measures of education have not only lost their value as metrics of achievement, the actual value of the underlying aspects of education – knowledge, understanding and depth – has been damaged by association: it will become increasingly difficult for our students who enter the workforce to demonstrate the added value they bring to a task that is assessed by such proxy measures.

These items reflect some of the “first-generation” resources that were posted by the time of this writing: “Artificial Intelligence Writing” at the University of Central Florida; “ChatGPT Resources” at Texas Tech; and “Artificial Intelligence Tools and Teaching” at the University of Iowa; and, of course, our own resources at the Sentient Syllabus Project.

Hint: these don’t work. (a) detection must happen in an “adversarial environment”, in which human ingenuity is searching for ways to circumvent the detection algorithms and there are many ways to modify the AI text; (b) any measurable discrepancy between human text and AI-text immediately becomes an engineering target: after all, it is the stated purpose of language-models to produce natural sounding, human-like writing; (c) most significantly, in an assessment context, let alone in the context of an academic misconduct investigation, the false-positive rate must be indistinguishable from zero. If we cannot prove AI-authorship, it is probably not a good use of our resources to allege it, and we surely don’t want to foster a culture that implicitly rewards those who are most capable of keeping up their bald-faced denial.

To put this into practice is challenging, and to face this challenge is exactly why we founded the “Sentient Syllabus Project”.

This is a first draft, may be updated here as I can consider feedback, and will at some point enter the Sentient Syllabus Resources. Updates will be noted here.
