OpenAI released ChatGPT nearly two years ago, in November of 2022. Which means that for the past two years, people have been debating the impact of ChatGPT and other large language models (LLMs) on education. Advocates extol the virtues of the technology and claim it will revolutionize education. Educators, on the other hand, have more often decried LLMs for their deleterious effects in the classroom. Teachers lament being reduced to nothing more than LLM correctors. They compare the use of LLMs to doping in sports. LLMs, they say, have caused cheating to run rampant. It is a crisis, and to some that crisis is existential for higher education. But I remain skeptical. Claims of technological revolution and technological doomerism alike tend to be overblown. Both overestimate the degree to which technology determines our cultural and social realities.
The writing of these beleaguered educators, however, reveals more than just their disdain for LLM tools like ChatGPT. They liken LLMs to giving students an answer key. But an answer key is only useful when the task is mere memorization. They lament the ease with which LLMs enable cheating, but cheating was already on the rise. They write about their seeming impotence in the face of overworked or grade-oriented students. They are correct on all counts. But a savvy reader may have already noticed that these problems are not caused by LLMs. They predate them. This mirrors my own experience as a teacher: LLMs worsened existing problems rather than creating new ones.
The American educational system already suffers from perverse incentives. Coursework already encouraged students to focus on getting good grades rather than mastering material, and overworked students already looked to save time and effort on their assignments. I was struggling to get my students to engage with the material through their writing well before LLMs. If I gave a detailed rubric, I got a pile of essays with sections ripped directly from it. Students rarely explored or played with concepts, and they always tailored the language and register of their writing to what they thought would sound most intelligent to me. And I’ve never seen so many people on the verge of an anxiety attack at once as when I announced to my classes that there wasn’t a grading rubric. My students’ fear of bad grades could make them downright frantic if they didn’t know how to tailor their work to the grading criteria.
And why wouldn’t they do this? They had at least three other classes aside from mine. And I taught social science to engineers, so those other courses were usually major requirements and thus higher priority and more time-intensive than mine. No matter their interest or desire to learn, my students were punished with poor grades in their other courses if they spent the time and energy to delve into my course material with genuine curiosity. Those who did usually had to scale back their effort by the time midterms came around. Their work had to be efficient and purposeful.
And can we really say the purpose of education is learning, or mastery? While engineering is especially egregious in this regard, it isn’t fundamentally different from most K-12 education. As soon as standardized testing starts, teachers must teach to the test or suffer the consequences of being a “failing school.” Shepherding a love of learning and providing students with the skills to explore their own intellectual interests has to take a back seat. By the time students reach high school, they usually see school either as inconsequential to their future or as a cut-throat competition where they must achieve the highest possible performance or lose out on opportunities to their peers. Is it any wonder that the undergrads I taught, after twelve years of training to care about test results and GPA, were willing to use whatever tools they had available to those ends?
Even so, my experience with student use of LLMs was far from the educational cataclysm some authors write about. Student use of LLMs was not as ubiquitous as you might expect, and most use was relatively innocuous. Some students used LLMs for research. Others used them as a sounding board, a discussion partner for testing out ideas. Students whose first language was not English often used LLMs to check their grammar and sentence structure. Some students used them to write rough drafts, but I required drafts to be turned in to me, so I can safely say those drafts always received substantial edits from the students. The few times I suspected students of using an LLM to write their papers, I didn’t feel the need to pursue it. The papers lacked adequately detailed use of the course material to receive a high grade anyway. Of course it felt frustrating to give feedback on a paper I knew the student didn’t write, but no less frustrating than the far more common problem of my feedback being ignored.
One event stands out to me the most: a student who actually turned in two papers for an assignment in one of my classes. They wrote one, and Claude wrote the other. I was shocked to find that I would likely have given the Claude essay a good grade! So I asked the student to show me their process. The work the student put into creating the prompt honestly showed more understanding of the course material than most essays students write themselves. They had an outline of their argument; notes on which course concepts they wanted to use to support which parts of that argument; accurate, succinct definitions of each concept and of the theoretical models from the course, with instructions on how to use them in the essay; and instructions on what evidence from the course to use and which course sources that evidence came from. In other words, they learned everything I wanted them to learn from the assignment. Sure, if they had just turned in the Claude essay they wouldn’t have written it, but for my class, that writing is a means to an end.
I also found the Claude writing easy to identify. Not because it was particularly bad, but because it wasn’t tailored toward the grading rubric. Normally, if I say I’m grading based on accurate definitions of course concepts, students will have a paragraph that flatly states which concepts they are using, what their definitions are, and how they intend to apply those concepts to their argument. Claude, instead, introduced each concept at the point in the essay where it became important. In general, the writing from Claude aimed to persuade, while the writing from my students aimed to get good grades.
One might oppose student use of LLMs on the basis that they plagiarize inherently: by definition, they can only use the writing of others. Even if that weren’t a sticking point, students who use them are still turning in writing that is technically not their own. By forbidding their use, teachers demonstrate to their students a commitment to the principles of academic honesty. This lesson alone is a good enough reason to ban LLMs for writing in a class. But that is still far from the death of the college essay.
What I found was that the very same strategies I used in my classes to deal with the existing shortfalls in education as a whole also worked to curb bad uses of LLMs.
First, I talked with my students about LLMs. I asked them what they thought, and I gave them my perspective. We then worked together to create the course rules about LLM use: what was off limits, and what the consequences for breaking those rules would be. Students are so rarely given agency in their education. They don’t have a say in their curriculum, what subjects they study, what courses they must take, which texts they use, how they’re graded, what the assignments are, or even what parts of their education are most important! The students’ only role is to accept all of this with the kind of work ethic we would expect of good, austere, puritan children. Is it any wonder that most of those students don’t actually respect the educational system, and that those who do are more invested in high achievement than in learning? Yet we know that students with more agency are more engaged and enjoy learning more. More agency improves information retention, subject competency, and student confidence. Democratic schools, which focus on student agency, show these positive outcomes in spades, and their students show no deficits compared to peers when entering college or the workplace. When students are given back some agency, in my case by participating in setting class policy, the teacher gets a disproportionate return on the students’ investment in the course.
Second, I design my assignments and grading rubrics so that learning and getting a good grade overlap as much as possible. Rather than setting a minimum number of course concepts students must use, I grade based on how proficiently they use those concepts to make an argument. If that takes one or ten, so be it! I always make writing quality itself a minimal part of the grade, since my classes don’t teach students how to write better. The downside is that I can’t offer students a strict rubric to guide them. But the stricter the rubric, the more the assignment merely assesses a student’s ability to follow directions. The more closely I tailor my grading assessments to my learning objectives, the more intertwined mastery and a good grade become. Thus, the students’ incentive to get a good grade also intertwines with their learning. If they don’t feel that using an LLM dishonestly offers them an advantage, they are less likely to use one that way.
LLMs aren’t creating a new problem. They are exacerbating problems already rife in higher education. Claiming that the problem originates with LLMs is a form of technological determinism: the idea that technological development drives social change. We’ve seen similar arguments before. Way back in 2008, Nicholas Carr asked, “Is Google Making Us Stupid?” Today, we blame all sorts of social ailments on social media and smartphones. In a sense, technological determinism is obviously true. The technologies we choose, or more often the technologies we are subjected to, clearly have an impact on the kind of society we live in. But we may overestimate how decisive technology is. Perhaps Google did change technological civilization to be more utilitarian and less thoughtful, as Carr suggests. Or perhaps the drive for increasing work productivity, longer work hours, and the need to pack more leisure into shrinking free time pushes people toward utilitarian technological solutions to those problems.
Engineers don’t develop new technologies in a vacuum, and consumers don’t purchase them in one either. During graduate school, I would have loved to sit in the library skimming book after book and journal after journal as part of my research. But it’s publish or perish: those who are more utilitarian publish more, and the rest of us get left behind. So I used Google Scholar. Even in my new office job, I would love to dive as deep as I could into each project I’m working on. But I have deadlines and a lot of competing projects. So I quickly google whatever I need to know and move on. Google didn’t do this. The incentives were already there. In a sense, it was our own social and cultural drive for efficiency and growth that created Google. In the same sense, students choose to use LLMs like ChatGPT because of the deficiencies in our education system. Not the other way around.
And yet, there is still something a little insidious about LLMs in education. I won’t dismiss the complaints of fellow educators so lightly, and my own experience is not universal. For example, I can accept some students’ limited use of LLMs to help with their essays because it doesn’t interfere with what I’m trying to teach them. But what if I were teaching writing? If the point of the essay were to practice the craft of writing itself, any use of LLMs would interfere with student mastery. My undergraduate physics professors banned calculators and constantly fought against students using tools like Mathematica to take shortcuts. Not because they didn’t think such tools could be useful, but because doing the work on our own supported our mastery of it. I honestly don’t know what I would do if I were an English teacher.
You don’t have to be an English teacher to recognize the consequences of LLMs, either. One author equates their use to using a forklift to lift weights in the gym. Accepting LLMs as part of our educational system also means accepting that education will not strengthen or grow our students’ thinking and curiosity. Even if college students aren’t all going on to be academics, the institution where they learn is still academia. Allowing students to use LLMs therefore violates academic integrity, and the legitimacy of academia along with it. Think about the recent scandals around plagiarism in academia, then imagine the scale of such a scandal if LLM writing were normalized! Finally, consider issues of equality and fairness. The more widespread the use of LLMs becomes, the more pressure students will feel to use them. If every other student is using Claude or ChatGPT to get better grades, or to write job applications or letters for prestigious internships, would you want to risk your future trying to compete without them? Just because these problems predate LLMs doesn’t mean LLMs don’t make them worse.
The impact LLMs have had on the everyday lives of educators is undoubtedly significant. Public school board members and university regents could go their entire careers without having the same degree of influence over educational practices. In fact, the last time I saw this kind of existential dread was Betsy DeVos’s appointment as Secretary of Education, and the attendant worries about school vouchers, charter schools, and the privatization of public education. The difference is, I vote for my school board. Even DeVos, an appointee of a President elected through a surprisingly undemocratic process, was kept in check by elected officials in Congress. But what mechanisms are there to hold Sam Altman to such public account?
Political scientist Langdon Winner pointed out that technologies often have just as much, if not more, impact on our lives as political legislation. Yet we have much less democratic control over the development and implementation of technology. We did not choose to use CFCs in aerosols or refrigerators, nor did we choose to add lead to gasoline. But it *did* take democratic action to solve both of those problems. LLMs show that principle in action once again. As Taylor wrote in his article “What is Democracy for?”, LLMs have had a substantial impact on education, and yet no educators were consulted in their creation and use. Now, students are the primary users of ChatGPT, and despite the fact that OpenAI claims to have a way to tell whether their product was used in a piece of writing, they steadfastly refuse to share it. If we truly value democracy, then why is that a decision they can unilaterally make?
As educators, I think we have all just been getting by. We know we are working in an imperfect, even harmful, educational system. But in many ways that system is manageable. LLMs didn’t create that system or the problems it causes; they pushed it to its breaking point. Students have always been tempted to cheat because the educational system focuses on extrinsic motivations. But we could always try to sneak in some intrinsic motivations and make cheating just hard enough that most students wouldn’t bother. LLMs make cheating trivially easy. American education has been cobbled together with duct tape for quite a while; LLMs were just the jolt that made the whole thing fall apart.
On the one hand, LLMs reveal a problem with democracy. Teachers don’t get to choose what the educational system looks like. They just work within it and can make small adjustments around the margins. And they didn’t choose to unleash LLMs like ChatGPT onto a system that was already barely holding together. That’s an awfully small amount of agency for a supposedly democratic society.
On the other hand, LLMs may force decision makers to finally deal with the problems of American educational systems. The sheer scale of their disruption reveals the flaws in how we have structured education in this country. It is much more obvious now how things like overwork and an overemphasis on grades impact learning outcomes. Perhaps now we will actually get to make the decision: is our educational system a publicly funded job preparation program, or is it a system for ensuring a thoughtful and curious citizenry?
I agree with a lot of this. I think assignments can be written that ask students to make specific arguments and draw on specific readings, and that are difficult for LLMs (which are mostly good for clear-cut tasks: synopses, summaries, shortening, etc.). Which is to say: the college essay is not dead, and students are not going to get good grades if they use LLMs to write such papers. I also agree that assignments should be designed such that learning and grading coincide. I try to do that. And I also agree that LLMs are exacerbating problems that already exist; students were already doing something like this (copy-pasting from Google searches and whatnot) and now they can do it even more easily.
Where I think I have some reservations is in the hope that if the "incentives" of higher education are fixed, things will be okay. I don't think students cheat just because they are overworked or because we emphasize grades too much (on which more below). I don't think making assignments as interesting and as worth doing as possible (which we should do anyway because it's good pedagogy) will on its own make students stop using LLMs. So some kind of enforcement is needed (which I think is also where you end up at the end).
As to your point that students care too much about grades, one of the saddest realizations I've had this semester is that the assignment that has been most abused is the open-ended reading response, which is graded *just* for participation. It's not even high-stakes! All the reading response has to do is demonstrate engagement with the reading (through a few different options). They don't even have to do ALL the readings for the week, just whatever they can manage. And they don't have to do it every week, just enough times to get a certain number of points. I try to keep the readings interesting (lots of journalism, podcasts). I give them reading notes so that they know what to look for while doing the reading.
And I've been realizing all semester, just based on how everyone is writing, that students are putting the reading questions into ChatGPT and then editing the answers. Not all, obviously, but a substantial minority. The point of the reading responses is to get them to read. So if it's not even accomplishing that, what's the point? And I have a grading schema such that students can pass a course (with a D) just by doing the low-stakes participation-graded assignments. But if those are done by LLMs, then my passing grade is meaningless.
So next semester, I'm ditching the reading responses and I'll probably do in-class quizzes or free writing. It's not going to be fun for the students. Or me.
On the point of students using LLMs as a plagiarism machine rather than a tool like the calculator (and the slide rule and abacus before it), I’m reminded of the decade-old video essay “Humans Need Not Apply”, wherein the author argued that lawyers would be among the first workers to be replaced, because software is good enough at pattern recognition to make effective syntheses of case law, which is most of a lawyer’s day job.
I’ve found ChatGPT and Claude to be effective summarizers and sounding boards (and terrible screenwriters). I mostly use them to make sense of the Substack articles I get tired of reading midway through. Perhaps lawyers will uniformly use LLMs similarly in the future, as research assistants. The ideal, of course, is to fix the incentives of the education system, but I find it bemusing that far more students might look at all this and want to become lawyers because it would be the easiest thing to do. What fodder for satire!