GitHub Copilot may be perfect for cheating CompSci programming exercises

Microsoft's AI code-suggestion tool GitHub Copilot is showing itself to be so capable that educators may have to rethink how they teach computer science.

University of Massachusetts Amherst computer science professor Emery Berger earlier this month published a blog post warning educators that "students armed with [Copilot] will be bringing Uzis to a knife fight."

His concern is that Copilot will render traditional programming exercises – part of computer science training but by no means all of it – pointless because Copilot knows all the answers.

"As far as I can tell, Copilot was specifically trained on all the intro programming assignments ever," Berger wrote. "Copilot frickin’ loves intro programming assignments."

For students using Copilot, he wrote, educators might as well describe their course objectives as "hitting the Tab key," a reference to the key that accepts Copilot's suggested code, which the tool can generate from a plain-language description of the desired output.

"Programming plays a role in a lot of computer science classes, and especially in introductory computer science classes," explained Berger in a phone interview with The Register. This often involves exercises to sort a list of numbers in a certain way or to find the nth element of a Fibonacci series, and so on.

"Copilot will just do them," said Berger. "It's not just that it does them and it does them well. It's also that it does them using the tools that you would want and expect your students to actually be using to write their code. If they start writing code and Copilot is installed, it will fill out the solution."

Berger said Copilot is different from searching for answers on Stack Overflow and other internet programming resources.

"You can already find examples of code online," he said. "But you know, the instructor can also Google for them and then compare that code against the code submitted with a plagiarism detector."

Copilot is different, he said: "It actually generates novel solutions. Not like they're super-crazy, sophisticated, genius solutions. But it makes new solutions that are superficially different enough that they plausibly could have come from a student."

As a result, Berger argues, pedagogy related to programming needs to adapt. One approach, which he ridicules in his post, is "to plug our ears with our fingers and kind of shout while pretending [Copilot] doesn't exist, which is more or less the same thing as pretending plagiarism doesn't exist, and pretending that the internet doesn't exist."

"But if you care about the integrity of the process … this is just a cheating machine," he said. "Like somebody gives you a spec for an assignment, you just type in this back in comments and hit Tab, right?"

"So I don't think that it's reasonable or responsible to think that everybody is going to refrain from using this amazing cheating machine that's installed on their laptops … I think that the temptation is too great. And honestly, it's what software development is probably going to look like, very, very soon."

Berger acknowledges that Copilot is useful and says it makes sense that developers would want to use the software.

"We just need to really rethink things altogether," said Berger. "Certainly from the evaluation standpoint, we can obviously just require people to do things in environments where they can't use Copilot. Just like elementary school kids don't get to use calculators when doing basic arithmetic. So we can have paper and pencil exams."

He said he has a colleague in Illinois who describes using computers that have been locked down for programming tests, so students take their exams in a controlled setting. These sorts of measures, and things like oral exams, he suggested, could help address some of the negative aspects of the availability of Copilot.

Berger also observed that Copilot has positive aspects, such as the ability to fill out boilerplate and to implement APIs.

"I don't think that memorizing the minutiae of countless APIs is really interesting intellectually," he said. "It's not the kind of thing we should really be teaching or focusing on. Do you know the exact syntax to create a DataFrame with these characteristics? I don't care. If you have to look it up on Google or on Stack Overflow, or you just hit Tab and it just does it for you, that sounds fine to me."

Nonetheless, he argues it's important for educators to make sure students are actually learning the material, which may mean rethinking how much homework assignments that can be solved with Copilot should count when calculating an overall grade.

Berger said it's probably premature to say that Copilot has had an effect on students, because the software has only been publicly available for a few months. But he argues it won't be long before its impact starts to show.

"I would like to be optimistic about this," said Berger. "But I think at minimum, we just need to be thoughtful of it. I just don't think that there are many educators out there who are aware of how much of a revolution this is." ®
