AI-assistance tools will inevitably change the way engineers work; in some cases, they already have. In the right hands, they can be used to create efficiencies and troubleshoot problems. But an inexperienced engineer could just as easily use them to introduce serious flaws into a codebase, a risk made even more dangerous by the fact that AI-generated code often looks right at first glance.
At Byteboard, we’ve been thinking a lot about how tools like ChatGPT and GitHub Copilot impact the role of engineers, how we think they’ll change technical interviews, and how we can adapt our assessments to the mainstream usage of these tools.
How AI impacts the future of software engineering
AI-assistance tools are no doubt about to play a major role in the future of software engineering. In the short term, ChatGPT, as well as more specialized tools like GitHub Copilot, has demonstrated clear strengths as well as limitations. ChatGPT’s primary strength is the speed at which it can generate content. It can write dozens of reasonable-sounding sentences (or lines of code) in a matter of seconds, when the equivalent content might take a human minutes or hours to create.
But its primary limitation is its trustworthiness. Both in prose and in code, it can often produce correct answers, and just as often produce answers that only *appear* correct, but are significantly flawed upon inspection. In specialized fields like software engineering, it can take significant expertise to differentiate between the two. As the complexity of the problem increases, so does the frequency of ChatGPT’s mistakes, as well as the level of expertise it takes to recognize them.
Because of this, ChatGPT is currently most useful as a speed hack. Rather than starting with a blank slate, a software engineer can start by asking ChatGPT to solve their problem for them. It will then generate a significant amount of content far more quickly than the engineer could have written on their own. But if the problem includes any meaningful complexity, then in order to produce code that *works as intended* (or prose that is truly accurate), the software engineer has to take the AI-generated content and apply significant engineering expertise to correct ChatGPT’s (often well-hidden) flaws.
In other words, a poor software engineer can use ChatGPT to quickly produce software systems that appear well-built, but contain significant flaws. But a strong software engineer can use ChatGPT to quickly produce software systems that are well-built.
In the long term, we think AI-assistance tools will become more trustworthy, and more capable of producing correct work within contexts of increasing complexity. But it is unlikely that the fundamental ideas here will change; the more advanced the specialization, the longer it will take before AI tools can be trusted to produce correct work, and the more subject-matter expertise it will take for a human to be able to recognize and fix the AI-generated flaws.
How the role of “engineer” changes
Adoption of new tools and workflows always takes time (particularly so in specialized industries), so the coming AI-assistance revolution will happen over the course of the next few years, not the next few months or weeks. Only a minority of software engineers have integrated AI-assistance tools into their workflows, so most engineering work has continued exactly as it did before the introduction of these tools.
That being said, we do think there are three primary ways AI will change engineering for the organizations using it.
- Engineers will be able to accomplish some aspects of their work much faster. Engineers will have more time to answer the “what” and “why” questions of software engineering, while the AI-assistance tools accelerate the answers to the “how” questions. This means that the role of the engineer shifts towards thinking about product and systems design, though they will still be required to retain their technical skills in order to fix the flaws of the AI-generated code.
- Engineers will spend more time on code review than on code generation. Since the AI-assistance tools generate copious amounts of code with hidden flaws, engineers who are making use of them will spend more of their time carefully reading the code that the AI generates, and less time writing code themselves. This means that attention to detail becomes a very important skill, while sheer human productivity becomes less important.
- Organizations will have to take more care to hire engineers with the right skills. Since AI-assistance tools can generate code that “looks correct” to the untrained eye, it will be all the more important that organizations hire engineers who can tell the difference between flawed code and correct code. Additionally, since the role of the engineer will shift towards product and systems design, organizations will need to hire engineers who can effectively analyze the product space and the organization’s goals. And finally, since the pace – and, ultimately, the reach – of engineering work will accelerate, it is of great importance that organizations select engineers who can be trusted to carefully consider the impact their work will have on customers, users, vulnerable populations, and the rest of the world.
How AI-assistance tools change assessments
The arrival of AI-assistance tools has introduced a new variable in hiring for software engineers. Until now, in order to assess a candidate’s technical ability, organizations have relied heavily on coding challenges: exercises in which a candidate is asked to write an isolated, complex algorithm in response to a clearly-defined prompt. We didn’t think those were great anyway – engineers don’t work in a vacuum, and no problem they’d see in their day-to-day work would have such clear requirements. But now, we have another reason not to like them: these problems are exactly the sort of tasks that ChatGPT is able to easily solve on its own. ChatGPT performs very well on tasks with strictly-defined prompts, clear boundaries, and singular solutions.
As such, we expect organizations that use questions like these to embark on serious anti-cheating measures, like only assessing candidates in in-person settings; blocking candidates’ access to the internet or their own IDEs; and requiring candidates to write their code using unfamiliar and restrictive mediums like pen-and-paper or whiteboards. These measures severely limit the candidate’s ability to showcase their talent, because they introduce stress and unfamiliarity and cut candidates off from the tools they would use on the job. They also increase costs for the organization.
The other path forward is to introduce complexity into assessments. What makes real-world applications hard for ChatGPT is that nearly all real-world problems contain a particularly messy sort of complexity – the complexity that comes from context.
“Find the shortest palindrome in a given string” is easy for ChatGPT. “Given our existing codebase, revise our song recommendation algorithm to increase exploration and engagement for new users without upsetting power users too much” is hard for ChatGPT.
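To make the first kind of task concrete, here is a minimal Python sketch. The prompt is terse, so this assumes one common reading of it: return the shortest palindrome that can be formed by adding characters to the front of the input. The function name `shortest_palindrome` is our own, chosen for illustration only.

```python
def shortest_palindrome(s: str) -> str:
    """Return the shortest palindrome formed by adding characters in front of s.

    Assumed interpretation of the prompt; other readings are possible.
    """
    # Look for the longest prefix of s that is already a palindrome.
    for end in range(len(s), 0, -1):
        prefix = s[:end]
        if prefix == prefix[::-1]:
            # Mirror whatever follows that prefix and put it in front.
            return s[end:][::-1] + s
    # Only reached for the empty string, which is already a palindrome.
    return s


if __name__ == "__main__":
    print(shortest_palindrome("abcd"))
    print(shortest_palindrome("aacecaaa"))
```

The whole problem fits in about a dozen lines and depends on nothing outside the function: no existing codebase, no users, no competing goals. The second task has no equivalent sketch, because most of the work lies in understanding the context around the code.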
To us, the real issue with asking engineers to solve problems that are easy for ChatGPT is not that it makes it easy for engineers to “cheat” by using ChatGPT. The real issue is that being able to answer those sorts of questions is not what makes someone a good software engineer.
A good software engineer is someone who can perform real-world tasks, not someone who can write complex algorithms in isolation. ChatGPT is just accelerating our collective understanding of what was already true – algorithmic coding challenges aren’t really good assessments of the expertise and skills required to be a software engineer.
At Byteboard, we’re facing this new challenge by continuing to add complexity and ambiguity to the coding tasks for our SWE assessments, thinking through what skills are becoming more (or less) necessary in the AI-assisted age, and considering a variety of mechanisms to more deeply assess how well a candidate understands the goals and context of a question. We’re also looking into other anti-plagiarism tools, but our goal is not to simply become a cheating prevention service – it is to assess candidates fairly and effectively in a role where what competency means is rapidly changing.
We aim to design assessments that make it impossible to successfully “cheat” with AI tools, because performing well requires candidates to engage specialized skills in tasks with real-world complexity. Eventually, we expect AI assistance to become like Google – a resource that everyone is expected to use to do their job most effectively. Our tools are built with that future in mind.