GitHub Copilot helped software engineers at Australia and New Zealand Banking Group (ANZ Bank) improve productivity and code quality in an internal trial. The test drive was enough to convince the financial company to adopt the generative AI programming assistant in its production workflows.
From mid-June 2023 to the end of July 2023, Melbourne-based ANZ Bank conducted an internal trial of GitHub Copilot involving 100 of its 5,000 engineers.
The six-week trial, consisting of two weeks of preparation and four weeks of code challenges, set out to gauge how participants felt about using GitHub Copilot with Microsoft Visual Studio Code and to measure the AI-based system's effect on programmer productivity, code quality, and software security.
The results of the experiment are documented in a report titled “The Impact of AI Tools on Engineering at ANZ Bank, an Empirical Study on GitHub Copilot within a Corporate Environment.”
Co-authored by Sayan Chatterjee, Cloud Architect at ANZ, and Louis Liu, ANZ’s Engineering AI and Data Analytics functional area leader, the report draws on several previous studies of programming productivity with Copilot.
One study, by Microsoft, which owns GitHub, found that coding with the AI assistant increased productivity by more than 55%. That is not surprising given that the research comes from the vendor.
An ACM/IEEE study on AI-assisted programming suggests that robo-assistance is more of a trade-off. Its results show that Copilot generates more code, but that the quality of the generated code is worse than that of human-written code.
While citing the potential productivity benefits of AI, ANZ Bank also acknowledged that the technology “poses inherent risks, uncertainties and unintended consequences with respect to intellectual property, data security and privacy”, and so set out to conduct its own evaluation.
These risks are highlighted by the ongoing copyright lawsuit against GitHub, Microsoft, and OpenAI over Copilot, but the study does not address them beyond acknowledging regulatory compliance.
“Prior to commencing the experiment, risks related to intellectual property, data security and privacy were assessed in collaboration with ANZ’s legal and security teams and a set of guidelines were developed,” the company said.
The bank’s experiment investigated how Copilot affects developer sentiment and productivity, as well as code quality and security. Participating software, cloud, and data engineers had to complete six weekly algorithmic coding challenges using Python. Participants in the control group were not allowed to use Copilot, but were allowed to search the internet and use Stack Overflow.
“The group that had access to GitHub Copilot completed tasks 42.36% faster than participants in the control group,” the report states. “…code written by Copilot participants has fewer code smells and bugs on average, meaning it is more maintainable and less likely to break in production.”
Both of these results were considered statistically significant. On security, the experiment was inconclusive.
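For readers wondering how a figure like "42.36% faster" and a significance claim might be derived, here is a minimal sketch. It assumes "faster" means a relative reduction in mean completion time, which the report may define differently, and it uses Welch's t statistic as one common way to compare two groups, not necessarily the test the authors used. The function names and the timing values are hypothetical and are not data from the ANZ report.

```python
from math import sqrt
from statistics import mean, stdev


def percent_faster(control_times, treatment_times):
    """Relative reduction in mean completion time, as a percentage."""
    control_mean = mean(control_times)
    treatment_mean = mean(treatment_times)
    return (control_mean - treatment_mean) / control_mean * 100.0


def welch_t(a, b):
    """Welch's t statistic for two independent samples with unequal variances."""
    var_a = stdev(a) ** 2
    var_b = stdev(b) ** 2
    return (mean(a) - mean(b)) / sqrt(var_a / len(a) + var_b / len(b))


# Hypothetical completion times in minutes; NOT data from the ANZ report.
control = [50.0, 62.0, 48.0, 55.0, 60.0]
copilot = [30.0, 35.0, 28.0, 33.0, 32.0]

print(f"Reduction in mean time: {percent_faster(control, copilot):.2f}%")
print(f"Welch t statistic: {welch_t(control, copilot):.2f}")
```

A larger absolute t statistic indicates a difference that is less likely to be due to chance. Turning it into a p-value requires the t distribution, for example via a statistics library.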
“The experiment failed to generate meaningful data to measure the security of the code,” the report said. “However, the data suggests that Copilot did not introduce any significant security issues into the code.”
This may be due to the nature of the task, which was designed to be short enough to allow participants to complete the task alongside their normal daily work. As a result, the submitted assignments were fairly short and did not leave much room for bugs, the report states.
In terms of sentiment, Copilot users felt positive about the experience, though not strongly so.
“They found it helpful in reviewing and understanding existing code, creating documentation, and testing code. They spent less time debugging code, reducing overall development time. They also felt that the suggestions provided were somewhat helpful and aligned well with the project’s coding standards,” the report states.
One interesting finding is that Copilot is most useful for the most experienced programmers.
“In a productivity assessment based on Python proficiency, Copilot was found to be beneficial to participants of all skill levels, but most helpful to participants who were ‘expert’ Python programmers,” says the study, adding that the AI helper provided the biggest improvement, in terms of time saved, on difficult tasks.
While the participants’ only moderately positive sentiment shows that Copilot has room to improve, the report supports introducing Copilot into the bank’s production workflows.
“As of this writing, GitHub Copilot already has significant adoption within the organization, with more than 1,000 users using it in their workflows,” the report concludes, adding that a more extensive investigation into Copilot’s impact on productivity is underway. ®
Counterpoint: Researchers claim that AI assistance is leading to a decline in source code quality