AI-Assisted Development vs. Outsourcing

Navigating the Complexities of Measuring Developer Effectiveness

By Chris Ford, CEO & Managing Partner and BC Holmes, Chief Technologist
Published: March 19, 2024 in Blog

Thoughtworks regularly produces their “technology radar” report — an assessment of the products, tools and methodology changes that development teams are experimenting with. They also release a higher-level set of broad industry trends. This year, one such broad industry trend relates to measuring developer productivity.

Their take is in line with what many others in the industry have concluded: almost all traditional forms of measuring developer productivity, especially those that measure output in some way (such as lines of code or number of pull requests), fail to provide a usable way of tracking developer capability or project health. I think Thoughtworks is correct in their assessment: measuring developer productivity is hard because development activities take many different forms.

But our inability to measure real developer productivity creates two complications.

First, outsourced development shops sell themselves on how much cheaper their engineers are. The daily rate of an outsourced developer from somewhere like India is often one-fifth that of a local developer. In the minds of product leadership, that usually translates into “all else being equal, the same project costs only 20% as much if I use outsourced developers.” The problem with that assessment is that “all else” is not equal.

I was on a project several years ago where, during the QA phase of the project, the client IT manager supplemented the bug fixing team with some offshore developers. That client told us that this direction was coming from the highest levels in the organization: they were mandated, by the most senior IT leadership, to use these offshore teams.

The awkward truth was that the offshore team introduced more bugs than they resolved. To me, this was a clear and objective measure of developer effectiveness: the offshore team’s contribution, measured in bug numbers, was negative. To be fair, the project did not really have a practice of assessing the effectiveness of the different teams because, again, measuring developer effectiveness is a hard exercise. But it’s difficult not to conclude that the net-negative contribution didn’t appear to matter because the offshore team was inexpensive.

Even though we don’t have a clear way to assess developer effectiveness, the goal of such an assessment is clear: we’re looking for ways to make projects better, faster, and cheaper. So in the absence of clear measures, we can only estimate the effectiveness of outsourcing based on past experience.

The Thoughtworks radar trends give us a new option to consider in service of this goal: AI-assisted development. That additional option segues neatly into the second complication I wanted to discuss: if we’re going to decide whether or not to use AI-assisted development tools, we’re going to need to think about the cost-benefit implications of such tools. And because measuring is hard, it’s hard to quantify the uplift that one can realize in an AI-assisted development environment. Microsoft, for example, confidently reports a 55% productivity improvement when developers use GitHub Copilot. Should we take that report at face value?

Let’s try to make our own estimate.

The following diagram tries to imagine the different activities that a typical developer engages in during a 40-hour week. In truth, these percentages would vary week to week throughout the life of a project, but we’re averaging.

Our assessment is that outsourced delivery teams are considerably less experienced and less skilled than the local baseline.

For example, whereas a local development team member might spend about 4 hours a week understanding the requirements, an outsourced developer spends 5 hours (25% more). Across all activities, the result is a net effort that’s approximately 17% higher than for local development. That translates into almost 7 more hours of development time per developer effort week.

And that doesn’t even count the increase associated with non-developer activities: we know, from past experience, that the communication effort is higher, the need to author business and design requirements is significantly increased, and the cost of project coordination and oversight is substantial. Further, this analysis doesn’t even broach other aspects such as code quality and maintainability. The net result might be cheaper, but it is not better and it is not faster.
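To make the arithmetic above concrete, here is a minimal sketch in Python. It uses only the figures quoted in this post (a 40-hour week, roughly 17% more developer effort, and a daily rate of about one-fifth). The function and variable names are our own illustration, and the result deliberately leaves out the coordination, requirements, and quality costs just described, so treat it as a floor and not an estimate.

```python
# Figures taken from the text: a 40-hour week, roughly 17% more developer
# effort, and an outsourced daily rate of about one-fifth the local rate.
BASELINE_HOURS = 40.0
EFFORT_OVERHEAD = 0.17
RATE_RATIO = 0.20


def outsourced_hours(baseline_hours: float, overhead: float) -> float:
    """Hours an outsourced developer needs for one baseline week of work."""
    return baseline_hours * (1.0 + overhead)


def relative_developer_cost(rate_ratio: float, overhead: float) -> float:
    """Developer-only cost relative to local delivery (1.0 means equal cost).

    Ignores coordination, requirements authoring, and quality costs.
    """
    return rate_ratio * (1.0 + overhead)


hours = outsourced_hours(BASELINE_HOURS, EFFORT_OVERHEAD)
extra_hours = hours - BASELINE_HOURS
cost_ratio = relative_developer_cost(RATE_RATIO, EFFORT_OVERHEAD)

print(f"Outsourced hours per effort week: {hours:.1f}")
print(f"Extra hours per effort week: {extra_hours:.1f}")
print(f"Developer-only cost vs. local: {cost_ratio:.1%}")
```

On developer hours alone, the advertised 20% becomes a little over 23%, and every cost this sketch ignores pushes that number higher while doing nothing for speed or quality.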

By contrast, we do believe the use of AI-assisted development tools can result in productivity improvement. We don’t believe that the overall development effort is 55% less. We’re only comfortable applying a number of that magnitude to a small slice of the typical developer work week: the first-pass development. We also think that there would be uplift to other activities such as integration/verification and defect fixing. What those benefits net out to is an approximately 10% decrease in overall developer effort. These numbers are modest, but still worth considering.
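The way a large gain on one slice shrinks to a modest overall gain can be sketched as a weighted sum. The weekly breakdown and the per-activity reductions below are hypothetical placeholders, not the numbers behind our diagram: only the 4-hour requirements figure and the 55% first-pass figure come from this post, and the rest were picked so the total lands near the roughly 10% we quote.

```python
# Hypothetical weekly breakdown in hours; NOT the authors' actual diagram.
# Only the 4-hour requirements figure appears in the text.
HOURS = {
    "requirements": 4.0,
    "first_pass_development": 6.0,
    "integration_verification": 8.0,
    "defect_fixing": 6.0,
    "meetings_and_coordination": 8.0,
    "other": 8.0,
}

# Assumed fractional reduction in hours per activity with AI assistance.
# The 0.55 figure is the one quoted in the text; the 0.05 values are guesses.
REDUCTION = {
    "first_pass_development": 0.55,
    "integration_verification": 0.05,
    "defect_fixing": 0.05,
}


def overall_reduction(hours: dict, reduction: dict) -> float:
    """Fraction of total weekly hours saved, weighting each activity by its size."""
    total = sum(hours.values())
    saved = sum(h * reduction.get(activity, 0.0) for activity, h in hours.items())
    return saved / total


net = overall_reduction(HOURS, REDUCTION)
print(f"Overall reduction in developer effort: {net:.1%}")
```

Swapping in your own team's breakdown is the point of the exercise: the overall figure is driven far more by how much of the week is first-pass coding than by the headline percentage.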

I think the industry is recognizing the limitations of outsourced delivery. The cost savings aren’t what they’re advertised to be, and the delivered systems are of poor quality (if they’re delivered at all). Nonetheless, the goal of delivering projects better, faster, and cheaper is not going to go away. But there is a new approach that’s worth consideration: AI-assisted development. In the hands of skilled developers, we’re fairly confident that it will improve productivity. There’s no denying that this technology is still in its earliest days and needs to be honed over the course of several projects.

But the failure modes of the two approaches are also radically different: the failure mode of outsourced delivery is “after countless budget overruns, the company never delivered the project,” whereas the failure mode of AI-assisted development is “it didn’t save us as much money as we’d hoped.” I know which of those risks I’m more comfortable with.
