The human factor in software testing – from curse to blessing

Our lives heavily depend on software, from the cars we drive to the bank cards we use to pay. And if that software is solid – that is, well tested so that few bugs reach the user – then our lives are mostly better for it. "But the reality is, unfortunately, that developers are not always willing or able to test their software, and neither are their executives," explains Andy Zaidman. As a professor of software engineering, he is therefore researching better and more robust ways of testing software, taking into account both the technical and the human factor. The goal: simply better software.

Ask an average programmer how much time is spent writing new code and how much is spent testing it, and the answer will probably be '50/50'. "But if you look closely, and we did, that's almost never the case," says Zaidman. "An average programmer spends about 25% of their time on engineering tests." If such a lack of attention persists, problems in the code can pile up until one day things go completely wrong. "It is estimated that the US economy loses $1.7 billion every year because of software bugs. And that's just revenue; it could be much worse. Imagine, for example, that critical software systems in a hospital’s intensive care unit suddenly fail. In short, the more dependent we are on software, the more important it is that the software is sound. That's why I'm so motivated to tackle problems surrounding this essential part of software engineering."

There's a bug in the system

Back to basics: what is 'testing' and what are the errors it can prevent? In very simple terms, software code is a set of instructions for the computer, essentially similar to instructions like 'if you run out of milk, buy a new carton'. When writing that kind of code, the trick is to make the instructions as complete as possible; then the code will work. But modern code, for example for a car, easily contains millions of lines of instructions, often written by hundreds or even thousands of developers over many years. That makes it very likely that errors will find their way into the code, either because teams are not working in tune or simply because of a lack of attention. As a result, new software is hardly ever completely correct or complete, and this means that the computer can go astray at unforeseen moments. Suppose, for example, that there is milk, but that its best-before date has passed. The instructions mentioned earlier then no longer work, at least not if you expect to find fresh milk. In professional terms, this is called a 'bug'.
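The milk example can be sketched in a few lines of code. The Python below is only an illustration and does not come from Zaidman's research; the names `Carton`, `needs_new_carton_buggy`, `needs_new_carton` and `expired_milk_is_replaced` are hypothetical. The first function follows the instruction literally, the second also looks at the best-before date, and the final function is the kind of test that exposes the difference.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, List


@dataclass
class Carton:
    best_before: date  # last day on which the milk counts as fresh


def needs_new_carton_buggy(fridge: List[Carton], today: date) -> bool:
    # Literal reading of the instruction: buy milk only when there is none.
    return len(fridge) == 0


def needs_new_carton(fridge: List[Carton], today: date) -> bool:
    # More complete instruction: expired cartons do not count as milk.
    fresh = [c for c in fridge if c.best_before >= today]
    return len(fresh) == 0


def expired_milk_is_replaced(
    check: Callable[[List[Carton], date], bool]
) -> bool:
    # A test with a very specific precondition: milk is present but expired.
    today = date(2024, 1, 10)
    fridge = [Carton(best_before=date(2024, 1, 5))]
    return check(fridge, today) is True
```

Calling `expired_milk_is_replaced(needs_new_carton_buggy)` returns `False`, which reveals the bug, while `expired_milk_is_replaced(needs_new_carton)` returns `True`.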

But software bugs are not always as innocent as sour milk. For example, online payment service PayPal gave a random man in the US new purchase credit every day, for years, until his credit had grown to 92 billion dollars. "That is a very good example of how the common business philosophy of 'move fast and break things' can lead to totally unforeseen situations. The great advantage and disadvantage of software is that you can constantly adapt it. So, if there is a mistake in the code, you can fix it with an update. But then you assume that such a bug is harmless and that you detect it before it really goes wrong. Unfortunately, this is not always the case. Take cars, where the 'hardware' is always checked extremely thoroughly: do the brakes work, are the wheels aligned properly, you name it. Yet the software is sometimes delivered from the factory with countless faults," Zaidman concludes with visible frustration.

On a quest with a healthy dose of reluctance

"Bugs often occur at very specific times, with very specific preconditions. Testing is therefore not easy and takes a lot of time and attention. Moreover, people prefer to be creative and make something new, rather than be destructive, by which I mean critically examining your own creation." These are all reasons why people prefer not to test much or often. They also help explain why developers estimate that they spend about 50% of their time on testing, when it is actually only 25%. 'Chronoception', the perception of time, plays an important role here. "The idea that tedious tasks, such as doing the dishes, seem to take a very long time, while pleasant tasks, such as cooking, go by very quickly," summarises Zaidman. "When improving testing procedures, you also have to pay attention to those kinds of psychological phenomena."

Apart from this psychological dimension, group processes also play an important role. After all, software development is often a group project. Testing therefore often means criticising other people's work: "People find that difficult. Very understandable, of course."

One of the possible solutions to that problem is the use of artificial intelligence (AI). AI will never complain that testing is boring. "But we ran into two major problems very quickly. An AI programme that tests a piece of software assumes that the current version of that software is error-free, and that only modifications to that software can contain mistakes. But maybe the opposite is true? Also, of course, tests should not only show that there are errors in a piece of code, they should also help locate them: which instructions need to be changed? But those AI-generated tests are not at all easy for a software developer to understand, so that doesn't help us either."
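The first of these problems can be illustrated with a small sketch. A generated test often records what the current code does and then checks that this behaviour stays the same, so a bug that is already present gets locked in. The Python below is an illustrative assumption, not the tooling used in the research; `price_with_discount`, `fixed_price_with_discount` and `generate_regression_test` are hypothetical names.

```python
from typing import Callable, Iterable, Tuple

PriceFunc = Callable[[float, float], float]


def price_with_discount(price: float, percent: float) -> float:
    # Hypothetical existing code with a bug: the discount is added
    # instead of subtracted.
    return price + price * percent / 100


def fixed_price_with_discount(price: float, percent: float) -> float:
    # The behaviour the developer actually intended.
    return price - price * percent / 100


def generate_regression_test(
    func: PriceFunc, inputs: Iterable[Tuple[float, float]]
) -> Callable[[PriceFunc], bool]:
    # Record the current outputs and treat them as the expected values.
    recorded = [(args, func(*args)) for args in inputs]

    def regression_test(candidate: PriceFunc) -> bool:
        # Passes only when the candidate reproduces every recorded output.
        return all(candidate(*args) == expected for args, expected in recorded)

    return regression_test


# The test is generated from the buggy version, which it assumes is correct.
generated_test = generate_regression_test(price_with_discount, [(100.0, 10.0)])
```

Here `generated_test(price_with_discount)` passes even though the code is wrong, while `generated_test(fixed_price_with_discount)` fails: the test flags the fix, not the bug.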

Many more tools in the suitcase

Despite the two shortcomings, one thing was very clear after the first attempts: AI is incredibly efficient at checking code that needs to be tested, right into the very corners of that code. AI provides, to put it in professional terms, almost perfect coverage. So back to the drawing board: how can you use the unique power of AI? In his research, Zaidman tries to make AI work with people. "We can give an initial test setup to the AI algorithm, so that it can continue to develop it and test it further. The results of those tests are then easier for the developer to understand, so they spend less time looking for the cause of the bug."
