Software Reliability

By Rutger van der Berg

When building a Formula Student car, software occupies a somewhat unusual position. Before I go into the details, let me define what I mean by software. For our DUT cars, I can separate software into three parts.

First, we have control systems, which can be seen as a box (subsystem) that takes sensor inputs and produces setpoints for the various actuators (along with some data to be logged). For our cars, this is created using MATLAB/Simulink and then converted to C++ code.

Second, we have low level code and drivers. This is the software that interfaces directly with the hardware and defines a more usable abstraction of said hardware. Most of this is provided by the MCU manufacturer, but occasionally we find that the provided implementation does not meet our requirements, and we have to modify it or write our own version.

Last, we have application logic. This includes safety and liveness checks, tracking the state of the car, and providing various user interfaces.

As software engineers, we’re responsible for the last two categories. You’ll notice that none of what I’ve described actually affects the performance of the car. As I often say when describing what I actually do on the team: we won’t make the difference between first and second place, but we can very easily make the difference between first and last place. In other words, a mistake on our part can easily result in the car shutting down at a crucial moment. So our primary focus should always be reliability. This means testing everything.

For regular non-embedded software, this is fairly simple. There are a lot of comprehensive frameworks and tools. They won’t make testing fast or fun, but they at least make it easy. Unfortunately, most of these are designed for desktop computers, and cannot be directly used for embedded software.

This is where the previously mentioned categories come in. Application logic, if properly separated from the low level code, can be tested on any regular computer. Simply mock all communication with the low level code, and you’re good to go. This provides the advantage of being able to use standardised tools to catch most bugs.
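To make the mocking idea concrete, here is a minimal sketch in C++. It is not our actual code: the `Hardware` interface, the `HeartbeatMonitor` liveness check, the message ID and the timeout are all hypothetical names and values made up for illustration. The point is only that the application logic talks to the low level code through one narrow interface, so a mock can stand in for it on a desktop machine.

```cpp
#include <cassert>
#include <cstdint>
#include <queue>

// Hypothetical abstraction over the low level code. On the car this would be
// implemented on top of the real drivers; here it is only an interface.
class Hardware {
public:
    virtual ~Hardware() = default;
    virtual std::uint32_t now_ms() const = 0;             // monotonic time in ms
    virtual bool poll_message(std::uint16_t& id) = 0;     // true if a message was read
    virtual void open_shutdown_circuit() = 0;             // put the car in a safe state
};

// Application logic: a liveness check that requests a shutdown when a
// heartbeat message has not been seen for longer than timeout_ms.
class HeartbeatMonitor {
public:
    HeartbeatMonitor(Hardware& hw, std::uint16_t id, std::uint32_t timeout_ms)
        : hw_(hw), id_(id), timeout_ms_(timeout_ms), last_seen_ms_(hw.now_ms()) {}

    // Intended to be called periodically from the main loop.
    void update() {
        std::uint16_t id = 0;
        while (hw_.poll_message(id)) {
            if (id == id_) last_seen_ms_ = hw_.now_ms();
        }
        // Unsigned subtraction stays correct across timer wrap-around.
        if (!tripped_ && hw_.now_ms() - last_seen_ms_ > timeout_ms_) {
            tripped_ = true;
            hw_.open_shutdown_circuit();
        }
    }

    bool tripped() const { return tripped_; }

private:
    Hardware& hw_;
    std::uint16_t id_;
    std::uint32_t timeout_ms_;
    std::uint32_t last_seen_ms_;
    bool tripped_ = false;
};

// Mock used on a regular computer: the test controls time and incoming
// messages, and records what the application logic asked the hardware to do.
class MockHardware : public Hardware {
public:
    std::uint32_t time_ms = 0;
    std::queue<std::uint16_t> inbox;
    int shutdown_requests = 0;

    std::uint32_t now_ms() const override { return time_ms; }
    bool poll_message(std::uint16_t& id) override {
        if (inbox.empty()) return false;
        id = inbox.front();
        inbox.pop();
        return true;
    }
    void open_shutdown_circuit() override { ++shutdown_requests; }
};

// A host-side test: no target hardware involved.
void test_heartbeat_timeout() {
    MockHardware hw;
    HeartbeatMonitor monitor(hw, 0x123, 100);

    hw.time_ms = 50;
    hw.inbox.push(0x123);      // heartbeat arrives at t = 50 ms
    monitor.update();
    assert(!monitor.tripped());

    hw.time_ms = 150;          // exactly 100 ms of silence: still within the limit
    monitor.update();
    assert(!monitor.tripped());
    assert(hw.shutdown_requests == 0);

    hw.time_ms = 151;          // 101 ms of silence: the check must trip
    monitor.update();
    assert(monitor.tripped());
    assert(hw.shutdown_requests == 1);
}
```

Calling `test_heartbeat_timeout()` from a `main` function, or from whichever desktop test framework you prefer, runs the check without a car in sight. Because the mock owns the clock, a timeout case that would take real time on the target runs instantly and deterministically here.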

Low level code and drivers, unfortunately, have to be run and therefore tested on the target platform. This means we can’t use the common tools, and we have to get a little creative.

More on that in a later post.