Static Code Analysis - Getting the Computer to Find Your Bugs for You

As we’ve previously covered, investing in automated development infrastructure is an important part of achieving good results in software development projects. One specific part of this infrastructure that can support the creation of good quality code is static code analysis. Static code analysis is a technique in which software is automatically analysed for potential issues without ever being executed, purely by examining the written source code. Here we’ll quickly touch on what makes static code analysis special and then get stuck into some specific technical examples based on our experience of using it in our work at Spore Lab.

The fact that the analysis is based on the source code leads to some advantages:

  • Simple to run - it doesn’t matter what target platform the software will be run on, as the software doesn’t need to be compiled or executed.

  • Quick feedback - information about issues discovered is available very promptly in the development process, and the reported information generally relates directly to a specific line of code, so there is no need to analyse detailed logs, or crash output to find issues.

  • Easy integration - most development environments already understand the concept of integrating the error messages from a compiler for the developer to see. Static analysis tools often output in the same format, making workflow integration relatively simple. A lot of effort has been put into easing the integration of these tools into common build systems.

  • Incremental options - there are a number of different options available, and it is possible to incrementally deploy these, rather than requiring a large conversion process.

  • Effort free - most analysis tools don’t require additional test code to be written by developers, so the benefits are free once the initial infrastructure is in place.

From a developer's point of view static code analysis can simply be thought of as enhanced error messages from the compiler. In this respect, it integrates with existing developer workflows quite naturally, since most developers are accustomed to being informed of issues by a compiler.

At Spore Lab, we do a lot of work using open source projects for our embedded software development. This has meant a large number of our projects have been developed in C, or C++, using the GCC compiler. Fortunately there are a lot of static code analysis solutions targeting this exact environment.

One of the simplest things to start with is to enable as much analysis within the compiler as possible. GCC defaults to being fairly relaxed about possible errors, relying on developer diligence to spot possible issues. The best way to start is to turn on more of the compiler’s warnings. GCC supports the ‘-Wall’ flag, which despite its name only turns on a common subset of warnings. This can be enhanced with ‘-Wextra -std=c11’, which enables a more comprehensive set of warnings and selects the 2011 C standard rather than GCC’s default GNU dialect. Adding ‘-Wpedantic’ additionally reports code that strays from the strict standard. As with all the methods we’re describing here, it is useful to implement them incrementally. That is, once any issues that are exposed by one set of flags have been resolved, then the next change can be made.

GCC also supports additional buffer overflow detection via the “FORTIFY_SOURCE” feature, which works together with the C library. It is enabled by adding the ‘-D_FORTIFY_SOURCE=2’ parameter, and only takes effect when optimisation (‘-O1’ or higher) is also enabled. Some of the checks are made at compile time, while others are implemented dynamically as bounds checks, so there is some run-time overhead to enabling FORTIFY_SOURCE. We generally consider this a worthwhile tradeoff, although for performance critical code further investigation may be required.
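As a rough illustration of what these flags surface, the sketch below is a small C file containing the kind of issues they are designed to catch. The function names are made up for this example, the exact diagnostics vary between GCC versions, and the fortify behaviour described in the comments assumes glibc.

```c
#include <stddef.h>
#include <string.h>

/*
 * A plain `gcc -c example.c` typically accepts this file without complaint.
 * Compare with:
 *   gcc -std=c11 -Wall -Wextra -O2 -D_FORTIFY_SOURCE=2 -c example.c
 */

/* Hypothetical helper: sums the first `len` entries of `values`. */
int sum_values(const int *values, size_t len, int scale)
{
    /* -Wextra: the parameter `scale` is never used (-Wunused-parameter). */
    int spare;                      /* -Wall: unused variable */
    int total = 0;

    for (int i = 0; i < len; i++)   /* -Wextra: signed/unsigned comparison */
        total += values[i];
    return total;
}

/* Hypothetical helper: copies `name` into a fixed-size buffer and
 * returns the length of the copy. */
size_t store_name(const char *name)
{
    char buffer[16];

    /* The compiler knows `buffer` holds 16 bytes. With fortification enabled
     * this call is typically routed to a checked variant that aborts the
     * program if `name` does not fit, rather than silently overflowing. */
    strcpy(buffer, name);
    return strlen(buffer);
}
```

Both functions behave correctly for sensible inputs, which is exactly why these problems tend to survive testing and are worth having the compiler point out.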

There is an excellent open source project for C/C++ code analysis - cppcheck. Cppcheck looks for common programmer mistakes:

  • Out of bounds checking

  • Memory leak checking

  • Detect possible null pointer dereferences

  • Check for uninitialized variables

  • Check for invalid usage of STL

  • Checking exception safety

  • Warn if obsolete or unsafe functions are used

  • Warn about unused or redundant code

  • Detect various suspicious code indicating bugs
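To make a few of these categories concrete, here is a minimal sketch of a C file with three deliberate defects of the sort listed above. The function names are invented for illustration, and the precise wording of the reports depends on the Cppcheck version. Running `cppcheck example.c` over a file like this would be expected to flag each of the marked lines.

```c
#include <stdlib.h>
#include <string.h>

/* Memory leak: the early return for an empty string skips free(). */
int count_spaces(const char *text)
{
    size_t len = strlen(text);
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return -1;
    memcpy(copy, text, len + 1);

    if (len == 0)
        return 0;               /* defect: `copy` is never freed here */

    int spaces = 0;
    for (size_t i = 0; i < len; i++) {
        if (copy[i] == ' ')
            spaces++;
    }
    free(copy);
    return spaces;
}

/* Out of bounds: valid indices are 0..3. Shown for analysis only;
 * calling this is undefined behaviour. */
void reset_levels(void)
{
    int levels[4];
    levels[4] = 0;              /* defect: index 4 is past the end */
}

/* Uninitialized variable. Shown for analysis only; calling this is
 * undefined behaviour. */
int next_id(void)
{
    int id;
    return id + 1;              /* defect: `id` is read before being set */
}
```

Note that only the leak is harmless enough to execute. The other two functions are there to be analysed, not called, which is rather the point of a tool that never runs the code.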

It also supports custom extensions. These can be used to add some intelligence about the functions in your own source base. So if the code base being analysed has functions which allocate resources that later need to be released, Cppcheck can be extended to understand these, and provide feedback on their usage.
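The sketch below shows the kind of code where this helps. The `sensor_open` and `sensor_close` pair is a hypothetical API invented for this example. In a real project such functions would often live in a separate library, so the analyser cannot see that one allocates and the other releases. Our understanding is that Cppcheck is told about such pairs through a library configuration file loaded with its `--library` option, after which it can report a path like the one marked below.

```c
#include <stdlib.h>

/* Hypothetical resource API, invented for illustration. */
typedef struct {
    int id;
} sensor_t;

static int open_sensors = 0;    /* handles currently open */

sensor_t *sensor_open(int id)
{
    sensor_t *s = malloc(sizeof *s);
    if (s != NULL) {
        s->id = id;
        open_sensors++;
    }
    return s;
}

void sensor_close(sensor_t *s)
{
    if (s != NULL) {
        open_sensors--;
        free(s);
    }
}

int sensor_open_count(void)
{
    return open_sensors;
}

/* Returns the sensor id multiplied by `scale`, or -1 on error. */
int read_scaled(int id, int scale)
{
    sensor_t *s = sensor_open(id);
    if (s == NULL)
        return -1;

    if (scale <= 0)
        return -1;              /* defect: `s` is never closed on this path */

    int value = s->id * scale;
    sensor_close(s);
    return value;
}
```

Calling `read_scaled(3, 0)` returns -1 as intended but leaves one sensor open, which is easy to miss in review and straightforward for a tool that knows the open/close pairing.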

For slightly more involved analysis, there is the scan-build feature of LLVM, also known as clang-analyzer. This is shipped as standard with Apple’s Xcode, but can also be used stand-alone. Its features are too numerous to list here, but are well summarised on the project’s site. This tool is vigorously developed, and as it is a standard part of both LLVM and Xcode it has excellent community coverage.
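One thing this analyser is good at is following individual paths through a function. The sketch below, with a made-up function name, contains a bug that only appears on one path. Wrapping the normal compile command, for example `scan-build gcc -c largest.c`, would be expected to report the null dereference along with the path that leads to it.

```c
#include <stddef.h>

/* Returns the largest entry in `values`. */
int largest(const int *values, int count)
{
    const int *best = NULL;

    for (int i = 0; i < count; i++) {
        if (best == NULL || values[i] > *best)
            best = &values[i];
    }

    /* defect: when count <= 0 the loop body never runs, `best` is still
     * NULL, and this dereference is invalid. */
    return *best;
}
```

The function works for every non-empty array, so a typical unit test passes. The analyser finds the problem by considering the case where the loop runs zero times.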

Just as Apple have invested significant effort in the LLVM backend, Facebook have their own team of engineers working in this field. Infer is their latest tool to perform static analysis. While it is primarily aimed at the mobile market, it is still a general purpose tool for C, Java and Objective-C. It is relatively new, and at this stage doesn’t have the coverage or capability of most of the other tools we’ve looked at, but it is being heavily developed. As it is geared towards mobile platforms, it has a little more logic around standard mobile libraries than other tools, so may provide more insight in these areas.
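As a small sketch of the sort of C issue Infer targets, the hypothetical function below leaks a file handle on one path. Running something like `infer run -- gcc -c count_lines.c` would be expected to report a resource leak at the marked return, though the exact report depends on the Infer version.

```c
#include <stdio.h>

/* Counts newline characters in the file at `path`.
 * Returns -1 if the file cannot be opened. */
long count_lines(const char *path)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    int c = fgetc(fp);
    if (c == EOF)
        return 0;               /* defect: `fp` is never closed for empty files */

    long lines = 0;
    while (c != EOF) {
        if (c == '\n')
            lines++;
        c = fgetc(fp);
    }

    fclose(fp);
    return lines;
}
```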

Finally, in addition to the above open source tools, there are also some excellent closed source tools. In particular, Coverity offer a product called ‘Code Advisor’, which provides excellent code analysis. There is both a freely available option, for open source projects, and a paid for version for closed source projects. Their paid version offers a free trial option, which we would recommend to anyone as an easy way to see the benefits of such analysis.

While static code analysis tools cannot by themselves ensure your code is perfect, they can definitely make a positive contribution to its overall quality. With the great tools available, the extremely low cost of utilizing them, and the high cost of a developer manually finding and fixing bugs that the tools can detect automatically, it’s an easy choice to find the time to add them to your project’s build automation infrastructure.