[trustable-software] Model Checkers

Fri Sep 21 14:50:31 BST 2018

Hi Ben,

Thanks for the feedback. Responses inline.

> I tried CBMC rather than ESBMC because it seemed easier to install and use (was 
> in apt). I found when running it that it stuck for about 20 minutes trying to 
> test every iteration of a potentially boundless loop.
> I believe this is the correct behaviour for a model checker, but it's not very 
> useful for the purposes of running over an entire Linux distribution because 
> we'll likely hit this issue thousands of times.
>
> The documentation for CBMC does tell you how to specify specific bounds for 
> loops to work around this issue, however this would effectively remove one of 
> the checks, and would be a very manual (impractical) process.

(Reposting comment for the mailing list)
It may be possible to prove properties of long loops by induction. However, I
agree that this will take too much developer time to scale to anywhere near the
amount of code we would need to check.

> On further reading it seems to me like these bounds checkers are more designed 
> for very constrained embedded systems where you actually want to ensure these 
> things are known or knowable.

Yes this is the context in which I have heard of them too.

> I then looked into MOPS which is described by the paper you referenced in [2], 
> the paper was released in 2005 and the latest release of MOPS is from 2004.
> The website for MOPS states that it's not a checker itself, but a framework for 
> adding your own checks, so I highly suspect that the reason it could be run over 
> an entire distro is that they ran a very minimal set of checks; the paper also 
> suggests this as they only found 138 bugs in the distribution, while I believe 
> it's reasonable to assume that there are a lot more than that.

Agreed. However, I think some variation of this will be the only practical
solution because anything more rigorous will have the issues you encountered
with CBMC.

> I guess to bring this closer to reality, the discussion we really need to have is:
> How can we practically produce useful static code analysis results for a full 
> distro using BuildStream or similar?

I agree, although this will be very challenging for the reasons you specified. I
think this is likely to be a balancing act between false positives and missed
bugs - which is not good news for safety. However, if we can at-least find some
bugs using something similar to MOPS, it will help.

Regards,
Tom