[C-safe-secure-studygroup] Static vs dynamic analysis: who cares?

Thu Jan 5 18:51:52 UTC 2017

Hi there.

I think that the current discussion on static analysis vs dynamic analysis
is a distraction from (my perception) of the interests of the study group.
In this writing, I will try to explain why.

Traditionally, a "static analysis" of a program is any analysis that
can be conducted on the program without executing it.  And a "dynamic
analysis" of a program is any analysis that is based on the observation
of the behavior of a set of program executions.  This distinction
should be taken with care.  The crucial point is what it means to
execute a program:

1) the entire program can be executed on the intended target and the
   intended operational environment;
2) parts of the program can be executed (with tests input generated in some
   way) on the intended target and the intended operational environment;
3) a more or less related program (e.g., compiled for a target different
   from the intended target) can be executed, entirely or in parts,
   on an instrumented environment that is different from the intended
   operational environment;
4) in abstract interpretation, (parts of) a program can be abstractly
   executed on a domain of data descriptions;
5) (parts of) a program can be symbolically executed.

(Symbolic and concrete execution are mixed in so-called "concolic"
execution, but I do not want to overcomplicate things.)
I think there is no doubt that an analysis based on (1) is a dynamic
analysis, that (4) and (5) are techniques used in various static
analyzers, and that for (2) and (3) we would need more details in order
to understand whether we are talking about a static or a dynamic analysis.
The point, though, is: is this distinction really buying us anything?

I believe it is unfortunate that TS 17961:2016 mentions "static
analysis" 12 times, sometimes in a way that seems to exclude tools
that use techniques that are based on some sort of (abstract, symbolic,
instrumented) execution of (parts of) the program.  My point is that,
for the scope of this working group, we should not care about slippery
distinctions about analysis techniques.  What we should care about
is automation.  To quote TS 17961:2016:

    The purpose of this Technical Specification is to specify
    analyzable secure coding rules that can be automatically enforced
    to detect security flaws in C-conforming applications.

Automatically enforceable, this is the key.  And another point I like
in TS 17961:2016 is in the "Scope" section:

    This Technical Specification does not specify
      - the mechanism by which these rules are enforced [...]
      [...]

I propose to focus on these points and to forget about the distinction
static analysys vs dynamic analysis.  What we want are automatic tools
where we push a button to start the analysis and, after a finite amount
of time (perhaps enforced by the provision of a stop button which
we can push when we want), they provide a set of results.
Such results might contain false positives and false negatives:
a conforming tool vendor will document, for each rule, whether
false positives and/or false negatives are possible, so that we know
whether compliance has been proved or if we have to do more work to
prove it.  The number of false positives and false negatives
is considered a quality of implementation issue by TS 17961:2016,
the idea being that the market will do what it takes to promote the
development of better tools.  Moreover, for a tool to be conforming
it is enough that "[f]or each rule, the analyzer reports a diagnostic
for at least one program that contains a violation of that rule."

Let us take an example:

   5.2 Accessing freed memory [accfree]
   Rule
   After an allocated block of dynamic storage has been deallocated by
   a memory management function, the evaluation of any pointers into
   the freed memory, including being dereferenced or acting as an
   operand of an arithmetic operation, type cast, or right-hand side
   of an assignment, shall be diagnosed.

There are many techniques that allow to deal (to some extent) with this
undecidable rule:

- abstract interpretation;
- (symbolic) model checking;
- random generation of test inputs;
- constraint-based generation of test inputs;
- language subsetting (e.g., MISRA C) and static analysis to prove
  the program is in the subset;
- valgrind;
- ...

Consider the use of valgrind: I run (a variant of) the program
under valgrind for 10 minutes, by automatically generating inputs
any way I like.  This technique is able to detect violations of
[accfree] in a finite amount of time.  It has false negatives
(of course) but (depending on the variant and on the generated inputs)
very few or no false positives.  All in all it may be better than
many tools out there: why excluding it a priori?

My point is that our worry should be that the rule is specified in
unambiguous terms, so that it is amenable to automatic cheking.  By
this I mean, that the specification should be precise enough that
consensus can be reached on whether a diagnostic given (resp., not
given) by a tool has to be considered or not a false positive (resp.,
a false negative).  This is not easy, but it is crucial.
Distinguishing the automatic analysis technique(s) employed (static,
dynamic, you-name-it) is, instead, irrelevant.
Kind regards,

   Roberto

-- 
     Prof. Roberto Bagnara

Applied Formal Methods Laboratory - University of Parma, Italy
mailto:bagnara at cs.unipr.it
                              BUGSENG srl - http://bugseng.com
                              mailto:roberto.bagnara at bugseng.com