[C-safe-secure-studygroup] Static vs dynamic analysis: who cares?

Robert Seacord rcseacord at gmail.com
Thu Jan 5 19:26:56 UTC 2017


This appears to be a reasonable and sound argument.  I think my biggest
concern is as follows (so I would like to hear your views on this):

1. This group should involve analyzer vendors, as analyzer vendors are the
ones who will conform to the standard and are consequently our primary
audience.
2. We are hoping that analyzer vendors will want to conform to the
standard, because they will see this as a market opportunity.
3. Analyzer vendors could claim partial conformance but would prefer to
claim conformance, as partial conformance is fairly meaningless (I'm sure I
could claim partial conformance with grep).
4. Different analysis techniques are more or less capable of diagnosing
different classes of problems.
5. Consequently, it will be difficult for a number of vendors, implementing
different analysis techniques, to agree on a non-trivial rule set which
they all feel they can enforce.

What do you think?

rCs

On Thu, Jan 5, 2017 at 1:51 PM, Roberto Bagnara via
C-safe-secure-studygroup <c-safe-secure-studygroup at lists.trustable.io>
wrote:

> Hi there.
>
> I think that the current discussion on static analysis vs dynamic analysis
> is a distraction from (my perception of) the interests of the study group.
> In this message, I will try to explain why.
>
> Traditionally, a "static analysis" of a program is any analysis that
> can be conducted on the program without executing it.  And a "dynamic
> analysis" of a program is any analysis that is based on the observation
> of the behavior of a set of program executions.  This distinction
> should be taken with care.  The crucial point is what it means to
> execute a program:
>
> 1) the entire program can be executed on the intended target and the
>    intended operational environment;
> 2) parts of the program can be executed (with test inputs generated in some
>    way) on the intended target and the intended operational environment;
> 3) a more or less related program (e.g., compiled for a target different
>    from the intended target) can be executed, entirely or in parts,
>    on an instrumented environment that is different from the intended
>    operational environment;
> 4) in abstract interpretation, (parts of) a program can be abstractly
>    executed on a domain of data descriptions;
> 5) (parts of) a program can be symbolically executed.
>
> (Symbolic and concrete execution are mixed in so-called "concolic"
> execution, but I do not want to overcomplicate things.)
> I think there is no doubt that an analysis based on (1) is a dynamic
> analysis, that (4) and (5) are techniques used in various static
> analyzers, and that for (2) and (3) we would need more details in order
> to understand whether we are talking about a static or a dynamic analysis.
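>
> As a purely illustrative example (mine, not from the TS), consider
> this small fragment:
>
>     /* Illustrative only: one defect that the various kinds of
>        "execution" above find in different ways. */
>     int scale(int n, int d) {
>         if (n > 100)
>             return n / d;   /* divides by zero when n > 100 and d == 0 */
>         return n;
>     }
>
> An analysis based on (2) or (3) finds the division by zero only if an
> input with n > 100 and d == 0 happens to be exercised; an abstract
> execution in the sense of (4) over, say, intervals will flag the
> division (possibly as a false positive, if that combination of values
> is infeasible in the calling contexts); a symbolic execution in the
> sense of (5) finds it by solving the path condition n > 100 and d == 0.
>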
> The point, though, is: is this distinction really buying us anything?
>
> I believe it is unfortunate that TS 17961:2016 mentions "static
> analysis" 12 times, sometimes in a way that seems to exclude tools
> that use techniques that are based on some sort of (abstract, symbolic,
> instrumented) execution of (parts of) the program.  My point is that,
> for the scope of this working group, we should not care about slippery
> distinctions about analysis techniques.  What we should care about
> is automation.  To quote TS 17961:2016:
>
>     The purpose of this Technical Specification is to specify
>     analyzable secure coding rules that can be automatically enforced
>     to detect security flaws in C-conforming applications.
>
> Automatically enforceable, this is the key.  And another point I like
> in TS 17961:2016 is in the "Scope" section:
>
>     This Technical Specification does not specify
>       - the mechanism by which these rules are enforced [...]
>       [...]
>
> I propose to focus on these points and to forget about the distinction
> between static analysis and dynamic analysis.  What we want are automatic tools
> where we push a button to start the analysis and, after a finite amount
> of time (perhaps enforced by the provision of a stop button which
> we can push when we want), they provide a set of results.
> Such results might contain false positives and false negatives:
> a conforming tool vendor will document, for each rule, whether
> false positives and/or false negatives are possible, so that we know
> whether compliance has been proved or if we have to do more work to
> prove it.  The number of false positives and false negatives
> is considered a quality of implementation issue by TS 17961:2016,
> the idea being that the market will do what it takes to promote the
> development of better tools.  Moreover, for a tool to be conforming
> it is enough that "[f]or each rule, the analyzer reports a diagnostic
> for at least one program that contains a violation of that rule."
>
> Let us take an example:
>
>    5.2 Accessing freed memory [accfree]
>    Rule
>    After an allocated block of dynamic storage has been deallocated by
>    a memory management function, the evaluation of any pointers into
>    the freed memory, including being dereferenced or acting as an
>    operand of an arithmetic operation, type cast, or right-hand side
>    of an assignment, shall be diagnosed.
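>
> To fix ideas, here is a minimal, purely illustrative program containing
> a violation of this rule (recall that conformance only requires a
> diagnostic for at least one such program):
>
>     #include <stdlib.h>
>
>     int main(void) {
>         int *p = malloc(sizeof *p);  /* allocated block of dynamic storage */
>         if (p == NULL)
>             return 1;
>         *p = 42;
>         free(p);                     /* deallocated by a memory management function */
>         return *p;                   /* p is evaluated and dereferenced after the free */
>     }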
>
> There are many techniques that make it possible to deal (to some extent)
> with this undecidable rule:
>
> - abstract interpretation;
> - (symbolic) model checking;
> - random generation of test inputs;
> - constraint-based generation of test inputs;
> - language subsetting (e.g., MISRA C) and static analysis to prove
>   the program is in the subset;
> - valgrind;
> - ...
>
> Consider the use of valgrind: I run (a variant of) the program
> under valgrind for 10 minutes, automatically generating inputs in
> any way I like.  This technique is able to detect violations of
> [accfree] in a finite amount of time.  It has false negatives
> (of course) but (depending on the variant and on the generated inputs)
> very few or no false positives.  All in all it may be better than
> many tools out there: why exclude it a priori?
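> (Concretely, the kind of use I have in mind is something like
>
>     valgrind --tool=memcheck --error-exitcode=1 ./prog < generated_input
>
> repeated under a time budget, where ./prog and the generated inputs are
> whatever the test harness provides: memcheck reports an "Invalid read"
> or "Invalid write" when freed heap memory is accessed, which is
> precisely a violation of [accfree].)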
>
> My point is that our worry should be that the rule is specified in
> unambiguous terms, so that it is amenable to automatic checking.  By
> this I mean that the specification should be precise enough that
> consensus can be reached on whether a diagnostic given (resp., not
> given) by a tool is to be considered a false positive (resp., a false
> negative) or not.  This is not easy, but it is crucial.
> Distinguishing the automatic analysis technique(s) employed (static,
> dynamic, you-name-it) is, instead, irrelevant.
> Kind regards,
>
>    Roberto
>
> --
>      Prof. Roberto Bagnara
>
> Applied Formal Methods Laboratory - University of Parma, Italy
> mailto:bagnara at cs.unipr.it
>                               BUGSENG srl - http://bugseng.com
>                               mailto:roberto.bagnara at bugseng.com
>