[trustable-software] Trustable Software Engineering

Niall Dalton niall.dalton at gmail.com
Tue Jul 19 15:44:57 UTC 2016


Paul Sherwood <paul.sherwood at codethink.co.uk> wrote:

> By this I mean software which satisfies some or all of the following
> criteria
> - may cause physical harm to people either directly or indirectly
> - may cause psychological harm
> - may cause financial harm or damage productivity
> - may cause harm to the environment, either directly or as a side-effect
> - if hacked, could be exploited to cause any of the above



I think you need to extend that to data too. Effectively, the current
machine learning trend is the equivalent of money laundering our data sins
into software. Sometimes the problem is poor training data chosen by the
creators[1], sometimes it's abuse of the software by even a minority of
malicious users[2].

As terrible as they are, these are relatively benign compared to other
possible uses of the technology either directly in the physical world, or
indirectly (e.g. via the capital markets -- there are funded studies
underway on how to counteract 'weaponized' trading for this reason).



> I believe our overall objective has to be Trustable Software, i.e.
> - we know where it comes from
> - we know how to build it
> - we can reproduce it
> - we know what it does
> - it does what it is supposed to do
> - we can update it and be confident it will not break or regress
> and perhaps most importantly...
> - we have some confidence that it won't harm our communities and our
> children



These are all great goals. I used to think we could write such software --
and write it with tools that promoted safety, such as better type systems,
proofs of safety, and so on. I think that's theoretically possible but not
an especially feasible approach.

- Most software needs to be written by relatively unskilled programmers who
cannot or will not use such tools (nor should they). In an ideal world,
most programmers would work in fairly high-level but reasonably safe tools,
yet they will naturally write unsafe apps anyway. I think we not only have
to live with this, but encourage it.

Even back when the web was simple, that was easy enough to do: just
naively put the wrong database in public view. I really think some of the
recent attempts to cram our terrible software legacy onto the web are not
only misguided but dangerous by your definition -- e.g. using WebAssembly
to compile existing C++ code for execution in the browser. I understand
the allure of dusty decks, but sometimes we should grab the chance to burn
the boats.

- Skilled programmers can't rewrite the world in better tools -- we're
going to live with existing code for a long time, and given the rate at
which we produce new legacy code, it's hard to see that changing any time
soon. We may see tools like certified compilers[3] used in safety-critical
areas, but they are unlikely to see general use.

So are we doomed? I don't think so, but we need to change our approach.

- we have too much software at the low levels. While the upper layers will
continue their descent into madness, I think we need to rip out as much as
we can at the lower levels. Containers etc. are overhyped, but they offer
a glimmer of a world where we strip the dependencies of applications to a
bare minimum, specify them, and enforce that specification in some
virtualized (in a general sense) environment. Personally I'd like the OS
to go back to securely multiplexing the hardware, and get out of the way
on the rest. (My current system uses Linux to boot the machine, confining
it to core 0 and a little bit of memory, while my userspace code takes
direct control of the devices of interest and never makes a system call
once up and running.)
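
To make that last point concrete, here is a rough sketch of the general
idea -- not my actual setup; the core number, PCI address and register
offset below are made up for illustration. Pin a process to a core the
kernel has been told to leave alone (e.g. booted with isolcpus=), then map
a device's PCI BAR straight into userspace via sysfs so the data path
needs no system calls:

    import mmap
    import os
    import struct

    # Pin this process to a core the scheduler was told to leave alone
    # (e.g. the kernel was booted with isolcpus=1). Core 1 is illustrative.
    os.sched_setaffinity(0, {1})

    # Map BAR0 of a PCI device directly into userspace via sysfs. The
    # device address is made up; this needs root and a device whose
    # kernel driver has been unbound.
    fd = os.open("/sys/bus/pci/devices/0000:03:00.0/resource0",
                 os.O_RDWR | os.O_SYNC)
    bar0 = mmap.mmap(fd, 4096)

    # From here on, device access is plain loads and stores into the
    # mapping -- no system calls on the data path.
    (reg,) = struct.unpack_from("<I", bar0, 0)
    print(hex(reg))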

- we have vast compute and storage at this point -- let's throw it at the
problem. I want Baserock-style tooling used in a continuous attempt to
find versions of things that work together. (E.g. find me a version of
kafka+spark+cassandra+linux that appears to run my app ok, and then keep
looking as we all change arbitrary bits of the stacks.)
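
As a strawman for what that tooling's core loop might look like -- the
component versions and the test command below are placeholders, not a
real compatibility matrix -- enumerate candidate combinations, build and
test each one, and keep a record of the ones that work as the stacks move:

    import itertools
    import os
    import subprocess

    # Candidate versions to explore -- placeholders, not recommendations.
    candidates = {
        "linux":     ["4.4.14", "4.6.4"],
        "kafka":     ["0.9.0.1", "0.10.0.0"],
        "spark":     ["1.6.2", "2.0.0-preview"],
        "cassandra": ["2.2.7", "3.7"],
    }

    def passes(combo):
        # Build the stack at these versions and run the app's integration
        # tests. "run-integration-tests" stands in for whatever harness
        # actually assembles and exercises the stack.
        env = dict(os.environ)
        for name, version in combo.items():
            env[name.upper() + "_VERSION"] = version
        return subprocess.run(["run-integration-tests"],
                              env=env).returncode == 0

    known_good = []
    for versions in itertools.product(*candidates.values()):
        combo = dict(zip(candidates, versions))
        if passes(combo):
            known_good.append(combo)

    print("%d working combinations found" % len(known_good))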

- given our resources, we can use techniques like multi-versioning in wide
deployment. E.g. build reliable services out of unreliable parts by having
multiple versions / builds / integrations in flight, so a systemic bug
can't take out the entire thing. Tracking this and automatically
hill-climbing towards more stability in the face of a high rate of change
will require a lot of new tooling, and large amounts of hardware thrown at
it.
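
A minimal sketch of the routing half of that idea -- the build ids, the
floor weight and the dispatch call are all hypothetical: spread requests
across several live builds, track failures per build, and periodically
shift weight towards whichever builds are behaving, without ever letting
one build take all the traffic:

    import random
    from collections import defaultdict

    builds = ["build-a", "build-b", "build-c"]   # hypothetical build ids
    weights = {b: 1.0 for b in builds}
    requests = defaultdict(int)
    failures = defaultdict(int)

    def call(build, request):
        # Stand-in for dispatching the request to one deployed build.
        raise NotImplementedError

    def handle(request):
        # Weighted random choice of a build for this request.
        r = random.uniform(0, sum(weights.values()))
        for build in builds:
            r -= weights[build]
            if r <= 0:
                break
        requests[build] += 1
        try:
            return call(build, request)
        except Exception:
            failures[build] += 1
            raise

    def rebalance():
        # Hill-climb: give more traffic to builds with lower observed
        # failure rates, but never drop a build to zero -- the diversity
        # is the point, even for the flakier versions.
        for build in builds:
            rate = failures[build] / max(requests[build], 1)
            weights[build] = max(0.05, 1.0 - rate)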

To sum it up -- it's all crap, the rate of crap production is increasing,
and we need to concentrate on far better shovels rather than shaking our
fists at the crap producers (of which I'm one ;-).

[1]
http://blogs.wsj.com/digits/2015/07/01/google-mistakenly-tags-black-people-as-gorillas-showing-limits-of-algorithms/

[2]
https://www.theguardian.com/technology/2016/mar/24/tay-microsofts-ai-chatbot-gets-a-crash-course-in-racism-from-twitter


[3] http://compcert.inria.fr/doc/