[trustable-software] Trustable Software Engineering

Paul Sherwood paul.sherwood at codethink.co.uk
Thu Jul 28 12:03:12 UTC 2016


On 2016-07-19 16:44, Niall Dalton wrote:
> Paul Sherwood <paul.sherwood at codethink.co.uk> wrote:
>
>> By this I mean software which satisfies some or all of the following
>> criteria
>> - may cause physical harm to people either directly or indirectly
>> - may cause psychological harm
>> - may cause financial harm or damage productivity
>> - may cause harm to the environment, either directly or as a
>> side-effect
>> - if hacked, could be exploited to cause any of the above  
>
> I think you need to extend that to data too. Effectively, the current
> machine learning trend is the equivalent of money laundering our data
> sins into software. Sometimes that is poor training data by the
> creators[1], sometimes it's abuse of the software by even a minority
> of malicious users[2].

I hadn't really thought about this aspect properly - I naively assumed 
that learning algorithms are inherently able to adapt, but clearly this 
is another example of garbage in, garbage out. Is anyone even thinking 
about how to measure what the gaps are in any given system/algorithm's 
'knowledge'? Or will everyone just plead 'unknown unknowns'?
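
To make the question concrete: even something as naive as flagging
queries that sit far from anything in the training set would be a
start. A toy sketch in Python (the feature vectors, the distance
metric and the threshold are all made up for illustration):

    # Naive 'knowledge gap' detector: flag queries whose nearest
    # training example is far away. Purely illustrative; a real system
    # would need a meaningful feature space and threshold.
    import numpy as np

    def coverage_gaps(train_features, query_features, threshold=2.0):
        gaps = []
        for i, q in enumerate(query_features):
            nearest = np.linalg.norm(train_features - q, axis=1).min()
            if nearest > threshold:
                gaps.append(i)          # query i looks like a gap
        return gaps

    train = np.random.rand(1000, 8)       # stand-in training data
    queries = np.random.rand(50, 8) * 3   # some fall well outside
    print(coverage_gaps(train, queries))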

> As terrible as they are, these are relatively benign compared to 
> other
> possible uses of the technology either directly in the physical 
> world,
> or indirectly (e.g. via the capital markets -- there are funded
> studies underway on how to counteract weaponized trading for this
> reason).

Scary stuff. But a colleague yesterday relayed a story of how a young 
geek acquired a botnet of 75k machines overnight, with an 'algorithm' 
based on 10 common passwords. I assume there must be plenty of folks 
weaponising that use case with learning algorithms now.

>> I believe our overall objective has to be Trustable Software, i.e.
>> - we know where it comes from
>> - we know how to build it
>> - we can reproduce it
>> - we know what it does
>> - it does what it is supposed to do
>> - we can update it and be confident it will not break or regress
>> and perhaps most importantly...
> - we have some confidence that it won't harm our communities and our
>> children
>
> These are all great goals. I used to think we could write such
> software -- and write it with tools that promoted safety, such as
> better type systems, proofs of safety, and so on. I think that's
> theoretically possible but not an especially feasible approach.

Sadly, I agree.

> - Most software needs to be written by relatively unskilled
> programmers who cannot or will not use such tools (nor should they).
> In an ideal world, most programmers work in pretty high-level but
> somewhat safe tools. But they will naturally write unsafe apps
> anyway. I think we not only have to live with this, but encourage it.

Encourage folks writing unsafe apps? Do you mean 'accept it, and 
plan/design/architect accordingly', or something else?

> Even back when the web was simple, that was easy enough to do, by
> naively putting the wrong database in public view. I really think
> some of the recent attempts to cram our terrible software legacy
> onto the web are not only misguided but dangerous by your definition
> -- e.g. WebAssembly to enable compilation of existing C++ code for
> execution. I understand the allure of dusty decks but sometimes we
> should grab the chance to burn the boats.

One thing that would help with justifying that decision would be if 
we could establish objective ways to measure the true cost of 
maintenance vs the cost of redevelopment. Trouble is, most people just 
guess, or cite COCOMO, or get hooked on a silver-bullet product.

I did some work on this last year [1], concluding that git-active-days 
was a useful starting point, but so far I've not come across any solid 
work by others in this area, which is disappointing. Any ideas?
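
For reference, the core of that measurement is tiny: count the
distinct calendar days on which commits actually landed. A minimal
sketch in Python, assuming the git CLI is on the path:

    # 'git active days': number of distinct days with at least one
    # commit - a crude proxy for effort actually spent on a codebase.
    import subprocess

    def git_active_days(repo_path):
        out = subprocess.run(
            ["git", "-C", repo_path, "log",
             "--pretty=format:%ad", "--date=short"],
            capture_output=True, text=True, check=True)
        return len(set(out.stdout.splitlines()))

    print(git_active_days("."))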

> - Skilled programmers can't rewrite the world in better tools --
> we're going to live with existing code for a long time, and given
> the rate of producing more legacy code, it's hard to see that
> changing any time soon. We may see use of tools like certified
> compilers[3] in safety-critical areas but unlikely in general use.

I wonder if we should encourage companies to pitch their magic-bullet 
solutions here, so we can 'review' them? :)

> So are we doomed? I don't think so, but we need to change our
> approach.
>
> - we have too much software at the low levels. While the upper layers
> will continue their descent into madness, I think we need to rip out
> as much as we can at the lower levels.

Now this is an interesting idea, and it aligns with a design that a 
customer described to me for maintaining simplicity in safety-critical 
systems. I wonder how we could get some 'mind share' on this.

> Containers etc are overhyped,
> but they offer a glimmer of a world where we start to strip the
> dependencies of applications to a bare minimum, specify them and
> enforce it in some virtualized (in a general sense) environment.
> Personally I'd like the OS to go back to securely multiplexing the
> hardware, and get out of the way on the rest. (My current system
> uses Linux to boot the machine, confining it to core 0 and a little
> bit of memory, while my userspace code takes direct control of the
> devices of interest and never makes a system call once up and
> running).

Sounds like a simple strategy - as far as possible, don't rely on 
anyone else's code. That's fine for you and your team, but what about 
normal humans facing similar problems? :-)
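
That said, even normal humans can take a small step in that 
direction, e.g. pinning a critical process to a dedicated core so the 
rest of the system stays out of its way. A minimal Linux-only sketch 
(real kernel-bypass I/O, as you describe, is a much bigger job):

    # Confine the current process to a single CPU core. This only
    # scratches the surface of the scheme described above - user-space
    # device drivers are far more involved. Assumes a multi-core box.
    import os

    os.sched_setaffinity(0, {1})   # pin this process to core 1
    print("running on cores:", os.sched_getaffinity(0))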

> - we have vast compute and storage at this point -- let's throw it
> at the problem. I want Baserock-style tooling being used for a
> continuous attempt to find versions of things that work together.
> (E.g. find me a version of kafka+spark+cassandra+linux that appears
> to run my app ok, and then keep looking as we all change arbitrary
> bits of the stacks).

Again, this is a very interesting idea. I've kicked off a tiny project 
to look at machine learning for whole-stack upgrade paths. I'll point 
the guys at your requirement statement.
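
Even before any learning is involved, the naive version of that loop 
is clear enough: enumerate candidate version combinations, build and 
test each one, and keep searching as the stacks change underneath us. 
A toy sketch (the components, versions and test hook are all 
placeholders):

    # Brute-force search for a version combination that passes the
    # app's tests. run_tests() is a placeholder; a real system would
    # build and deploy each candidate stack before testing it.
    import itertools

    versions = {
        "kafka":     ["0.9.0", "0.10.0"],
        "spark":     ["1.6.2", "2.0.0"],
        "cassandra": ["2.2.7", "3.7"],
        "linux":     ["4.4", "4.6"],
    }

    def run_tests(combo):
        return hash(frozenset(combo.items())) % 3 == 0  # fake result

    names = list(versions)
    for picks in itertools.product(*(versions[n] for n in names)):
        combo = dict(zip(names, picks))
        if run_tests(combo):
            print("working stack:", combo)
            break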

> - given our resources, we can use techniques like multi-versioning
> in wide deployment. E.g. build reliable services out of unreliable
> parts by having multiple versions / builds / integrations in flight
> so a systemic bug can't take out the entire thing. Tracking this and
> automatically hill-climbing towards more stability in the face of a
> high rate of change... we need to build a lot of tools there, and
> throw large amounts of hardware at it.
>
> To sum it up -- it's all crap, the rate of crap production is
> increasing, and we need to concentrate on far better shovels

Absolutely.
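
And the multi-versioning shovel could start small: keep several 
builds in flight, score each by observed failures, and steer traffic 
towards whichever currently looks most stable. A toy sketch with 
simulated failure rates standing in for real builds:

    # Epsilon-greedy traffic steering across multiple in-flight
    # builds: mostly route to the most stable build so far, sometimes
    # explore, so one systemic bug can't take out the whole service.
    import random

    builds = {"v1-gcc": 0.02, "v1-clang": 0.05, "v2-gcc": 0.20}
    failures = {b: 0 for b in builds}
    served = {b: 1 for b in builds}   # start at 1: avoid div-by-zero

    for _ in range(10000):
        if random.random() < 0.1:                 # explore
            b = random.choice(list(builds))
        else:                                     # exploit best
            b = min(builds, key=lambda v: failures[v] / served[v])
        served[b] += 1
        if random.random() < builds[b]:           # simulated bug
            failures[b] += 1

    for b in builds:
        rate = failures[b] / served[b]
        print(b, "traffic:", served[b], "failure rate:", round(rate, 3))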

> rather than shaking our fists at the crap producers (of which I'm
> one ;-).

I'm one too. But I definitely write less code than you :-)

br
Paul

[1] https://lwn.net/Articles/659241/


