[trustable-software] Trustable Software Engineering

Niall Dalton niall.dalton at gmail.com
Thu Jul 28 18:00:16 UTC 2016


On Thu, Jul 28, 2016 at 5:03 AM, Paul Sherwood <
paul.sherwood at codethink.co.uk> wrote:

> I hadn't really thought about this aspect properly - I naively assumed
> that learning algorithms are inherently able to adapt, but clearly this is
> another example of garbage in, garbage out. Is anyone even thinking about
> how to measure what the gaps are in any given system/algorithm's
> 'knowledge'? Or will everyone just plead 'unknown unknowns'?
>


Folks are certainly working on it, but it turns out there's sadly a
set of human-induced biases baked in all over the place. It's going to
take a lot of work to clean up; not necessarily new algorithms, but new
ways to cross-check and remove deliberate or inadvertent garbage in.



>> - Most software needs to be written by relatively unskilled
>> programmers who cannot or will not use such tools (nor should they).
>> In an ideal world, most programmers work in pretty high-level but
>> somewhat safe tools. But they'll naturally write unsafe apps anyway. I
>> think we not only have to live with this, but encourage it.
>>
>
> Encourage folks writing unsafe apps? Do you mean 'accept it, and
> plan/design/architect accordingly', or something else?
>


Yep, accept it and build accordingly. Of course we don't give up on
giving people safer tools, but we should assume they'll build unsafe
things either way. Now, how do we give them a platform on which some
extra safety is provided?
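
To make that a bit more concrete, here's a minimal sketch of the kind of
thing I mean -- just an illustration, not a finished answer. It assumes a
Linux kernel with unprivileged user namespaces enabled, and it simply runs
whatever command you give it inside a fresh, empty network namespace, so
however unsafe the app is, it can't reach the network:

/* nonet.c -- run an untrusted command with no network access.
 *
 * Build: cc -o nonet nonet.c
 * Usage: ./nonet some-untrusted-app --its-args
 */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s command [args...]\n", argv[0]);
        return 1;
    }

    /* New user namespace so no privilege is needed, plus a new, empty
     * network namespace: only a downed loopback interface exists. */
    if (unshare(CLONE_NEWUSER | CLONE_NEWNET) != 0) {
        perror("unshare");
        return 1;
    }

    execvp(argv[1], &argv[1]);
    perror("execvp");   /* only reached if the exec fails */
    return 1;
}

Real platforms (containers, unikernels and so on) do far more than this,
but the principle is the same: the platform takes whole classes of
mistakes off the table before the app gets a say.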



>> Even back when the web was simple, that was easy enough to do, by
>> naively putting the wrong database in public view. I really think some
>> of the recent attempts to cram our terrible software legacy onto the
>> web are not only misguided but dangerous in your definition -- e.g.
>> WebAssembly to enable compilation of existing C++ code for execution.
>> I understand the allure of dusty decks but sometimes we should grab
>> the chance to burn the boats.
>>
>
> One thing that would help with justifying that decision would be if we
> could establish objective ways to measure the true cost of maintenance vs
> cost of redevelopment. Trouble is most people just guess, or cite cocomo,
> or get hooked on a silver-bullet product.
>
> I did some work on this last year [1] concluding that git-active-days was
> a useful start point, but so far I've not come across any valid work by
> others in this, which is disappointing. Any ideas?
>


I appreciate what you're asking but I don't really have a clue :-)

My personal approach is to sidestep the problem by skating on the fact that
we're still in a period where massive spurts of growth happen in unexpected
ways. There are plenty of places where the legacy stuff is tiny compared to
the new growth -- so I wuss out and just work on the new things ;-)

I'll return to this a little below.


>> Containers etc. are overhyped,
>> but they offer a glimmer of a world where we start to strip the
>> dependencies of applications to a bare minimum, specify them and
>> enforce it in some virtualized (in a general sense) environment.
>> Personally I'd like the OS to go back to securely multiplexing the
>> hardware, and get out of the way on the rest. (My current system uses
>> linux to boot the machine, confining it to core 0 and a little bit of
>> memory, while my userspace code takes direct control of the devices of
>> interest and never makes a system call once up and running).
>>
>
> Sounds like a simple strategy - as far as possible, don't rely on anyone
> else's code. That's fine for you and your team, but what about normal
> humans facing similar problems? :-)
>


Well, it can still be applied. I'll describe something I'm in two minds
about (mainly because it at least temporarily makes debugging and analysis
harder, but that may just be a tooling issue).

Remember the old "tree shakers" for Lisp or Smalltalk systems? Essentially
they'd reduce the size of the resulting image by making a closed-world
assumption and removing any code that wasn't directly used by the
application. Unikernels, as one example, try to do the same for
applications by linking "just enough" systems code to get the thing to
run, with varying degrees of human involvement. Some companies have built
the same for the equivalent of Docker containers. Reduce the goop around
an app from 1GB to 10MB of stuff that doesn't tend to change too quickly,
and you've certainly simplified the problem. You could be more aggressive
still and simply not support all the system calls (e.g. make it impossible
for some piece of code to open a network socket).
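
As a tiny sketch of that last point (assuming a Linux kernel with seccomp
filtering and libseccomp available): allow every system call by default,
but make socket(2) fail with EPERM, so whatever code runs after this point
in the process simply cannot open a network socket.

/* nosock.c -- build: cc -o nosock nosock.c -lseccomp */
#include <errno.h>
#include <stdio.h>
#include <seccomp.h>

int main(void)
{
    /* Allow everything by default... */
    scmp_filter_ctx ctx = seccomp_init(SCMP_ACT_ALLOW);
    if (ctx == NULL)
        return 1;

    /* ...except socket(), which now returns -1 with errno == EPERM. */
    if (seccomp_rule_add(ctx, SCMP_ACT_ERRNO(EPERM), SCMP_SYS(socket), 0) < 0 ||
        seccomp_load(ctx) < 0) {
        seccomp_release(ctx);
        return 1;
    }
    seccomp_release(ctx);

    /* ... run or exec the application code here ... */
    printf("socket() is now off limits for this process\n");
    return 0;
}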

That's just one example; I think there are others.

Coming at it bottom-up, I think we can do a bit more too. Again I'm torn:
it takes ~10 years to build a reliable filesystem, and I wouldn't want to
ditch one on a whim. Then again, do we really need a whole filesystem to
put some data on flash? (As a speed freak I like things like physical page
addressing in open-channel SSDs, but, besides speed, the code tends to be
an awful lot simpler if you just want to smack an object onto the NAND.
Reasonable people may have different views though.)
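
To sketch what I mean -- and this is the plain block-device version of the
idea, not real open-channel physical page addressing; the device path and
offset are placeholders, and a real design would still need some
allocation/metadata scheme and a bit of wear awareness -- open the raw
device and put the object at a known, aligned location, with no filesystem
in the way:

/* rawput.c -- write one object straight to a raw flash/block device.
 * Build: cc -o rawput rawput.c
 * Needs permission to open the device, and overwrites whatever is there.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BLOCK 4096                 /* typical flash page / sector size */

int main(void)
{
    const char *dev = "/dev/your-raw-flash-device";   /* placeholder path */
    off_t offset = 0;                                 /* placeholder slot */

    int fd = open(dev, O_WRONLY | O_DIRECT);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* O_DIRECT wants an aligned buffer, length and offset. */
    void *buf;
    if (posix_memalign(&buf, BLOCK, BLOCK) != 0) {
        close(fd);
        return 1;
    }
    memset(buf, 0, BLOCK);
    memcpy(buf, "hello, object", 13);

    if (pwrite(fd, buf, BLOCK, offset) != BLOCK)
        perror("pwrite");

    free(buf);
    close(fd);
    return 0;
}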


>> - we have vast compute and storage at this point -- let's throw it at
>> the problem. I want baserock style tooling being used for a continuous
>> attempt to find versions of things that work together. (E.g. find me a
>> version of kafka+spark+cassandra+linux that appears to run my app ok,
>> and then keep looking as we all change arbitrary bits of the stacks).
>>
>
> Again, this is a very interesting idea. I've kicked off a tiny project to
> look at machine learning for whole-stack upgrade paths. I'll point the guys
> at your requirement statement.
>


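Just to sketch the shape of what I'm after there -- the version lists and
the ./run-my-tests.sh hook below are placeholders for whatever actually
builds, deploys and exercises each combination (and in practice the kernel
would be another axis, with the whole loop re-run continuously as things
change):

/* stacksearch.c -- brute-force the cross product of component versions
 * and report whichever combinations pass the application's test suite.
 * Build: cc -o stacksearch stacksearch.c
 */
#include <stdio.h>
#include <stdlib.h>

static const char *kafka[]     = { "0.9.0.1", "0.10.0.0" };  /* placeholders */
static const char *spark[]     = { "1.6.2", "2.0.0" };
static const char *cassandra[] = { "2.2.7", "3.7" };

#define N(a) (sizeof(a) / sizeof((a)[0]))

int main(void)
{
    char cmd[256];

    for (size_t i = 0; i < N(kafka); i++)
        for (size_t j = 0; j < N(spark); j++)
            for (size_t k = 0; k < N(cassandra); k++) {
                snprintf(cmd, sizeof(cmd),
                         "./run-my-tests.sh --kafka %s --spark %s --cassandra %s",
                         kafka[i], spark[j], cassandra[k]);
                printf("trying: %s\n", cmd);
                if (system(cmd) == 0)
                    printf("WORKS: kafka %s, spark %s, cassandra %s\n",
                           kafka[i], spark[j], cassandra[k]);
            }
    return 0;
}
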
Sounds great -- please do let me know how that goes.