Testing our Assumptions about the Reliability of Computer Systems (including ethical considerations)
February 24, 2003
Overview of lecture:
1. Examples of system failures…and “The Titanic Effect”
2. Why are computer systems unreliable?
   2.1. the nature of computer "systems" - the idea of complexity
   2.2. the lack of liability of software vendors
Forester and Morrison's ethical questions:
--> "Why isn't software guaranteed like other products?"
--> "Why does so much shoddy software exist in important systems?"
--> "Should we entrust so many decisions to complex software
programs?"
3. Why do we live with this risk?
**************
1. Examples of system failures
--> see examples in the article (military + aerospace + air traffic control + banking + medical)
--> FACT: August 1996 - America Online's computer systems crashed for 16 hours
    CAUSE: new software installed during a regular update
    CLAIM: AOL computers were "virtually immune" to this kind of outage
    (The Titanic Effect: see kit page)
--> FACT: June 1996 - France's Ariane 5 rocket, built to put satellites into orbit, had to be destroyed 40 seconds after it was launched
    CAUSE: "design errors in the software"
"All it takes is a modest anomaly in a digital system to bring the whole system to its knees" (Brown, Cybertrends, 1997)
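The inquiry into the Ariane 5 failure traced the "design error" to an unprotected conversion of a 64-bit floating-point value into a 16-bit signed integer; the unhandled exception shut down the guidance computers. The Python sketch below is only an illustration of that failure mode (the real flight code was written in Ada, and the names here are invented), showing how one modest numeric anomaly can halt an entire program:

    INT16_MIN, INT16_MAX = -32768, 32767

    def to_int16(value: float) -> int:
        # Convert a sensor reading to a 16-bit integer, roughly what the
        # reused alignment routine did with a horizontal-velocity value.
        result = int(value)
        if not INT16_MIN <= result <= INT16_MAX:
            # Nothing in the program catches this exception: the "modest
            # anomaly" that brings the whole system to its knees.
            raise OverflowError(f"{value} does not fit in 16 bits")
        return result

    def guidance_step(horizontal_velocity: float) -> int:
        # Hypothetical stand-in for the inertial reference computation.
        return to_int16(horizontal_velocity)

    print(guidance_step(1500.0))    # within range: works fine
    print(guidance_step(64000.0))   # out of range: uncaught exception, program halts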
2. Why are computer systems unreliable?
2.1. Reprise: Tenner's ideas on "computer systems"
--> in the Industrial Age, a print shop had individual craftsmen working with their tools; today we talk about the tools in factories as a system.
    In today's factories, workers are not tool users, they're tool managers.
    The managed tool is more precise, BUT "the precision of the managed tool has a price. It may be less robust, and as it becomes more complex, less predictable."
    Remember: tools "break," but systems have "bugs."
    What if you can't figure out why a failure happened?
--> after a system failure there is a ritual whereby the cause of the accident is supposed to be determined.
    BUT "high technology accidents may not have clear causes at all. They may be inherent in the complexity of the technological systems we have created."
    So we don't have "real accidents" (where you can point a finger at the person responsible) but "normal accidents".
    --> "normal" doesn't mean frequent; it is the kind of accident you can "expect in the normal functioning of a technologically complex operation."
    system failure = normal accident
    (Gladwell, "Blowup," 1996)
The theory of "risk homeostasis" says we can't assume that EVEN IF a particular problem has been fixed (like the redesigned booster joints on the Challenger shuttle), the system as a whole is now "safer"....
When do systems fail?
--> not only in operation but also in design and development:
--> see the Forester and Morrison article: more systems don't make it than do...
    and it's GOOD that they don't make it!
e.g., the Strategic Defense Initiative of 1983
    (also known as "Star Wars," it was to be a layered ballistic missile defense system, ...like a shield in space over the U.S.)
    It needed the most complex computer software ever designed to be the "brain" that guides and co-ordinates an immensely complex battle management system.
    Projected cost: over a trillion dollars.
Problems with the Star Wars system: too many unknowns, in contrast to normal program development.
In normal program development (see the sketch after this list):
    - the program will do X,
    - programmers can usually anticipate how much computing power they need,
    - they usually have other programs to use as models,
    - they have the opportunity to debug the program and test it before it is given to the client.
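To make the last point concrete, here is a minimal sketch (the function and test are hypothetical examples, not taken from the article) of the kind of pre-delivery check a developer can run on an ordinary, well-specified program:

    def monthly_interest(balance: float, annual_rate: float) -> float:
        # An ordinary, well-specified function: "the program will do X."
        return balance * annual_rate / 12

    def test_monthly_interest():
        # A pre-delivery test: the expected answer is known in advance,
        # so the developer can debug against it before the client ever
        # sees the code.
        assert abs(monthly_interest(1200.0, 0.06) - 6.0) < 1e-9

    test_monthly_interest()
    print("all pre-delivery checks passed")

The Star Wars software, by contrast, offered no such opportunity: critics argued that the only realistic test of the full battle management system would have been the attack it was meant to stop.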
2.2. Liability issues
"
Microsoft knows that reliable software is not cost effective. According to studies,
90% to 95% of all bugs are harmless. They’re never discovered by users,
and they don’t affect performance. It’s much cheaper to release buggy
software and fix the 5% to 10% of bugs people find and complain about." Bruce
Schneier as quoted in "Monty Phython’s Flying Circus: Microsoft and
the Aircraft Carriers" www.acm.org/ubiquity/views/m_kabay_3.html
- in 2000, it was announced that the new U.S. aircraft carrier, the CVN-77, will be controlled by software from Microsoft Federal Systems (the operating system will be based on Windows 2000).
    Is this a good thing?
    - it "could lead to functional disarmament": "how do you reboot an aircraft carrier?"
      (gambling on a 5% bug rate doesn't work on a military vessel)
    - it could lead to spinoffs for the rest of us if the military demands service level agreements or terms of performance.
What drives computer systems development today?
    - concerns for time-to-market
    - novel features
    - keeping costs down
    ("with little concern for assurance, reliability or avoidance of system security vulnerabilities": "Risks in Features vs. Assurance," www.csl.sri.com/users/neumann/insiderisks.html#137)
Legal situation: software-related risks fall under contract law rather than under more demanding liability laws.
    - liability laws apply to other engineered artefacts... so why not to software??
    - software vendors base their non-liability claim on the argument that they are selling a "license," not a product... no protection for consumers
    - contracts are inequitable: purchasers assume all liabilities, despite the impossibility of assessing the security, reliability or survivability of the software.
(A recent lawsuit in California: a woman is suing Microsoft and others because she couldn't read the contract she was supposed to agree to; it was inside the shrink-wrapping.)
Since the customer now takes all the risks, there is little incentive for developers to ship reliable, secure systems...
    How can the customer know all the risks?
    --> Maybe liability law should override unjust contract disclaimers!
    --> Maybe we shouldn't expect/demand upgrades so frequently; maybe we shouldn't buy into obsolescence.
3. If we rely heavily on computer systems, and they're unreliable, then why don't we worry more?
According to Gerald Wilde in Target Risk, humans have a tendency to compensate for lower risks in one area by taking greater risks in another.
    examples:
    --> an anti-lock brake experiment in Germany: more accidents among drivers with ABS than among other drivers.
    --> more accidents near marked crossings than on other parts of the road.
    When we're convinced that anti-accident measures are in place, we relax about other safety precautions...
    --> We figure air bags + seat belts will protect us, so we speed up.
Add together unreliable, complex computer systems (where "normal accidents"/system failures occur) + the fact that software systems aren't "guaranteed" + our human nature (i.e., our propensity to maintain our accustomed level of risk), and you get.......???