Software Engineering: Critical Systems


About This Presentation

Critical Systems


Slide Content

Critical Systems
Prof. Loganathan R., CSE, HKBKCE

Objectives
• To explain what is meant by a critical system, where system failure can have severe human or economic consequences.
• To explain the four dimensions of dependability: availability, reliability, safety and security.
• To explain that, to achieve dependability, you need to avoid mistakes, detect and remove errors, and limit the damage caused by failure.

Critical Systems
• If system failure results in significant economic losses, physical damage or threats to human life, the system is called a critical system. There are three types:
• Safety-critical systems
  – Failure results in loss of life, injury or damage to the environment;
  – Example: a chemical plant protection system.
• Mission-critical systems
  – Failure results in the failure of some goal-directed activity;
  – Example: a spacecraft navigation system.
• Business-critical systems
  – Failure results in high economic losses;
  – Example: a customer accounting system in a bank.

System dependability
• The most important emergent property of a critical system is its dependability. It covers the related system attributes of availability, reliability, safety and security.
• Importance of dependability:
  – Systems that are unreliable, unsafe or insecure are often rejected by their users, who may also refuse to buy other products from the same company.
  – System failure costs may be very high (e.g. a reactor or an aircraft navigation system).
  – Untrustworthy systems may cause information loss, with a high consequent recovery cost.

Development methods for critical systems
• Trusted methods and techniques must be used.
• These methods are often not cost-effective for other types of system.
• The strengths and weaknesses of the older methods are well understood.
• Formal methods reduce the amount of testing required.
  – Example: formal mathematical specification and verification.

Socio-technical critical system failures
• Hardware failure
  – Hardware fails because of design and manufacturing errors, or because components have reached the end of their natural life.
• Software failure
  – Software fails due to errors in its specification, design or implementation.
• Human operator failure
  – Operators fail to operate the system correctly.
  – Now perhaps the largest single cause of system failures.

A simple safety-critical system
• Example: a software-controlled insulin pump.
• Used by diabetics to simulate the function of the pancreas, which produces insulin, an essential hormone that metabolises blood glucose.
• The pump measures blood glucose (sugar) using a micro-sensor and computes the insulin dose required to metabolise the glucose.

Insulin pump organisation
(Diagram: the pump hardware comprises a needle assembly, sensor, pump, clock, insulin reservoir, controller, alarm, two displays and a power supply.)

Insulin pump data-flow
(Diagram: blood → blood sugar sensor → blood parameters → blood sugar analysis → blood sugar level → insulin requirement computation → insulin requirement → insulin delivery controller → pump control commands → insulin pump → insulin delivery.)

Dependability requirements
• The system shall be available to deliver insulin when required to do so.
• The system shall perform reliably and deliver the correct amount of insulin to counteract the current level of blood sugar.
• The essential safety requirement is that excessive doses of insulin should never be delivered, as this is potentially life-threatening (see the sketch below).
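To make the safety requirement concrete, here is a minimal sketch of a dose-computation routine with a hard upper limit. It is not the pump's actual control algorithm: the proportional rule, the constants and the function names are illustrative assumptions only.

```python
# Illustrative sketch only: the dose rule and limits below are invented and are
# NOT the real pump's algorithm or clinically valid values.

MAX_SINGLE_DOSE = 4      # safety cap on any single dose (units, assumed)
SAFE_MIN_LEVEL = 4.0     # blood sugar level below which no insulin is given
TARGET_LEVEL = 6.0       # level the controller tries to bring sugar back towards

def compute_dose(current_level: float, previous_level: float) -> int:
    """Compute an insulin dose from two successive blood sugar readings."""
    # Never deliver insulin if sugar is already low or falling.
    if current_level < SAFE_MIN_LEVEL or current_level <= previous_level:
        return 0
    # Simple proportional rule on how far the reading is above the target.
    dose = round((current_level - TARGET_LEVEL) / 2)
    # Essential safety requirement: an excessive dose must never be delivered,
    # so clamp the result regardless of what the rule above produced.
    return max(0, min(dose, MAX_SINGLE_DOSE))

print(compute_dose(current_level=12.0, previous_level=9.0))   # -> 3
print(compute_dose(current_level=30.0, previous_level=10.0))  # capped at 4
```

The point of the sketch is the final clamp: whatever the dose rule computes, the safety requirement is enforced by an independent check before delivery.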

System dependability
• The dependability of a system equates to its trustworthiness.
• A dependable system is a system that is trusted by its users.
• The principal dimensions of dependability are:
  – Availability: the probability that the system will be up and running and able to deliver services at any given time;
  – Reliability: the probability of correct delivery of services, as expected by the user, over a given period of time;
  – Safety: a judgement of how likely it is that the system will cause damage to people or its environment;
  – Security: a judgement of how likely it is that the system can resist accidental or deliberate intrusions.

Dimensions of dependability
(Diagram: dependability comprises availability, reliability, safety and security.)
• Availability: the ability of the system to deliver services when requested.
• Reliability: the ability of the system to deliver services as specified.
• Safety: the ability of the system to operate without catastrophic failure.
• Security: the ability of the system to protect itself against accidental or deliberate intrusion.

Other dependability properties
• Repairability
  – Reflects the extent to which the system can be repaired in the event of a failure.
• Maintainability
  – Reflects the extent to which the system can be adapted to new requirements.
• Survivability
  – Reflects the extent to which the system can deliver services while it is under hostile attack.
• Error tolerance
  – Reflects the extent to which user input errors can be avoided and tolerated.

Dependability vs performance
• It is very difficult to tune systems to make them more dependable.
• High dependability is usually achieved at the expense of performance, because it requires extra, redundant code to perform the necessary run-time checking.
• It also increases the cost of the system.

Dependability costs
• Dependability costs tend to increase exponentially as increasing levels of dependability are required.
• There are two reasons for this:
  – The use of more expensive development techniques and hardware that are required to achieve the higher levels of dependability;
  – The increased testing and system validation required to convince the system client that the required levels of dependability have been achieved.

Costs of increasing dependability
(Graph: cost on the vertical axis rises steeply as the required dependability level moves from low through medium, high and very high to ultra-high.)

Availability and reliability
• Reliability
  – The probability of failure-free system operation over a specified time, in a given environment, for a specific purpose.
• Availability
  – The probability that a system, at a point in time, will be operational and able to deliver the requested services.
• Both of these attributes can be expressed quantitatively (see the example below).
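As an example of expressing availability quantitatively (not taken from the slides), a common steady-state measure divides the mean time to failure by the sum of the mean time to failure and the mean time to repair:

```python
# Illustrative sketch: steady-state availability as MTTF / (MTTF + MTTR),
# i.e. the long-run fraction of time the system is up and able to deliver service.

def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Availability from mean time to failure and mean time to repair."""
    return mttf_hours / (mttf_hours + mttr_hours)

# A system that fails on average every 1000 hours and takes 2 hours to repair:
print(round(availability(1000, 2), 4))  # 0.998, i.e. about 99.8% available
```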

Availability and reliability
• It is sometimes possible to subsume system availability under system reliability.
  – Obviously, if a system is unavailable it is not delivering the specified system services.
• However, it is possible to have systems with low reliability that must be available. So long as system failures can be repaired quickly and do not damage data, low reliability may not be a problem.
• Availability takes repair time into account.

Faults, errors and failures
• Failures are usually a result of system errors that are derived from faults in the system.
• However, faults do not necessarily result in system errors.
  – The faulty system state may be transient and 'corrected' before an error arises.
• Errors do not necessarily lead to system failures.
  – The error can be corrected by built-in error detection and recovery;
  – The failure can be protected against by built-in protection facilities. These may, for example, protect system resources from system errors.
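A small, hypothetical illustration of the fault/error/failure chain (the discount function and its defect are invented for this sketch): the latent fault only produces an erroneous state, and hence a visible failure, for some inputs.

```python
# Hypothetical example (not from the slides): a latent fault becomes an error and
# then a failure only for some inputs.

def discount(price: float, rate_percent: float) -> float:
    # Fault: the programmer meant to divide by 100; dividing by 10 is a latent defect.
    return price - price * rate_percent / 10

# For rate_percent = 0 the fault never produces a wrong state: no error, no failure.
print(discount(100.0, 0))    # 100.0 (correct despite the fault)

# For rate_percent = 20 the internal result is wrong (a system error) and the user
# sees an incorrect price (a system failure): -100.0 instead of 80.0.
print(discount(100.0, 20))   # -100.0
```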

Reliability terminology
• System failure: an event that occurs at some point in time when the system does not deliver a service as expected by its users.
• System error: an erroneous system state that can lead to system behaviour that is unexpected by system users.
• System fault: a characteristic of a software system that can lead to a system error. For example, failure to initialise a variable could lead to that variable having the wrong value when it is used.
• Human error or mistake: human behaviour that results in the introduction of faults into a system.

Reliability improvement
• Three complementary approaches are used to improve reliability:
• Fault avoidance
  – Development techniques are used that either minimise the possibility of mistakes or trap mistakes before they result in the introduction of system faults.
• Fault detection and removal
  – Verification and validation techniques are used that increase the probability of detecting and correcting errors before the system goes into service.
• Fault tolerance
  – Run-time techniques are used to ensure that system faults do not result in system errors and/or that system errors do not lead to system failures (see the sketch after this list).
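As a hedged sketch of one fault-tolerance technique, the recovery-block idea runs a primary routine, applies a run-time acceptance test to its result, and falls back to an independently written alternative if the test fails or the primary raises an exception. The routines and the acceptance test below are illustrative assumptions, not a prescribed method.

```python
# Illustrative recovery-block style fault tolerance: check the primary result at
# run time and fall back to a simpler alternative so a fault does not become a failure.
from collections import Counter

def acceptance_test(result, original):
    """Accept only if the result is ordered and uses exactly the original elements."""
    ordered = all(result[i] <= result[i + 1] for i in range(len(result) - 1))
    return ordered and Counter(result) == Counter(original)

def primary_sort(data):
    # Stands in for a fast but possibly faulty primary implementation.
    return sorted(data)

def backup_sort(data):
    # Simpler, independently written alternative used when the primary is rejected.
    out = list(data)
    for i in range(len(out)):
        for j in range(i + 1, len(out)):
            if out[j] < out[i]:
                out[i], out[j] = out[j], out[i]
    return out

def tolerant_sort(data):
    try:
        result = primary_sort(data)
        if acceptance_test(result, data):
            return result            # the primary's result passed the run-time check
    except Exception:
        pass                         # an exception in the primary must not become a failure
    return backup_sort(data)         # fall back so the service is still delivered

print(tolerant_sort([3, 1, 2]))      # [1, 2, 3]
```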

Reliability modelling
• You can model a system as an input-output mapping where some inputs will result in erroneous outputs.
• The reliability of the system then depends on the probability that a particular input will lie in the set of inputs that cause erroneous outputs: the smaller that probability, the higher the reliability.
• Different people will use the system in different ways, so this probability is not a static system attribute but depends on the system's environment (a simulation sketch follows).
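A sketch of this idea under stated assumptions: the "system", its set of failure-causing inputs and the two usage profiles below are invented, but they show how the same program yields different reliability estimates for different operational profiles.

```python
# Illustrative sketch: estimate reliability by sampling inputs from a usage profile
# and counting how often the output is correct. All names and values are invented.
import random

ERRONEOUS_INPUTS = {7, 13, 42}          # I_e: inputs that produce erroneous outputs

def system(x: int) -> bool:
    """Stand-in for the program: True means the output for x is correct."""
    return x not in ERRONEOUS_INPUTS

def estimate_reliability(profile, trials=10_000):
    """Fraction of demands that succeed when inputs follow the given usage profile."""
    inputs, weights = zip(*profile.items())
    successes = sum(system(random.choices(inputs, weights)[0]) for _ in range(trials))
    return successes / trials

# Two users exercise different parts of the input space, so they perceive
# different levels of reliability from the very same program.
user1_profile = {1: 0.5, 2: 0.4, 7: 0.1}    # occasionally presents a failing input
user2_profile = {1: 0.5, 2: 0.5}            # never presents one
print(estimate_reliability(user1_profile))  # roughly 0.9
print(estimate_reliability(user2_profile))  # 1.0
```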

Input/output mapping
(Diagram: the program maps an input set I onto an output set O; the subset Ie of inputs causes the subset Oe of erroneous outputs.)

Reliability perception
(Diagram: the set of possible inputs, the subset of erroneous inputs, and three users whose usage patterns overlap that subset to different degrees.)

Reliability improvement
• Removing X% of the faults in a system will not necessarily improve the reliability by X%. A study at IBM showed that removing 60% of product defects resulted in only a 3% improvement in reliability.
• Program defects may be in rarely executed sections of the code and so may never be encountered by users. Removing these does not affect the perceived reliability.
• A program with known faults may therefore still be seen as reliable by its users.

Safety
• Safety is a property of a system reflecting that the system should never damage people or the system's environment.
• Example: control and monitoring systems in aircraft.
• It is increasingly important to consider software safety as more and more devices incorporate software-based control systems.
• Safety requirements are exclusive requirements, i.e. they exclude undesirable situations rather than specify required system services.
• There are two types of safety-critical software (described on the next slide).

Types of safety-critical software
• Primary safety-critical systems
  – Embedded software systems whose failure can cause hardware malfunction, which results in human injury or environmental damage.
• Secondary safety-critical systems
  – Systems whose failure indirectly results in injury.
  – Example: a medical database holding details of drugs.
• The discussion here focuses on primary safety-critical systems.

Safety and reliability
• Safety and reliability are related but distinct.
  – In general, reliability and availability are necessary but not sufficient conditions for system safety.
• Reliability is concerned with conformance to a given specification and delivery of service.
• Safety is concerned with ensuring that the system cannot cause damage, irrespective of whether or not it conforms to its specification.

Unsafe reliable systems
• Reasons why reliable systems are not necessarily safe:
• Specification errors
  – The specification does not describe the required behaviour in some critical situations.
• Hardware failures generating spurious inputs
  – These are hard to anticipate in the specification.
• Operator error
  – Context-sensitive commands, i.e. issuing the right command at the wrong time.

Safety terminology
• Accident (or mishap): an unplanned event or sequence of events which results in human death or injury, or damage to property or the environment. A computer-controlled machine injuring its operator is an example of an accident.
• Hazard: a condition with the potential for causing or contributing to an accident. A failure of the sensor that detects an obstacle in front of a machine is an example of a hazard.
• Damage: a measure of the loss resulting from a mishap. Damage can range from many people killed as a result of an accident to minor injury or property damage.
• Hazard severity: an assessment of the worst possible damage that could result from a particular hazard. Hazard severity can range from catastrophic, where many people are killed, to minor, where only minor damage results.
• Hazard probability: the probability of the events occurring which create a hazard. Probability values tend to be arbitrary but range from probable (say a 1/100 chance of a hazard occurring) to implausible (no conceivable situations are likely where the hazard could occur).
• Risk: a measure of the probability that the system will cause an accident. The risk is assessed by considering the hazard probability, the hazard severity and the probability that a hazard will result in an accident.

Ways to achieve safety
• Hazard avoidance
  – The system is designed so that the hazard simply cannot arise.
  – Example: a cutting machine that starts only when two buttons are pressed at the same time (see the sketch after this list).
• Hazard detection and removal
  – The system is designed so that hazards are detected and removed before they result in an accident.
  – Example: opening a relief valve on detecting over-pressure in a chemical plant.
• Damage limitation
  – The system includes protection features that minimise the damage that may result from an accident.
  – Example: an automatic fire-suppression system in an aircraft.
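As an illustrative sketch of hazard avoidance (the names and timing values are assumptions, not a real machine interface), a two-button interlock only permits the cutting cycle while both buttons are held, so the operator's hands cannot be near the blade when it starts:

```python
# Illustrative sketch only: a two-button interlock for hazard avoidance.
import time

class TwoButtonInterlock:
    def __init__(self, max_skew_seconds: float = 0.5):
        self.pressed_at = {"left": None, "right": None}
        self.max_skew = max_skew_seconds     # buttons must be pressed nearly together

    def press(self, button: str) -> None:
        self.pressed_at[button] = time.monotonic()

    def release(self, button: str) -> None:
        self.pressed_at[button] = None

    def may_start_cut(self) -> bool:
        left, right = self.pressed_at["left"], self.pressed_at["right"]
        if left is None or right is None:    # a free hand means no cutting
            return False
        return abs(left - right) <= self.max_skew

interlock = TwoButtonInterlock()
interlock.press("left")
print(interlock.may_start_cut())   # False: only one button is held
interlock.press("right")
print(interlock.may_start_cut())   # True: both hands are accounted for
```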

Security
• Security is a system property that reflects the system's ability to protect itself from accidental or deliberate external attack.
• Security is becoming increasingly important as systems are networked, so that external access to the system through the Internet is possible.
• Security is an essential pre-requisite for availability, reliability and safety.
• Examples of attacks: viruses, unauthorised use of services, unauthorised data modification.

Security terminology
• Exposure: possible loss or harm in a computing system. This can be loss of or damage to data, or a loss of time and effort if recovery is necessary after a security breach.
• Vulnerability: a weakness in a computer-based system that may be exploited to cause loss or harm.
• Attack: an exploitation of a system vulnerability. Generally, this is from outside the system and is a deliberate attempt to cause some damage.
• Threat: circumstances that have the potential to cause loss or harm. You can think of a threat as a system vulnerability that is subjected to an attack.
• Control: a protective measure that reduces a system vulnerability. Encryption would be an example of a control that reduces a vulnerability of a weak access control system.

Damage from insecurity
• Denial of service
  – The system is forced into a state where normal services become unavailable.
• Corruption of programs or data
  – Components of the system may be altered in an unauthorised way, which affects system behaviour and hence its reliability and safety.
• Disclosure of confidential information
  – Information that is managed by the system may be exposed to people who are not authorised to read or use that information.

Security assurance
• Vulnerability avoidance
  – The system is designed so that vulnerabilities do not occur. For example, if there is no external network connection then external attack is impossible.
• Attack detection and elimination
  – The system is designed so that attacks on vulnerabilities are detected and neutralised before they result in an exposure. For example, virus checkers find and remove viruses before they infect a system.
• Exposure limitation
  – The consequences of a successful attack are minimised. For example, a backup policy allows damaged information to be restored (see the sketch below).
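A minimal sketch of exposure limitation through a backup policy (the file names and the simulated attack are invented for illustration):

```python
# Illustrative sketch only: exposure limitation via a simple backup/restore policy.
import shutil
from pathlib import Path

DATA = Path("accounts.txt")
BACKUP = Path("accounts.bak")

def backup() -> None:
    shutil.copy2(DATA, BACKUP)          # keep a copy so damage can be undone

def restore() -> None:
    shutil.copy2(BACKUP, DATA)          # limit the exposure: roll back to the backup

DATA.write_text("alice: 100\nbob: 250\n")
backup()
DATA.write_text("!!corrupted by attacker!!")   # a successful attack damages the data
restore()
print(DATA.read_text())                 # the original contents are recovered
```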