Introduction to Software Testing
Università degli Studi dell’Aquila
L19: Introduction to Software Testing
Henry Muccini
DISIM, University of L’Aquila
www.henrymuccini.com, [email protected]
The material in these slides may be freely reproduced and distributed, partially or totally, as long as an explicit reference or acknowledgement to the material's author is preserved.
With very special thanks to Antonia Bertolino and Debra J. Richardson, who collaborated on previous versions of these lecture notes
Software Failures: examples
Verification and Validation
Software Testing: Intro
Software Testing: Basics
AGENDA
Software Testing: Basics
Testing Process
Type of Testing
The Skype pervasive failures:
• Blackout for two hours
• Europe, Japan, Australia, Afghanistan, South Africa, Malaysia and Brazil were affected
Therac-25 safety failure:
•approximately 100 times the intended dose of
radiation
•3 people died and 6 were injured
Factors:
•Overconfidence in
Software
•Confusing reliability
with safety
•Lack of defensive Design
see article at: http://sunnyday.mit.edu/papers/therac.pdf
•Failure to eliminate
fault causes
•Inadequate software
engineering practices
•…
Ash Cloud–related stress: software failures
Trains in the Netherlands (March 22, 2012)
Tens of thousands of people around the large cities weren’t able to travel by train Thursday morning. No trains from and to Amsterdam and Airport Schiphol from early morning until after the morning rush hour. A failure in the back-up system was the cause. ProRail said that there was a fault in the ‘switch software’. The system therefore didn’t start. And then the signals and switches could not be operated.
Checking some other articles, it simply says that both the primary and the backup systems failed, hence no operations.
Links:
http://www.elsevier.nl/web/Nieuws/Nederland/334086/Oorzaak-van-treinstoring-blijkt-fout-in-software.htm
http://www.rnw.nl/english/bulletin/trains-amsterdam-running-again
On the impact on people: http://www.dutchnews.nl/news/archives/2012/03/signalling_problems_cause_rail.php
The Poste Italiane business failure:
The MIUR system for the state exams (prove di maturità): "The 2012 'maturità 2.0' starts with a flop. The 'commissione web' system, the novelty of the 2012 state exam, did not work. The software, developed to let the exam boards communicate in real time with the MIUR central mainframe about all exam-related activities, crashed before even starting. In the schools of Florence, the boards were unable to upload online the minutes of the opening meetings held this morning."
http://corrierefiorentino.corriere.it/firenze/notizie/cronaca/2012/18-giugno-2012/maturita-20-partenza-flop-201657781657.shtml
Trenitalia reservations:
"The new Ferrovie dello Stato system is a disaster: some users can no longer use their code, but cannot unsubscribe either, because unsubscribing requires the code. From 1 to 3 at night it does not work, because of maintenance, but of course they don't tell you…"
http://righedidiomira.blogspot.it/2012/01/sempre-trenitalia-sempre-piu-disservizi.html
Fineco, IMU payment via F24: with the first-home deduction my taxable amount goes below zero and the system crashes.
http://violapost.it/?p=7351
[Fatal Defect, Ivars Peterson, 1995]
Half the book is about failures in software development
http://www.wired.com/software/coolapps/news/2005/11/69355?currentPage=all
http://www.devtopics.com/20-famous-software-disasters/
NIST (National Institute of Standards and
Technology) study in 2002 [NIST],
→software errors cost the U.S. economy $59.5 billion
every year.
Standish Chaos report [Standish]
→a clear statement of requirements is one of the three main reasons that lead to project success, while incomplete requirements are one of the main reasons for project cancellation.
[NIST] The Economic Impacts of Inadequate Infrastructure for Software Testing. NIST Planning Report 02-3, 2002. http://www.nist.gov/public_affairs/releases/n02-10.htm
[Standish] The Standish Group Report: Chaos. 1995. http://www.projectsmart.co.uk/docs/chaos-report.pdf
Validation:
does the software system meet the user's real needs?
are we building the right software?
(valid with respect to users’ needs)
Verification:
does the software system meet the requirements
specifications?
are we building the software right?
(valid with respect to the system specification)
Software Inspection (static analysis technique)
Debugging (to locate and fix bugs)
Theorem proving
Model checking (to prove a property's correctness)
Software Testing
(None is the absolute perfect solution)
Completeness & Correctness
• Correctness properties are undecidable
• False positives and false negatives
Timeliness
• The V&V process stops (most of the time) when there is no more time
• Time is one of the stopping rules
Cost-effectiveness
• "Select the less that gives you the most"
• V&V is justified especially when failures are expensive
SOFTWARE TESTING
An all-inclusive definition
Software testing consists of:
the dynamic verification of the behavior of a program
on a finite set of test cases
suitably selected from the (in practice infinite) input domain
against the specified expected behavior
[A. Bertolino]
What testing is not (citation from Hamlet, 1994):
I've searched hard for defects in this program, found a lot of them, and repaired them. I can't find any more, so I'm confident there aren't any.
• Testing is NOT exhaustive
⇒ What to test?
⇒ When to stop?
• Testing is NOT cheap
⇒ test less and best!
⇒ When to stop?
(1) Testing Process:
• Test Selection (Category partition)
• Test Execution
• Oracle
• Regression Testing
• Systematic vs. Ad Hoc
• Glossary
(2) Type of Testing:
• Black Box and White Box
• Unit, Integration, System
Testing involves several demanding tasks:
→Test selection ─ how to identify a suitable finite set of test cases
→Test execution ─ how to translate test cases into executable runs
→Test oracle ─ deciding whether the test outcome is acceptable or not
─ if not, evaluating the impact of the failure and its direct cause (the fault)
→Testing adequacy ─ judging whether the test campaign is sufficient
→Test coverage
Test selection consists in the identification of a
“suitable” and finite set of test cases.
The test selection activity provides guidelines on how to select test cases. It is driven by a ''test criterion'' and has to produce ''suitable'' test cases
Slide taken from Alex Orso
Test Criterion:
� A test criterion provides the guidelines, rules, and strategy by which test cases are selected. In general, a test criterion is a means of deciding what a ''good'' set of test cases shall be (Reference 117 of [Muccini08]).
Suitability:
� A test case is suitable if it contributes to discovering as many failures as possible, according to a test criterion.
[Muccini08] Henry Muccini, Software Testing: Testing New Software Paradigms andNew Artifacts, in: Wiley Encyclopedia of Computer Science and Engineering, John Wiley & Sons, Inc., 2008
Test Case:
� A test case is a set of inputs, execution conditions, and a
pass/fail criterion (Ref. 116 of [Muccini08]) .
A test case thus includes not only input data but also any relevant execution conditions and procedures, and includes a way of determining whether the program has passed or failed the test on a particular execution (Ref. 8 of [Muccini08]).
Test Suite:
� A test suite is a collection of test cases.
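As a minimal sketch of the definitions above, a test case can be modelled as inputs plus execution conditions plus a pass/fail criterion, and a test suite as a collection of such test cases. The class names and the example program (`math.isqrt`) are illustrative, not from the slides.

```python
import math
from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class TestCase:
    inputs: Any                                      # input data
    conditions: dict = field(default_factory=dict)   # execution conditions
    expected: Any = None                             # pass/fail criterion

    def run(self, program: Callable) -> bool:
        """Pass iff the program's output matches the expected output."""
        return program(self.inputs) == self.expected

# A test suite is simply a collection of test cases.
suite = [TestCase(inputs=4, expected=2),
         TestCase(inputs=9, expected=3)]

# Program under test: integer square root.
results = [tc.run(math.isqrt) for tc in suite]
print(results)  # [True, True]
```

The `conditions` field is where relevant execution conditions and procedures would go in a fuller model; it is unused in this toy example.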
The EasyLine system is composed of three sub-systems:
• The SP system
• The Mobile app
• The server-side application
How to select test cases? (test selection technique)
How many test cases? (when to stop --stopping rule?)
Which artefacts to use for selecting test cases? (code, spec?)
Ad hoc or Systematic testing?
Tester's intuition and expertise
• "Ad hoc testing" (sometimes quite effective)
• Special cases
Specifications
• Equivalence partitioning
• Boundary-value analysis
• Decision table
• Automated derivation from formal specs (conformance testing)
• ....
Fault-based
• Error guessing/special cases
• Mutation
Usage
• SRET
• Field testing
Code
• Control-flow based
• Data-flow based
Nature of application, e.g.:
• Object Oriented
• Web
• GUI
• Real-time, embedded
• Scientific
• .....
No single technique is the best; rather, a combination of different criteria has empirically been shown to be the most effective approach
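Two of the specification-based criteria listed above, equivalence partitioning and boundary-value analysis, can be sketched on a hypothetical specification: a `classify(age)` function accepting integers 0–120. All names, ranges, and test values here are assumptions for illustration.

```python
# Hypothetical specification: classify(age) accepts integers 0..120,
# returning "minor" for 0-17 and "adult" for 18-120; anything else is invalid.
def classify(age: int) -> str:
    if not 0 <= age <= 120:
        raise ValueError("age out of range")
    return "minor" if age < 18 else "adult"

# Equivalence partitioning: one representative per input partition.
partitions = {-5: ValueError, 10: "minor", 40: "adult", 200: ValueError}

# Boundary-value analysis: values at and just beyond each boundary.
boundaries = {-1: ValueError, 0: "minor", 17: "minor", 18: "adult",
              120: "adult", 121: ValueError}

def run(cases: dict) -> None:
    for age, expected in cases.items():
        try:
            assert classify(age) == expected
        except ValueError:
            assert expected is ValueError   # invalid input was expected

run(partitions)
run(boundaries)
print("all selected test cases pass")
```

Both criteria keep the test set finite by sampling the input domain systematically: one case per partition, plus the values where the specification's behavior changes.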
Code-based: (code graphs)
→Structural/White Box Testing
─ Test cases selected based on the structure of the code
─ Views the program/component as a white box (also called glass box testing)
[Diagram: (Test) Inputs → Source Code → Output, with the internal behavior visible]
Specification-based: (input-output)
→Functional/Black Box Testing
─ Test cases selected based on the specification
─ Views the program/component as a black box
[Diagram: (Test) Inputs → Binary Code or Spec → Output, with the internal behavior hidden]
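A minimal sketch of the two views above; the `abs_diff` function and all test values are illustrative, not from the slides. Black-box cases are derived from the specification alone, white-box cases from the code's branch structure.

```python
# Hypothetical component under test: absolute difference of two numbers.
def abs_diff(a: int, b: int) -> int:
    if a >= b:
        return a - b
    else:
        return b - a

# Black box: test cases chosen from the specification (|a - b|),
# ignoring the internal structure.
black_box = [((5, 3), 2), ((3, 5), 2), ((4, 4), 0)]

# White box: test cases chosen so that each branch of the 'if'
# in the source code is exercised at least once.
white_box = [((7, 2), 5),   # takes the a >= b branch
             ((2, 7), 5)]   # takes the a < b branch

for (a, b), expected in black_box + white_box:
    assert abs_diff(a, b) == expected
print("both views satisfied")
```

Note that the two suites overlap here: on trivial code, specification-driven and structure-driven selection often pick similar inputs, but they diverge quickly as the code grows.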
We focus on “systematic” testing:
→Repeatable
→Measurable
─ best tester
─ coverage
→Based on sampling:
─ Infinite input domain, but finite set of test
cases
Two main sub-activities must be performed:
→B1) identify those “inputs" which force the execution of the
selected test case,
→B2) put the system in a state from which the specified test
can be launched.
B1 -- Forcing the execution of the test cases derived according to a criterion might not be obvious
→In code-based testing, we have entry-exit paths over the graph model, and test inputs that execute the corresponding program paths need to be found
B2 -- put the system in a state from which the specified
test can be launched
→Also called, Test Pre-condition
→In Synchronous Systems:
─ Several runs in sequence are required to put the system in the test pre-condition
→In Concurrent Systems:
─ Non Determinism problem
Replay problem
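The B2 setup step above can be sketched as a sequence of preparatory runs that drive the system into the required pre-condition before the actual test is launched. The `Account` class and its operations are purely illustrative.

```python
# Hypothetical system under test: a simple bank account.
class Account:
    def __init__(self) -> None:
        self.balance = 0

    def deposit(self, amount: int) -> None:
        self.balance += amount

    def withdraw(self, amount: int) -> bool:
        if amount > self.balance:
            return False          # overdraft refused
        self.balance -= amount
        return True

# Test pre-condition: an account holding exactly 100.
# Several runs in sequence are needed to reach this state.
acct = Account()
acct.deposit(60)   # setup run 1
acct.deposit(40)   # setup run 2

# Only now can the specified test case be launched.
print(acct.withdraw(150))  # False: overdraft is refused
```

In a concurrent system, the same setup sequence might not reach the same state on every run; that non-determinism is exactly the replay problem mentioned above.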
The EasyLine system is composed of:
• Web services
• Sensors
• Mobile applications
• Routing algorithms
• …
A test oracle is a mechanism for verifying the behavior of a test execution
→ verifying outcomes manually is extremely costly and error prone
→ oracle design is a critical part of test planning
Sources of oracles
→ input/outcome oracle
→ tester decision
→ regression test suites
→ standardized test suites and oracles
→ gold or existing program
→ formal specification
The expected output is f*(d)
The test passes iff, given input d, f(d) = f*(d)
» In some cases this is easier (e.g., an existing version, an existing formal specification), but it is generally very difficult (e.g., operational testing)
» A research problem that has not been emphasized enough
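The oracle relation f(d) = f*(d) can be sketched with a trusted "gold" (existing) program playing the role of f*, one of the oracle sources listed above. All names here are illustrative.

```python
def f_star(d: list) -> list:
    """Gold program: a trusted existing implementation, used as f*."""
    return sorted(d)

def f(d: list) -> list:
    """Implementation under test."""
    out = list(d)
    out.sort()
    return out

def oracle(d: list) -> bool:
    """Input/outcome oracle: the test passes iff f(d) = f*(d)."""
    return f(d) == f_star(d)

print(oracle([3, 1, 2]))  # True: the outcome matches the expected output
```

A gold program makes oracle construction mechanical; the general case, where no trusted f* exists, is exactly why oracle design is called out above as a critical and under-researched problem.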
Theoretical notions of test adequacy are usually defined in terms of adequacy criteria
→ Coverage metrics (sufficient percentage of the program structure has been exercised)
→ Empirical assurance (failures/test curve flatten out)
→ Error seeding (the percentage of seeded faults found is proportional to the percentage of real faults found)
→ Independent testing (faults found in common are representative of total population of faults)
Adequacy criteria are evaluated with respect to a test suite and a program under test
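The error-seeding criterion above can be turned into a rough arithmetic estimate: assuming the proportion of seeded faults found equals the proportion of real faults found, the total number of real faults can be projected. The function name and the example figures are illustrative.

```python
def estimate_total_real_faults(seeded: int, seeded_found: int,
                               real_found: int) -> float:
    """Error-seeding estimate: seeded_found / seeded ≈ real_found / total_real,
    so total_real ≈ real_found * seeded / seeded_found."""
    return real_found * seeded / seeded_found

# Example: 20 faults seeded, 16 of them found, and 24 real faults
# found so far -> an estimated 30 real faults in total (6 still latent).
print(estimate_total_real_faults(20, 16, 24))  # 30.0
```

The estimate is only as good as the assumption that seeded faults are as hard to find as real ones, which is the main practical weakness of the criterion.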
(1) Testing Process:
• Test Selection (Category partition)
• Test Execution
• Oracle
• Regression Testing
• Systematic vs. Ad Hoc
• Glossary
(2) Type of Testing:
• Black Box and White Box
• Unit, Integration, System
Black box vs White box [in next lectures]
Unit, Integration, System
Performance, Stress
Regression Testing [in next lectures]
…
Unit:
→The purpose of unit testing is to ensure that the unit satisfies its functional specification and/or that its implemented structure matches the intended design structure
→Unit tests can also be applied to test interfaces or local data structures.
Integration:
→Integration testing is specifically aimed at exposing the problems that
arise from the combination of components
→Communicating interfaces among integrated components need to be tested
→Types: big-bang or incremental (top-down, bottom-up, mixed)
System:
→It attempts to reveal bugs that depend on the environment
→Recovery testing, security testing, stress testing and performance testing
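As a sketch of the unit level described above, here is one common way to express a unit test in Python, using the standard `unittest` framework; the `discount` unit itself is hypothetical. Each test checks the unit against its functional specification, in isolation from the rest of the system.

```python
import unittest

def discount(price: float, percent: float) -> float:
    """Unit under test (hypothetical): apply a percentage discount."""
    return price * (1 - percent / 100)

class DiscountUnitTest(unittest.TestCase):
    # Each test case encodes one point of the functional specification.
    def test_plain_discount(self):
        self.assertAlmostEqual(discount(100.0, 20), 80.0)

    def test_zero_discount(self):
        self.assertAlmostEqual(discount(50.0, 0), 50.0)

# Load and run the suite programmatically (instead of unittest.main()).
suite = unittest.defaultTestLoader.loadTestsFromTestCase(DiscountUnitTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())  # True
```

Integration tests would then exercise `discount` together with the components that call it, focusing on the communicating interfaces rather than the unit's internals.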
[V-model diagram: on the left-hand side (design & analyze, from top to bottom): User Requirements, Software Requirements Specification, Architecture Design Specification, Component Specifications, Unit Implementations; on the right-hand side (integrate & test, from bottom to top): Unit Testing, Component Testing, Integration Testing, System Testing, Acceptance Testing. Each left-hand artifact is used to plan and validate/verify the corresponding right-hand testing level. Vertical axis: levels of abstraction; horizontal axis: time.]
The EasyLine system is composed of three sub-systems:
• The SP system
• The Mobile app
• The server-side application
Stress testing: designed to test the software in abnormal situations.
→Stress testing attempts to find the limits at which the system will fail, through an abnormal quantity or frequency of inputs.
→The system is expected to keep working when stressed with higher rates of inputs and maximum use of memory or system resources.
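The stress-testing idea above can be sketched as pushing an abnormal quantity of inputs at a component until it hits its limit. The bounded queue here is an illustrative stand-in for any resource-limited system; the limit of 1000 is arbitrary.

```python
from queue import Full, Queue

q = Queue(maxsize=1000)   # stands in for any resource-limited component

def stress(rate: int) -> int:
    """Feed 'rate' inputs without draining; count how many are accepted."""
    accepted = 0
    for i in range(rate):
        try:
            q.put_nowait(i)
            accepted += 1
        except Full:
            break         # the limit at which the system "fails"
    return accepted

accepted = stress(5000)   # abnormal quantity of inputs
print(accepted)           # 1000: the component saturates at its resource limit
```

A real stress test would also vary the input frequency and memory pressure; the point is the same: locate the failure threshold, not just confirm normal-load behavior.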
Performance testing is usually applied to real-time, embedded systems, in which poor performance may have a serious impact on normal execution.
→Performance testing checks the run-time performance of the system and may be coupled with stress testing.
→Performance is not strictly related to functional requirements: functional tests may fail, while performance ones may succeed.