ASSESSMENT OF TRAINEES USING
OBJECTIVE STRUCTURED CLINICAL/PRACTICAL
EXAMINATION (OSCE/OSPE)
By NAKAYE ANNET OMARA (Mrs.)
OBJECTIVES
Definitions
Purpose
Various aspects for assessment
Types of OSCE stations
Benefits of OSCE
Disadvantages of OSCE
Uses of OSCE
OBJECTIVES cont'd
Stages in setting OSCE stations
Organization of an OSCE
WHAT IS OSPE/OSCE?
A multi-station, multi-task assessment process
An examination in which candidates pass through a number of examination stations to answer questions or solve various problems (Amri, Ngatia & Mwakilasa, 1993)
An approach that assesses competences in an
objective rather than subjective manner.
Areas of assessment are carefully selected
WHAT IS OSPE/OSCE cont’d
It is a system of examination which may be visualized as a room with a number of stations that are isolated from one another (Pemba, 2007)
Each station represents a different aspect of
training that needs to be examined and the
student spends some time at each station
Definition
OSCE is a procedure in which predetermined decisions are made about the competencies to be tested, and checklists incorporating the important evaluable skills are prepared
Why OSCE/OSPE is Objective
Objective: examiners use a checklist for evaluating the trainees
All the candidates are presented with the same test
Specific skill modalities are tested at each station:
– History taking
– Explanation
– Clinical examination
– Procedures
Why OSCE is Structured
Structured: every trainee sees the same problem and performs the same tasks in the same time frame.
The marking scheme for each station is structured
Structured interaction between examiner and student
Why OSCE is Clinical
Clinical: the tasks are representative of those faced in real clinical situations.
Test of performance of clinical skills
Candidates have to demonstrate their skills, not just describe the theory
Purpose of OSCE
To eliminate the following:
Variability
Defects in competencies examined
Difficulties in conducting exams
What clinical competences can be assessed using OSCE/OSPE?
Taking history
Performing physical examination like breast
examination
Taking vital signs
Health education skills
Counseling skills
Prescribing treatment
Wound dressing
Interpretation of lab results and plotted charts
TYPES OF OSPE STATIONS
Two types:
Procedure station – the candidate performs a skill practically
Question station – the candidate answers a question in writing
Benefits of OSPE
More reliable
A large number of students can be tested in a short period
The issues for assessment are determined in advance
The checklists for examiners provide for objectivity
Candidates are given the chance to be assessed by several examiners
All candidates have almost the same type of patients
Benefits of OSPE cont'd
Allows adequate sampling of skills
Allows adequate analytical and objective
observation of a skill because the observed
skill is quite specific
Disadvantages
Assessment of knowledge, attitudes and practice (KAP) in compartments
Demanding in terms of examiners and resources
Demands more time for setting and preparation
USES
Used for CATs, ESE, EYE
Suitable for criterion-referenced examinations where a pass/fail decision has to be made
Used to assess areas of weakness in a student
It can be used to identify weaknesses in the system and hence provide feedback
DESIGNING THE OSCE
STAGES IN SETTING UP/PLANNING OSPE STATIONS
Identification of topics and competences to be tested by setting up a table of specification
Deciding on the number of stations and timing per station
Allocating competences per station
Setting up instructions for examiners, rating scales and checklists for each station
STAGES IN SETTING cont'd
List resources required for each station:
– Examiners, markers and observers
– Patients, normal people, simulators, etc.
– Room, furniture, beds, clock, bell, etc.
Allocate marks for each station and within each station
Prepare model answers for each station
Moderate/crosscheck the station, preferably with a friend:
– Is the station testing what should be tested?
STAGES IN SETTING cont'd
– Are the station scenario and instructions clear?
– Can the task be achieved in the allocated time?
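To tie the stages above together, a minimal sketch of a table of specification follows (Python; every station name, competence, and number here is an invented example, not a prescribed blueprint):

```python
# Hypothetical table of specification for a 5-station OSPE.
# Each entry: station number -> (competence tested, station type, minutes, marks).
blueprint = {
    1: ("History taking",       "procedure", 10, 20),
    2: ("Breast examination",   "procedure", 10, 25),
    3: ("Vital signs",          "procedure", 10, 15),
    4: ("Wound dressing",       "procedure", 10, 25),
    5: ("Chart interpretation", "question",  10, 15),
}

# Instructions, rating scales and checklists would then be drafted per station.
for n, (competence, kind, minutes, marks) in blueprint.items():
    print(f"Station {n}: {competence} ({kind}), {minutes} min, {marks} marks")
```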
ORGANISATION
Advance planning is required. The following has to be done:
Briefing of examiners
Selection and briefing of patients
Preparation of the examination area and staff
Assembly of equipment
Preparation and arrangement of stations and checklists
THE DAY BEFORE THE EXAMINATION
Final check on the preparations and station set-up
Examiners should have the scenarios and checklists ready
The wards for the exams should be checked
A coordinator and a timekeeper should be in place
THE DAY OF THE EXAMINATION
The coordinator and organizers should be at the venue early
Briefing of students about the stations, especially the movement and first position
Provision of students' identification tags
Labeling of the stations
Checking to confirm that all examiners are at the venue
Ensuring that instructions and other documents are available
AFTER THE EXAMINATION
Compilation of marks and remarks
Provision of feedback to students
If possible, provision of feedback to examiners
OSPE EXAMINERS
Examiners are required only for procedure stations
In a question station, no examiner is required
Examiners do not need to work hard – their task is simplified by the checklist
TIME ALLOCATION
All stations are allocated the same time; this enables students to move at the same pace from station to station
The output required at each station must match the time allocated
The time allocated ranges from five to ten minutes
NUMBER OF STUDENTS FOR ASSESSMENT
Depends on the number of stations made
Eight to twenty students may be assessed simultaneously
UNMEB uses five stations, therefore ten students at a go
Provision for parallel stations can be made in case of many students
MARKS ALLOCATION
Depends on the strength of the scenario
Marks vary from station to station; however, they must add up to 100% across all stations
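The arithmetic above is easy to check mechanically. A sketch (Python; the mark values are examples, and two parallel circuits is only one way to read the "ten students at a go" figure):

```python
# Hypothetical plan: five stations, ten minutes each, marks chosen per scenario
# strength but constrained to total 100%.
station_marks = [20, 25, 15, 25, 15]
minutes_per_station = 10
stations = len(station_marks)

assert sum(station_marks) == 100, "marks across all stations must add up to 100%"

# Every student visits every station once, all moving at the same pace.
print(f"One full rotation: {stations * minutes_per_station} minutes")

# Running two parallel circuits allows ten students at a go with five stations.
circuits = 2
print(f"Students assessed simultaneously: {stations * circuits}")
```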
EXAMPLE OF AN OSPE STATION
STATION 1
SCENARIO
– In this station there is a lady awaiting routine breast examination
INSTRUCTIONS
– Examine the right breast of this lady
– Report your findings to the examiner as you examine
EXAMINER'S CHECKLIST
STATION 1 (Examination of the right breast)
Student's number
Examiner's name
Key areas of assessment:
– Communication
– Inspection
– Palpation
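To make the structure concrete, here is a minimal sketch (Python) of how the station above and its checklist could be represented for scoring; the individual checklist actions are invented examples, not a prescribed list:

```python
# Illustrative encoding of the breast-examination station above.
station = {
    "number": 1,
    "scenario": "A lady awaiting routine breast examination",
    "instructions": [
        "Examine the right breast of this lady",
        "Report your findings to the examiner as you examine",
    ],
    # Dichotomous items grouped under the key areas of assessment.
    "checklist": {
        "Communication": ["Introduces self", "Explains procedure", "Obtains consent"],
        "Inspection": ["Inspects with arms at sides", "Inspects with arms raised"],
        "Palpation": ["Palpates all four quadrants", "Palpates the axillary tail"],
    },
}

def score(observed):
    """Count the checklist actions observed (done = 1, not done = 0)."""
    all_items = [item for items in station["checklist"].values() for item in items]
    return sum(item in observed for item in all_items)

print(score({"Introduces self", "Palpates all four quadrants"}))  # -> 2
```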
Rating Scales
A rating scale is a measurement scale that has a defined rubric. A rating scale allows raters to quantify a variety of skills often assessed during OSCEs, including complex, interrelated, or noncognitive skills.
While rating scales can be used to quantify
subtle aspects of performance, the scale
may be relatively objective or subjective
depending on the scale design and training
level of raters.
The first step in developing a rating scale is
to determine the question the scale is meant
to answer. The
following are examples of questions on which
we could base a rating scale related to verbal
communication skills:
Overall, how satisfactory are this student’s
verbal communication skills?
What is the overall readiness of this student
to progress to the next stage of education?
Compared to other third-year students, how
are this student’s verbal communication
skills?
The next step in developing a rating scale is to select the appropriate format for the scale, given what you want to measure. Two of the most common rating scale formats are:
Likert-type scales
Behaviorally anchored rating scales (BARS)
Likert-type scales are typically used for
measuring attitudes or beliefs among raters,
or for capturing
global statements using a symmetric, bipolar
rating continuum.
Likert-type ratings make the assumption that all points are equidistant; e.g., the distance between "strongly agree" and "agree" is the same as between "disagree" and "strongly disagree." An example follows:
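For illustration, a five-point Likert-type item might be coded as below (Python; the statement and the anchor wording are invented examples):

```python
# A hypothetical five-point Likert-type item with equidistant numeric codes.
statement = "The student communicated clearly with the patient."
anchors = {
    1: "Strongly disagree",
    2: "Disagree",
    3: "Neutral",
    4: "Agree",
    5: "Strongly agree",
}
for point, label in anchors.items():
    print(f"{point} = {label}")

# Equidistant coding is what justifies averaging ratings across raters.
ratings = [4, 5, 3]  # three raters' responses (example data)
print(sum(ratings) / len(ratings))  # -> 4.0
```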
BARS are a type of Likert scale, but behavior anchors are provided for each scoring point. Thus, BARS items are suited for measuring traits among examinees.
On an anchored scale, students are rated according to where their performance lies along a continuum that is defined, or anchored, by specific levels of behaviors or actions.
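A minimal sketch of a behaviorally anchored item follows (Python; the skill and the behaviors attached to each point are invented for illustration):

```python
# Hypothetical BARS item: each defined scoring point is anchored to an
# observable behavior.
bars_item = {
    "skill": "Explaining the treatment plan",
    "anchors": {
        1: "Uses jargon throughout; never checks the patient's understanding",
        3: "Explains in mostly plain language; checks understanding once",
        5: "Explains in plain language, invites questions, verifies understanding",
    },
}

# Raters choose the point whose anchor best matches what they observed;
# intermediate points (2, 4) fall between the defined anchors.
for point, behavior in sorted(bars_item["anchors"].items()):
    print(f"{point}: {behavior}")
```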
Anchored scales require users to be trained
to recognize behavioral distinctions along the
scale.
The selection and definition of these
behaviors, and the training of the raters, can
pose meaningful challenges for scale
developers.
Checklists
Checklists can be useful for capturing whether a set of actions is performed correctly.
In addition, checklists are often (but not always) more objective than rating scales because the scoring is often based directly on observed actions.
Using a checklist, raters can record what behaviors or skills students are demonstrating via categorical (usually dichotomous) responses for each behavior.
One common way to score checklists is to sum the responses, so that students who perform more actions are judged to be better at the set of tasks.
An alternate way of scoring is to provide
weights for all checklist items that represent
the varying importance of each item, with the
weights applied prior to summing the items.
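Both scoring schemes are easy to express in code. The sketch below (Python; item names and weights are invented) shows an unweighted sum and a weighted sum over the same dichotomous responses:

```python
# Dichotomous checklist responses: 1 = done, 0 = not done (example data).
responses = {
    "Washes hands": 1,
    "Explains procedure": 1,
    "Palpates all quadrants": 0,
    "Summarizes findings": 1,
}

# Unweighted scoring: every item counts equally.
unweighted = sum(responses.values())

# Weighted scoring: weights reflect each item's importance and are applied
# before summing.
weights = {
    "Washes hands": 1.0,
    "Explains procedure": 1.5,
    "Palpates all quadrants": 3.0,
    "Summarizes findings": 1.5,
}
weighted = sum(weights[item] * done for item, done in responses.items())

print(unweighted)  # -> 3
print(weighted)    # -> 4.0
```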
Rater Training
To prevent bias in ratings, it is important to
train and monitor raters throughout an OSCE
administration.
If raters emphasize different skills or are
inconsistent in their ratings, the resulting
scores may not be useful.
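One way to monitor consistency during an administration is to check how often pairs of raters agree on the same performances. A minimal sketch (Python; the ratings are example data, and percent agreement is only one of several possible statistics):

```python
# Two raters' dichotomous scores for the same ten performances (example data).
rater_a = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
rater_b = [1, 0, 0, 1, 0, 1, 1, 1, 1, 1]

# Percent agreement: the fraction of performances the raters scored identically.
agreement = sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)
print(f"Percent agreement: {agreement:.0%}")  # -> 80%
```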
Raters should understand not only how a
scale will be used but also why that scale
was chosen.
The development of a scale or checklist needs to fit with the purpose of the assessment. For example, a checklist about a physician's communication patterns is not useful if the physician is being assessed on his or her ability to explain the treatment plan to the patient.
Common Rater Biases
Severity/leniency
Halo effect
Restriction of range
Primacy effect
Severity/leniency refers to the tendency of a
rater to give low ratings (severity) or high
ratings (leniency) to all students.
This is not always considered a bias but
rater training should be used to bring all
raters to a shared idea of acceptable levels
of severity (or leniency) in order to prevent
extreme ratings.
The halo effect occurs when raters rate participants on other (appealing or not-so-appealing) characteristics rather than the targeted ability or trait to be measured.
For example, a rater may give a higher score
to a checklist item related to nonverbal
behavior because the rater thought the
student was friendly or did well on the other
checklist items but did not actually display
excellent nonverbal behaviors.
There may be other aspects of the student
that are irrelevant to any construct.
Restriction of Range
Restriction of range, or central tendency, refers to the tendency of raters to rate all performances in the middle of the scale, regardless of the options available.
Primacy Effect
The primacy effect leads raters to assume that the next performance will be similar to the performances they have just seen, causing them to compare an individual with the group rather than against the scoring rubric.
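Two of these biases, severity/leniency and restriction of range, leave simple statistical fingerprints in a rater's scores. A sketch of screening for them (Python; the thresholds and ratings are illustrative only):

```python
from statistics import mean, stdev

# Each rater's scores on a 1-5 scale across several students (example data).
raters = {
    "Rater A": [3, 4, 3, 4, 5, 3, 4],   # typical spread
    "Rater B": [1, 2, 1, 2, 1, 2, 1],   # consistently low -> possible severity
    "Rater C": [3, 3, 3, 3, 3, 3, 3],   # no spread -> possible range restriction
}

overall_mean = mean(s for scores in raters.values() for s in scores)

for name, scores in raters.items():
    m = mean(scores)
    spread = stdev(scores)  # all-equal scores give 0.0
    if abs(m - overall_mean) > 1.0:
        print(f"{name}: mean {m:.2f} far from overall {overall_mean:.2f} (severity/leniency?)")
    if spread < 0.5:
        print(f"{name}: spread {spread:.2f} is very narrow (restriction of range?)")
```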
Some methods to help prevent
rater biases
Providing experiences and exercises to help
raters make reliable judgments
Allowing the raters a chance to discuss and
refine the rating criteria
Providing raters with enough opportunities to
observe sample performances and locate
them along the scale correctly and
consistently
Providing raters with enough opportunities to
observe student performance along the
entire score scale
Checklist format

s/n | Task | Well done | Poorly done | Not done | Total
(one row per task, with a Total row at the bottom)

Rating scale format

s/n | Tasks | Poor (1) | Fair (2) | Good (3) | Very good (4) | Excellent (5) | Not done (0)
(one row per task)
Quality of an Assessment
Utility is a function of the instrument's:
Reliability
Validity
Educational impact
Acceptability
Feasibility
Van der Vleuten, C. (1996). The assessment of professional competence: developments, research and practical implications. Advances in Health Sciences Education, 1, 41-67.
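Van der Vleuten's model is often summarized multiplicatively, the point being that if any one component is near zero the overall utility collapses; a rendering under that reading (the symbols map onto the list above):

```latex
% Van der Vleuten's utility model, commonly written as a product:
\mathrm{Utility} = R \times V \times E \times A \times F
% R = reliability, V = validity, E = educational impact,
% A = acceptability, F = feasibility.
```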
Advantages of OSCEs
Careful specification of content = Validity
Observation of a wider sample of activities = Reliability
Structured interaction between examiner & student
Structured marking schedule
Each student has to perform the same tasks = Acceptability
Importance of Reliability
Reliability refers to the consistency of scores: a reliable OSCE would rank students in much the same way if it were repeated with different raters or an equivalent set of stations.
Validity
Validity evidence for a scale score is the
evidence that shows a scale is actually
measuring what it is assumed to measure
and supports the use of the score in making
inferences, judgments, and decisions about
students.
Validity evidence should be collected along
with reliability evidence to ensure that the
ability is not only being measured reliably but is
also the targeted ability of interest.
Many of the rater biases and scenarios that negatively impact reliability also have the potential to negatively impact the validity of the scale scores in making inferences.
If raters who are too lenient produce scores
that are too high, students will appear to be
more capable than they actually are.
If a checklist is too long, rater fatigue could
cause raters to make errors in observation or
rating.
If an anchored scale contains too many
points, this may result in distinctions being
made among student performances that are
not actually meaningful from an educational
or clinical standpoint.
If raters allow irrelevant aspects of the student to impact the rating scale, this will result in scores that are measuring these aspects as well as the construct of interest.
Disadvantages of OSCEs
As noted earlier: competences are assessed in compartments, and OSCEs are demanding in terms of examiners, resources, and preparation time.
Assignment
Divide into groups according to specialties
Write 2 scenarios with their checklists, which you will present to the class
After all groups have presented, prepare the actual scenario for practice