Assessment.pptx module 1: professional education

geromegerero7 34 views 139 slides Sep 30, 2024

About This Presentation

Assessment in learning


Slide Content

Assessment

Test – an instrument designed to measure any characteristic, quality, ability, knowledge or skill. It comprises items in the area it is designed to measure.

Measurement – a process of quantifying the degree to which someone/something possesses a given trait.

Assessment – a process of gathering and organizing quantitative or qualitative data into an interpretable form to have a basis for judgment or decision making. It is a prerequisite to evaluation; it provides the information which enables evaluation to take place.

Evaluation – a process of systematic interpretation, analysis, appraisal or judgment of the worth of organized data as a basis for decision making. It involves judgment about the desirability of changes in students.

Traditional Assessment – refers to the use of paper-and-pen objective tests.

Alternative Assessment – refers to the use of methods other than paper-and-pen objective tests, including performance tests, projects, portfolios, journals and the like.

Authentic Assessment – refers to the use of assessment methods that simulate true-to-life situations. These could be objective tests that reflect real-life situations or alternative methods that parallel what we experience in real life.

Purposes of Classroom Assessment

1. Assessment FOR Learning – this includes three types of assessment done before and during instruction: placement, formative and diagnostic.

a. Placement – done prior to instruction
• Its purpose is to assess the needs of the learners to have a basis in planning relevant instruction.
• Teachers use this assessment to know what their students bring into the learning situation and use this as a starting point for instruction.
• The results of this assessment place students in specific learning groups to facilitate teaching and learning.

b. Formative – done during instruction
• This is where teachers continuously monitor the students’ level of attainment of the learning objectives.
• The results of this assessment are communicated clearly and promptly to the students for them to know their strengths and weaknesses and the progress of their learning.

c. Diagnostic – done before or during instruction
• This is used to determine students’ recurring or persistent difficulties.
• It searches for the underlying causes of students’ learning problems that do not respond to first-aid treatment, and helps formulate a plan for detailed remedial instruction.

2. Assessment OF Learning – this is done after instruction and is usually referred to as summative assessment.
• It is used to certify what students know and can do and the level of their proficiency or competency.
• The information from assessment of learning is usually expressed as marks or grades.
• The results are communicated to the students, parents and other stakeholders for decision making.

3. Assessment AS Learning – this is done for teachers to understand and perform well their role of assessing FOR and OF learning. It requires teachers to undergo training on how to assess learning and to be equipped with the competencies needed in performing their work as assessors.

MODES OF ASSESSMENT

Traditional
Description: the objective paper-and-pen test, which usually assesses low-level thinking skills
Examples: standardized tests, teacher-made tests
Advantages: scoring is objective; administration is easy because students can take the test at the same time
Disadvantages: preparation of the instrument is time-consuming; prone to cheating

Performance
Description: requires actual demonstration of skills or creation of products of learning
Examples: practical tests, oral tests, projects
Advantages: preparation of the instrument is relatively easy; measures behaviors that cannot be deceived
Disadvantages: scoring tends to be subjective without rubrics; administration is time-consuming

Portfolio
Description: a process of gathering multiple indicators of student progress to support course goals in a dynamic, ongoing and collaborative process
Examples: working portfolios, show portfolios, documentary portfolios
Advantages: measures student’s growth and development; intelligence-fair
Disadvantages: development is time-consuming; ratings tend to be subjective without rubrics

FOUR TYPES OF EVALUATION PROCEDURES

1. Placement Evaluation
- done before instruction
- determines mastery of prerequisite skills
- not graded

2. Summative Evaluation
- done after instruction
- certifies mastery of the intended learning outcomes
- graded
- examples: quarterly exams, unit or chapter tests, final exams

Both Placement and Summative Evaluations:
- determine the extent of what the pupils have achieved or mastered in the objectives of the intended instruction
- place students in specific learning groups to facilitate teaching and learning
- serve as a pretest for the next unit
- serve as a basis in planning relevant instruction

3. Formative Evaluation
- reinforces successful learning
- provides continuous feedback to both students and teachers concerning learning successes and failures
- not graded
- examples: short quizzes, recitations

4. Diagnostic Evaluation
- determines persistent deficiencies
- helps formulate a plan for remedial instruction

Formative and Diagnostic Evaluation are both:
1. administered during instruction
2. designed to formulate a plan for remedial instruction
3. meant to modify the teaching and learning process
4. not graded

PRINCIPLES OF HIGH QUALITY ASSESSMENT

Principle 1: Clarity of Learning Targets
Clear and appropriate learning targets include what students know and can do and the criteria for judging student performance.

Principle 2: Appropriateness of Assessment Methods
The method of assessment to be used should match the learning targets.

Principle 3: Balance
A balanced assessment sets targets in all domains of learning or domains of intelligence, and makes use of both traditional and alternative assessments.

Principle 4: Validity
The degree to which the assessment instrument measures what it intends to measure. It also refers to the usefulness of the instrument for a given purpose. It is the most important criterion of a good assessment instrument.

Ways of Establishing Validity
1. Face validity – is done by examining the physical appearance of the instrument to make it readable and understandable.
2. Content validity – is done through a careful and critical examination of the objectives of assessment so that they reflect the curricular objectives.

3. Criterion-related validity – is established statistically by correlating test scores with a set of scores obtained from another external predictor or measure. It has two forms: concurrent and predictive.
a. Concurrent validity – describes the present status of the individual by correlating the sets of scores obtained from two measures given at a close interval.
b. Predictive validity – describes the future performance of an individual by correlating the sets of scores obtained from two measures given at a longer time interval.
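Both concurrent and predictive validity reduce to correlating two sets of scores. A minimal sketch in plain Python, using hypothetical scores for a new test and an established external measure (the data and variable names are invented for illustration):

```python
# Estimating criterion-related validity as a Pearson correlation between
# scores on a new test and scores on an external criterion measure.

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# Concurrent validity: the two measures are given at a close interval.
new_test    = [12, 15, 19, 22, 25, 28, 30]   # hypothetical scores
established = [14, 16, 18, 24, 24, 27, 31]   # hypothetical criterion scores

print(round(pearson_r(new_test, established), 2))  # → 0.98
```

A high positive coefficient suggests the new test ranks examinees much as the criterion does; for predictive validity, the same computation is applied to measures separated by a longer time interval.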

4. Construct validity – is established statistically by comparing psychological traits or factors that theoretically influence scores in a test.
a. Convergent validity – is established if the instrument correlates with a measure of another similar trait. Ex.: a critical thinking test may be correlated with a creative thinking test.
b. Divergent validity – is established if an instrument can describe only the intended trait and not other traits. Ex.: a critical thinking test may not be correlated with a reading comprehension test.

Principle 5: Reliability
The degree of consistency when several items in a test measure the same thing, and the stability of scores when the same measures are given across time. Ways of establishing reliability: split-half method, test-retest method, parallel or equivalent forms.
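The split-half method mentioned above can be sketched as follows: correlate scores on the odd-numbered items with scores on the even-numbered items, then step the half-test correlation up to an estimated full-test reliability with the standard Spearman-Brown formula. The score data here are hypothetical:

```python
# Split-half reliability with Spearman-Brown correction.

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman_brown(r_half):
    """Estimate full-test reliability from the correlation between two halves."""
    return 2 * r_half / (1 + r_half)

# Hypothetical totals of five students on the odd and even halves of one test.
odd_half  = [8, 10, 12, 14, 15]
even_half = [7, 11, 11, 15, 14]

r = pearson_r(odd_half, even_half)
print(round(spearman_brown(r), 2))  # → 0.97
```

The correction is needed because the half-test correlation understates the reliability of the full-length test (per factor 1 under "Factors affecting Reliability" below, longer tests are more reliable).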

Principle 6: Fairness
A fair assessment is unbiased and provides students with opportunities to demonstrate what they have learned.

Principle 7: Practicality and Efficiency
When assessing learning, the information obtained should be worth the resources and time required to obtain it. The simpler and more efficient the procedure, the more practical the assessment.

Principle 8: Continuity
Assessment takes place in all phases of instruction. It could be done before, during and after instruction.

Principle 9: Communication
Assessment targets and standards should be communicated. Assessment results should be communicated to important users, and to students through direct interaction or regular, ongoing feedback on their progress.

Principle 10: Positive Consequences
Assessment should have a positive consequence for students; that is, it should motivate them to learn. It should also have a positive consequence for teachers; that is, it should help them improve the effectiveness of their instruction.

Principle 11: Ethics
Teachers should free students from harmful consequences of misuse or overuse of various assessment procedures, such as embarrassing students or violating students’ right to confidentiality. Teachers should be guided by laws and policies that affect their classroom assessment. Administrators and teachers should understand that it is inappropriate to use standardized student achievement tests to measure teaching effectiveness.

Performance-based Assessment – a process of gathering information about students’ learning through actual demonstration of essential and observable skills and creation of products that are grounded in real-world contexts and constraints. It is an assessment that is open to many possible answers and judged using multiple criteria or standards of excellence that are pre-specified and public.

Types of Performance-based Tasks
1. Demonstration type – a task that requires no product. Examples: cooking demonstrations, presentations.
2. Creation type – a task that requires tangible products. Examples: project plans, research papers, project flyers.

Portfolio Assessment – also an alternative to paper-and-pen objective tests. It is a purposeful, ongoing, dynamic and collaborative process of gathering multiple indicators of the learner’s growth and development. Portfolio assessment is also performance-based, but more authentic than any performance-based task.

Principles Underlying Portfolio Assessment
1. Content principle – suggests that portfolios should reflect the subject matter that is important for the students to learn.
2. Learning principle – suggests that portfolios should enable the students to become active and thoughtful learners.
3. Equity principle – explains that portfolios should allow students to demonstrate their learning styles and multiple intelligences.

Types of Portfolios
1. The working portfolio is a collection of a student’s day-to-day works which reflect his/her learning.
2. The show portfolio is a collection of a student’s best works.
3. The documentary portfolio is a combination of a working and a show portfolio.

A. COGNITIVE DOMAIN

Knowledge
Description: involves remembering or recalling previously learned material, from a wide range of materials
Question clues: list, define, identify, name, recall, state, arrange

Comprehension
Description: ability to grasp the meaning of material by translating it from one form to another or by interpreting it
Question clues: describe, interpret, classify, differentiate, explain, translate

Application
Description: ability to use learned material in new and concrete situations
Question clues: apply, demonstrate, solve, interpret, use, experiment

Analysis
Description: ability to break down material into its component parts so that the whole structure is understood
Question clues: analyze, separate, explain, examine, discriminate, infer

Synthesis
Description: ability to put parts together to form a new whole
Question clues: integrate, plan, generalize, construct, design, propose

Evaluation
Description: ability to judge the value of material on the basis of definite criteria
Question clues: assess, decide, judge, support, summarize, defend

B. AFFECTIVE DOMAIN

Receiving
Description: willingness to receive or to attend to a particular phenomenon or stimulus
Illustrative verbs: acknowledge, ask, choose, follow, listen, reply, watch

Responding
Description: active participation on the part of the student
Illustrative verbs: answer, assist, contribute, cooperate, follow up

Valuing
Description: ability to see worth or value in a subject or activity
Illustrative verbs: adopt, commit, desire, display, explain, initiate

Organization
Description: bringing together a complex of values, resolving conflicts between them, and beginning to build an internally consistent value system
Illustrative verbs: adapt, categorize, establish, generalize, integrate, organize

C. PSYCHOMOTOR DOMAIN

Imitation
Description: early stage in learning a complex skill, after an indication of readiness to take a particular type of action
Illustrative verbs: carry out, assemble, practice, follow, repeat, sketch, move

Manipulation
Description: a particular skill or sequence is practiced continuously until it becomes habitual and is done with some confidence and proficiency
Illustrative verbs: acquire, complete, conduct, improve, perform, produce

Precision
Description: a skill has been attained with proficiency and efficiency
Illustrative verbs: achieve, accomplish, excel, master, succeed

Articulation
Description: the individual can modify movement patterns to meet a particular situation
Illustrative verbs: adapt, change, excel, reorganize, rearrange

Naturalization
Description: the individual responds automatically and creates new motor ways of manipulation out of the understandings, abilities and skills developed
Illustrative verbs: arrange, combine, compose, construct, create, design

DIFFERENT TYPES OF TEST

Purpose
Psychological – aims to measure students’ intelligence or mental ability to a large degree without reference to what the student has learned (e.g., aptitude tests, personality tests, intelligence tests)
Educational – aims to measure the results of instruction and learning (e.g., achievement tests, performance tests)

Scope of Content
Survey – covers a broad range of objectives; measures general achievement in certain subjects; constructed by trained professionals
Mastery – covers a specific objective; measures fundamental skills and abilities; typically constructed by the teacher

Language Mode
Verbal – words are used by students in attaching meaning to or responding to test items
Non-verbal – students do not use words in attaching meaning to or in responding to test items

Construction
Standardized – constructed by a professional item writer; covers a broad range of content within a subject area; uses mainly multiple choice; items are screened and the best chosen for the final instrument; can be scored by a machine; interpretation of results is usually norm-referenced
Informal – constructed by a classroom teacher; covers a narrow range of content; uses various types of items; the teacher picks or writes items as needed for the test; scored manually by the teacher; interpretation is usually criterion-referenced

Manner of Administration
Individual – mostly given orally or requires actual demonstration of skill; one-on-one situations, with many opportunities for clinical observation; chance to follow up the examinee’s responses in order to clarify or comprehend more clearly
Group – a paper-and-pen test; loss of rapport, insights and knowledge about each examinee; gathers information from many students in the amount of time needed to gather information from one student individually

Effect of Biases
Objective – the scorer’s personal judgment does not affect the scoring; worded so that only one answer is acceptable; little or no disagreement on what is the correct answer
Subjective – affected by the scorer’s personal opinions, biases and judgments; several answers are possible; disagreement on the correct answer is possible

Time Limit and Level of Difficulty
Power – consists of a series of items arranged in ascending order of difficulty; measures the student’s ability to answer more and more difficult items
Speed – consists of items approximately equal in difficulty; measures the student’s speed or rate and accuracy in responding

Format
Selective – there are choices for the answer (multiple choice, true-false, matching type); can be answered quickly; prone to guessing; time-consuming to construct
Supply – there are no choices for the answer (short answer, completion, restricted or extended essay); may require a longer time to answer; less chance of guessing but prone to bluffing; time-consuming to answer and score

Nature of Assessment
Maximum Performance – determines what individuals can do when performing at their best
Typical Performance – determines what individuals will do under natural conditions

Interpretation
Norm-referenced – results are interpreted by comparing one student’s performance with other students’ performance; some will really pass; there is competition for a limited percentage of high scores; typically covers a large domain of learning tasks; emphasizes discrimination among individuals in terms of level of learning; favors items of average difficulty and typically omits very easy and very difficult items; interpretation requires a clearly defined group
Criterion-referenced – results are interpreted by comparing students’ performance with a predefined standard (mastery); all or none may pass; there is no competition for a limited percentage of high scores; typically focuses on a delimited domain of learning tasks; emphasizes description of what learning tasks individuals can and cannot perform; matches item difficulty to the learning tasks without altering item difficulty or omitting easy items; interpretation requires a clearly defined and delimited achievement domain

FOUR COMMONLY-USED REFERENCES FOR CLASSROOM INTERPRETATION

Ability-referenced
Interpretation provided: How are students performing relative to what they are capable of doing?
Condition that must be present: good measures of the students’ maximum possible performance

Growth-referenced
Interpretation provided: How much have students changed or improved relative to what they were doing?
Condition that must be present: pre- and post-measures of performance that are highly reliable

Norm-referenced
Interpretation provided: How well are students doing with respect to what is typical or reasonable?
Condition that must be present: a clear understanding of whom students are being compared to

Criterion-referenced
Interpretation provided: What can students do and not do?
Condition that must be present: a well-defined content domain that was assessed

TYPES OF TEST ACCORDING TO FORMAT

1. Selective Type – provides choices for the answer.
a. Multiple Choice – consists of a stem, which describes the problem, and 3 or more alternatives, which give the suggested solutions. The incorrect alternatives are the distractors.
b. True-False or Alternative Response – consists of a declarative statement that one has to mark true or false, right or wrong, correct or incorrect, yes or no, fact or opinion, and the like.
c. Matching Type – consists of two parallel columns: Column A, the column of premises from which a match is sought; Column B, the column of responses from which the selection is made.

Multiple Choice
Advantages: more adequate sampling of content; tends to structure the problem to be addressed more effectively; can be quickly and objectively scored
Limitations: prone to guessing; often indirectly measures targeted behaviors; time-consuming to construct

Alternative Response
Advantages: more adequate sampling of content; easy to construct; can be efficiently and objectively scored
Limitations: prone to guessing; can be used only when dichotomous answers represent sufficient response options; usually must indirectly measure performance related to procedural knowledge

Matching Type
Advantages: allows comparison of related ideas, concepts or theories; effectively assesses association between a variety of items within a topic; encourages integration of information; can be quickly and objectively scored; can be easily administered
Limitations: difficult to produce a sufficient number of plausible premises; not effective in testing isolated facts; may be limited to lower levels of understanding; useful only when there is a sufficient number of related items; may be influenced by guessing

2. Supply Test
a. Short Answer – uses a direct question that can be answered by a word, phrase, number or symbol
b. Completion Test – consists of an incomplete statement

Advantages: easy to construct; require the student to supply the answer; many can be included in one test
Limitations: generally limited to measuring recall of information; more likely to be scored erroneously due to the variety of responses

3. Essay Test
a. Restricted Response – limits the content of the response by restricting the scope of the topic.
b. Extended Response – allows the students to select any factual information that they think is pertinent and to organize their answers in accordance with their best judgment.

Advantages: measure more directly the behaviors specified by performance objectives; examine students’ written communication skills; require the students to supply the response
Limitations: provide a less adequate sampling of content; less reliable scoring; time-consuming

GENERAL SUGGESTIONS IN WRITING TESTS
1. Use your TOS (Table of Specifications) as a guide to item writing.
2. Write more test items than needed.
3. Write the test items well in advance of the testing date.
4. Write each item so that the task to be performed is clearly defined.
5. Write each test item at the appropriate reading level.

6. Write each test item so that it does not provide help in answering other items in the test.
7. Write each test item so that the answer is one that would be agreed upon by experts.
8. Write each test item at the proper level of difficulty.
9. Whenever a test is revised, recheck its relevance.

SPECIFIC SUGGESTIONS
1. Word the item/s so that the required answer is both brief and specific.
2. Do not take statements directly from textbooks to use as a basis for short-answer items.
3. A direct question is generally more desirable than an incomplete statement.
4. If the item is to be expressed in numerical units, indicate the type of answer wanted.

5. Blanks should be equal in length.
6. Answers should be written before the item number for easy checking.
7. When completion items are to be used, do not have too many blanks. Blanks should be at the center of the sentence and not at the beginning.

B. ESSAY TYPE
1. Restrict the use of essay questions to those learning outcomes that cannot be satisfactorily measured by objective items.
2. Formulate questions that will call forth the behavior specified in the learning outcome.
3. Phrase each question so that the pupil’s task is clearly indicated.
4. Indicate an approximate time limit for each question.
5. Avoid the use of optional questions.

C. SELECTIVE TYPE
Alternative Response
1. Avoid broad statements.
2. Avoid trivial statements.
3. Avoid the use of negative statements, especially double negatives.
4. Avoid long and complex sentences.

Matching Type
1. Use only homogeneous material in a single matching exercise.
2. Include an unequal number of responses and premises, and instruct the pupils that a response may be used once, more than once or not at all.
3. Keep the list of items to be matched brief, and place the shorter responses at the right.
4. Arrange the list of responses in logical order.
5. Indicate in the directions the basis for matching the responses and premises.
6. Place all the items for one matching exercise on the same page.

Multiple Choice
1. The stem of the item should be meaningful by itself and should present a definite problem.
2. The stem should include as much of the item as possible and should be free of irrelevant information.
3. Use a negatively stated item only when a significant learning outcome requires it.
4. Highlight negative words in the stem for emphasis.
5. All the alternatives should be grammatically consistent with the stem of the item.
6. An item should have only one correct or clearly best answer.

ALTERNATIVE ASSESSMENT

PERFORMANCE AND AUTHENTIC ASSESSMENTS

When to Use:
- specific behaviors or behavioral outcomes are to be observed
- it is possible to judge the appropriateness of students’ actions
- a process or outcome cannot be directly measured by paper-and-pencil tests

Advantages:
- allow evaluation of complex skills which are difficult to assess using written tests
- positive effect on instruction and learning
- can be used to evaluate both the process and the product

Limitations:
- time-consuming to administer, develop and score
- subjectivity in scoring
- inconsistencies in performance on alternative skills

PORTFOLIO ASSESSMENT

Characteristics
1. Adaptable to individualized instructional goals.
2. Focuses on assessment of products.
3. Identifies students’ strengths rather than weaknesses.
4. Actively involves students in the evaluation process.
5. Communicates student achievement to others.
6. Time-consuming.
7. Needs a scoring plan to increase reliability.

Showcase – a collection of students’ best work
Reflective – used for helping teachers, students and family members think about various dimensions of student learning (e.g., effort, achievement)
Cumulative – a collection of items done over an extended period of time, analyzed to verify changes in the products and processes associated with student learning
Goal-based – a collection of works chosen by students and teachers to match pre-established objectives
Process – a way of documenting the steps and processes a student has gone through to complete a piece of work

RUBRICS – a scoring guide consisting of specific, pre-established performance criteria used in evaluating student work in performance assessments

Two Types
1. Holistic Rubric – requires the teacher to score the overall process or product as a whole, without judging the component parts separately.
2. Analytic Rubric – requires the teacher to score individual components of the product or performance first, then sum the individual scores to obtain a total score.
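The contrast between the two rubric types can be made concrete with a small sketch. The criteria names and the 1–4 scale below are invented purely for illustration:

```python
# Analytic rubric: each component of the work is rated separately on a
# (hypothetical) 1-4 scale, and the component scores are summed.
analytic_scores = {
    "content": 4,
    "organization": 3,
    "mechanics": 2,
}
analytic_total = sum(analytic_scores.values())
print("analytic total:", analytic_total)   # prints: analytic total: 9

# Holistic rubric: a single overall judgment of the work as a whole,
# with no per-criterion scores to sum.
holistic_score = 3
print("holistic score:", holistic_score)   # prints: holistic score: 3
```

The analytic form yields diagnostic detail (which component was weak), while the holistic form is faster to apply but gives only one overall figure.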

AFFECTIVE ASSESSMENTS

1. Closed-Item or Forced-Choice Instruments – ask for one specific answer.
a. Checklist – measures students’ preferences, hobbies, attitudes, feelings, beliefs, interests, etc. by marking a set of possible responses.

b. Scales – instruments that indicate the extent or degree of one’s response.
1. Rating Scale – measures the degree or extent of one’s attitudes, feelings and perceptions about ideas, objects and people by marking a point along a 3- or 5-point scale.
2. Semantic Differential Scale – measures the degree of one’s attitudes, feelings and perceptions about ideas and people by marking a point along a 3-, 7- or 11-point scale of semantic adjectives.
3. Likert Scale – measures the degree of one’s agreement or disagreement with positive or negative statements about objects and people.

c. Alternate Response – measures students’ preferences, hobbies, attitudes, feelings and beliefs by choosing between two possible responses.
d. Ranking – measures students’ preferences or priorities by ranking a set of responses.

2. Open-ended Instruments – open to more than one answer.
a. Sentence Completion – measures students’ preferences over a variety of attitudes and allows students to answer by completing an unfinished statement, which may vary in length.
b. Surveys – measure the values held by an individual by writing one or many responses to a given question.
c. Essays – allow the students to reveal and clarify their preferences, hobbies, attitudes, feelings, beliefs and interests by writing their reactions or opinions to a given question.
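Scoring the Likert-type scales described above usually involves one subtlety: since the statements may be positively or negatively worded, negatively worded items are commonly reverse-coded before averaging so that a higher score always means a more favorable attitude. A minimal sketch with hypothetical items and responses (the reverse-coding convention is standard practice, not something specified in this module):

```python
# Scoring a 5-point Likert scale (1 = strongly disagree, 5 = strongly agree).

def reverse_code(response, points=5):
    """Flip a response on a `points`-point scale (1 <-> 5, 2 <-> 4, ...)."""
    return points + 1 - response

responses = [5, 4, 2, 5]   # one student's answers to four hypothetical items
negative_items = {2}       # item at index 2 is negatively worded

scored = [reverse_code(r) if i in negative_items else r
          for i, r in enumerate(responses)]
print(scored)                     # → [5, 4, 4, 5]
print(sum(scored) / len(scored))  # → 4.5  (mean attitude score)
```

Without the reverse-coding step, agreement with a negative statement would inflate the apparent favorability of the attitude being measured.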

CRITERIA TO CONSIDER IN CONSTRUCTING GOOD TESTS

VALIDITY – the degree to which a test measures what it is intended to measure. It is the usefulness of the test for a given purpose, and the most important criterion of a good examination.

FACTORS influencing the validity of tests in general
a. Appropriateness of test – it should measure the abilities, skills and information it is supposed to measure.
b. Directions – it should indicate how the learners should answer and record their answers.
c. Reading Vocabulary and Sentence Structure – it should be based on the intellectual level of maturity and background experience of the learners.
d. Difficulty of Items – it should have items that are not too difficult and not too easy, to be able to discriminate the bright from the slow pupils.

e. Construction of Items – it should not provide clues, so it will not be a test on clues; nor should it be ambiguous, so it will not be a test on interpretation.
f. Length of Test – it should be of sufficient length so it can measure what it is supposed to measure, and not so short that it cannot adequately measure the performance we want to measure.
g. Arrangement of Items – it should have items arranged in ascending level of difficulty, starting with the easy ones so that pupils will persevere in taking the test.
h. Pattern of Answers – it should not allow the creation of patterns in answering the test.

WAYS OF ESTABLISHING VALIDITY
1. Face Validity – is done by examining the physical appearance of the test.
2. Content Validity – is done through a careful and critical examination of the objectives of the test so that it reflects the curricular objectives.
3. Criterion-related Validity – is established statistically such that the set of scores revealed by the test is correlated with the scores obtained from another external measure.

Concurrent Validity – describes the present status of the individual by correlating the sets of scores obtained from two measures given concurrently

Predictive Validity – describes the future performance of an individual by correlating the sets of scores obtained from two measures given at a longer time interval.

Construct validity – is established statistically by comparing psychological traits or factors that influence scores in a test, e.g. verbal, numerical, spatial, etc.

Convergent Validity – is established if the instrument correlates with a measure of another similar trait (e.g., a Critical Thinking Test may be correlated with a Creative Thinking Test)

Divergent Validity – is established if an instrument can describe only the intended trait and not other traits (e.g. Critical Thinking Test may not be correlated with Reading Comprehension Test)

RELIABILITY – refers to the consistency of scores obtained by the same person when retested using the same instrument or one that is parallel to it.

1. Length of the test – as a general rule, the longer the test, the higher the reliability. A longer test provides a more adequate sample of the behavior being measured and is less distorted by chance factors like guessing. 2. Difficulty of the test – ideally, achievement tests should be constructed such that the average score is 50 percent correct and the scores range from zero to near perfect. The bigger the spread of scores, the more reliable the measured differences are likely to be. A test is reliable if the coefficient of correlation is not less than 0.85. 3. Objectivity – can be obtained by eliminating the bias, opinions, and judgments of the person who checks the test. 4. Administrability – the test should be administered with clarity and uniformity so that the scores obtained are comparable. Uniformity can be obtained by setting time limits and giving uniform oral instructions. Factors affecting Reliability

5. Scorability – the test should be easy to score: directions for scoring are clear, the scoring key is simple, and answer sheets are provided. 6. Economy – the test should be given in the cheapest way possible, which means answer sheets should be provided so the test booklets can be reused from time to time. 7. Adequacy – the test should contain a wide sampling of items to determine the educational outcomes or abilities, so that the resulting scores are representative of the total performance in the areas measured.
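The rule of thumb in factor 1 (a longer test is more reliable) can be made concrete with the Spearman-Brown prophecy formula, a standard result in classical test theory that the slides do not name. A minimal sketch:

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when a test is lengthened by `length_factor`
    (e.g. 2.0 means doubling the number of items)."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Doubling a test whose reliability is 0.70:
print(round(spearman_brown(0.70, 2.0), 3))  # 0.824
```

Lengthening helps only if the added items measure the same behavior as the original ones.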

Steps: 1. Score the test. Arrange the scores from highest to lowest. 2. Get the top 27% (upper group) and bottom 27% (lower group) of the examinees. 3. Count the number of examinees in the upper group and in the lower group who got each item correct. 4. Compute the Difficulty Index of each item. 5. Compute the Discrimination Index of each item. ITEM ANALYSIS

Df = (no. of examinees who got the item correct) / (total no. of examinees in the upper and lower groups), where Df = difficulty index. Discrimination Index = DU - DL, where DU = difficulty index for the upper group and DL = difficulty index for the lower group.

INTERPRETATION

Difficulty Index (Df) | Description              | Action Taken
0.00 - 0.25           | Very difficult           | Revise/Discard
0.26 - 0.75           | Average/Right difficulty | Retain
0.76 - 1.00           | Very easy                | Revise/Discard

INTERPRETATION

Discrimination Index | Description                   | Action Taken
0.46 to 1.00         | Positive discriminating power | Retain
-0.50 to 0.45        | Could not discriminate        | Revise
-1.00 to -0.51       | Negative discriminating power | Discard
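The item-analysis steps and the two indices can be sketched in Python; the counts in the example are hypothetical, and the upper/lower 27% split is assumed to have been done already:

```python
def item_analysis(upper_correct: int, lower_correct: int,
                  upper_n: int, lower_n: int):
    """Compute the difficulty and discrimination indices for one item."""
    du = upper_correct / upper_n   # difficulty index, upper group
    dl = lower_correct / lower_n   # difficulty index, lower group
    df = (upper_correct + lower_correct) / (upper_n + lower_n)  # overall difficulty
    return df, du - dl             # (Df, Discrimination Index)

# Hypothetical item: 16 of 20 upper-group and 6 of 20 lower-group examinees correct
df, disc = item_analysis(16, 6, 20, 20)
print(df, round(disc, 2))  # 0.55 0.5 -> right difficulty, discriminating: retain
```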

Item 1 Distractor Analysis

Sample      | A  | B  | C  | D
Total       | 12 | 25 | 13 | 10
Upper group | 3  | 10 | 2  | 1
Lower group | 5  | 6  | 5  |
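A common rule of thumb for distractor analysis (assumed here; the slide shows only the counts) is that an effective distractor should attract more lower-group than upper-group examinees. A quick check with hypothetical counts:

```python
def weak_distractors(upper: dict, lower: dict, key: str) -> list:
    """Return incorrect options chosen by at least as many upper-group as
    lower-group examinees; such distractors may need revision."""
    return [opt for opt in upper
            if opt != key and upper[opt] >= lower.get(opt, 0)]

# Hypothetical option counts for an item whose key is B:
upper = {"A": 3, "B": 10, "C": 2, "D": 1}
lower = {"A": 5, "B": 6, "C": 5, "D": 4}
print(weak_distractors(upper, lower, key="B"))  # [] -> all distractors functioning
```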

•Leniency error – faculty tends to judge work as better than it really is. •Generosity error – faculty tends to use only the high end of the scale. •Severity error – faculty tends to use only the low end of the scale. •Central tendency error – faculty avoids both extremes of the scale. •Bias – letting other factors influence the score (e.g. handwriting, typing) •Halo effect – letting a general impression (e.g. the student's prior work) influence the score SCORING ERRORS AND BIASES

•Contamination effect – judgment is influenced by irrelevant knowledge about the student or other factors that have no bearing on performance level (e.g. student appearance). •Similar-to-me effect – judging more favorably those students whom faculty see as similar to themselves (e.g. expressing similar interests or points of view). •First-impression effect – judgment is based on early opinions rather than on a complete picture (e.g. the opening paragraph). •Contrast effect – judging by comparing the student against other students instead of against established criteria and standards. •Rater drift – unintentionally redefining criteria and standards over time or across a series of scorings (e.g. getting tired and cranky and therefore more severe, or getting tired and reading more quickly/leniently to get the job done).

FOUR TYPES OF MEASUREMENT SCALES

Measurement | Characteristics                                       | Examples
Nominal     | Groups and labels data                                | Gender (1 = male, 2 = female)
Ordinal     | Distances between points are indefinite               | Income (1 = low, 2 = average, 3 = high)
Interval    | Distances between points are equal; no absolute zero  | Test scores, temperature
Ratio       | Has an absolute zero                                  | Height, weight

1. Normal/Bell-shaped/Symmetrical 2. Positively Skewed- most scores are below the mean and there are extremely high scores. 3. Negatively Skewed – most scores are above the mean and there are extremely low scores. 4. Leptokurtic – highly peaked and the tails are more elevated above the baseline. SHAPES OF FREQUENCY POLYGONS

5. Mesokurtic – moderately peaked 6. Platykurtic – flattened peak 7. Bimodal Curve- curve with 2 peaks or modes 8. Polymodal Curve – curve with 3 or more modes 9. Rectangular Distribution – there is no mode.

• Skewness – distortion or asymmetry away from the symmetrical bell curve (normal distribution) in a set of data. • Positively skewed (or right-skewed) – the mean is greater than the median; most scores are low, which suggests students did not understand the test concepts or were not taught well. • Negatively skewed (or left-skewed) – the mean is less than the median; most scores are high, which suggests more students understood the course or the test was simple.
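The mean-versus-median rule above can be turned into a quick diagnostic (a sketch; formal skewness statistics are more involved):

```python
import statistics

def skew_direction(scores):
    """Classify a score distribution by comparing its mean and median."""
    mean, median = statistics.mean(scores), statistics.median(scores)
    if mean > median:
        return "positively skewed"   # mostly low scores
    if mean < median:
        return "negatively skewed"   # mostly high scores
    return "roughly symmetric"

print(skew_direction([10, 12, 13, 14, 15, 40]))  # positively skewed
```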

MEASURES OF CENTRAL TENDENCY AND VARIABILITY

Appropriate Statistical Tools

Assumptions when used | Measures of Central Tendency (describes the representative value of a set of data) | Measures of Variability (describes the degree of spread or dispersion of a set of data)
When the frequency distribution is regular or symmetrical (normal); usually used when data are numeric (interval or ratio) | Mean – the arithmetic average | Standard Deviation – the root mean square of the deviations from the mean
When the frequency distribution is irregular or skewed; usually when the data are ordinal | Median – the middle score in a group of scores that are ranked | Quartile Deviation – the average deviation of the 1st and 3rd quartiles from the median
When the distribution of scores is normal and a quick answer is needed; usually when the data are nominal | Mode – the most frequent score | Range – the difference between the highest and the lowest score in the distribution

• The value that represents a set of data will be the basis in determining whether the group is performing better or poorer than the other groups. How to Interpret the Measures of Central Tendency

• The result will help you determine if the group is homogenous or not. • The result will help you determine the number of students that fall below and above the average performance. How to Interpret the Standard Deviation

• Standard Deviation is a measure of how spread out numbers are; a measurement that indicates how much a group of scores varies from the average. It can tell you how much a group of grades varied on any given test, and it may be able to tell you whether the test was too easy or too difficult. • A small standard deviation means that the values in a statistical data set are close to the mean of the data set, while a large SD means the values are farther away from the mean.

• The result will help you determine if the group is homogenous or not. • The result will also help you determine the number of students that fall below and above the average performance. How to Interpret the Quartile Deviation
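The measures in the table and interpretations above can be computed with Python's standard library. The scores are hypothetical; `pstdev` is the population standard deviation, and the quartiles come from `statistics.quantiles` (one of several quartile conventions):

```python
import statistics

scores = [70, 72, 75, 78, 80, 82, 85, 88, 90, 95]

mean = statistics.mean(scores)                 # representative value (normal data)
median = statistics.median(scores)             # middle score (skewed/ordinal data)
sd = statistics.pstdev(scores)                 # spread around the mean
score_range = max(scores) - min(scores)        # quick-and-rough spread

q1, _, q3 = statistics.quantiles(scores, n=4)  # 1st and 3rd quartiles
quartile_deviation = (q3 - q1) / 2

print(mean, median, score_range)  # 81.5 81.0 25
```

A small SD (relative to the mean) would indicate a homogeneous group; a large SD, a heterogeneous one.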

-tells the percentage of examinees that lies below one's score • Percentile Scores – the percentage of scores in a frequency distribution that are equal to or lower than a given score. • Scores of students are arranged in rank order from lowest to highest; the scores are divided into 100 equally sized groups or bands. The lowest score is "in the 1st percentile" (there is no 0th percentile). The highest score is in the "99th percentile". • If your score is in the 60th percentile, it means that you scored better than 60 percent of all the test takers. PERCENTILE
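Using the slide's definition ("equal to or lower than"), a percentile rank can be computed directly. A sketch with hypothetical scores; note that some textbooks count strictly-below instead:

```python
def percentile_rank(score: float, scores: list) -> float:
    """Percent of scores in the distribution equal to or lower than `score`."""
    at_or_below = sum(1 for s in scores if s <= score)
    return 100 * at_or_below / len(scores)

scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
print(percentile_rank(80, scores))  # 60.0
```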

-tells the number of standard deviations a given raw score lies above or below the mean. Z-SCORES
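The standard formula is z = (raw score - mean) / standard deviation. A minimal sketch (hypothetical scores, population SD):

```python
import statistics

def z_score(raw: float, scores: list) -> float:
    """Number of standard deviations a raw score lies from the mean."""
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)
    return (raw - mean) / sd

scores = [60, 70, 80, 90, 100]
print(round(z_score(90, scores), 3))  # 0.707
```

A positive z-score means the examinee scored above the group mean; a negative one, below it.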

a. could represent: -how a student is performing in relation to other students (norm-referenced grading) -the extent to which a student has mastered a particular body of knowledge (criterion-referenced grading) -how a student is performing in relation to a teacher's judgment of his or her potential GRADES

-certification that gives assurance that a student has mastered a specific content or achieved a certain level of accomplishment -selection that provides a basis for identifying or grouping students for certain educational paths or programs -direction that provides information for diagnosis and planning -motivation that emphasizes specific material or skills to be learned and helps students understand and improve their performance b. could be for:

Criterion-Referenced Grading – or grading based on fixed or absolute standards, where a grade is assigned based on how a student has met the criteria or the well-defined objectives of a course that were spelled out in advance. It is then up to the student to earn the grade he or she wants to receive, regardless of how other students in the class have performed. This is done by transmuting test scores into marks or ratings. c. could be assigned by using:
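The slide does not give a transmutation formula; one common convention (assumed here) is a linear transmutation in which a zero raw score maps to a base grade of 50 and a perfect score maps to 100:

```python
def transmute(raw: float, total: float, base: float = 50.0) -> float:
    """Linear transmutation of a raw score into a grade.

    With base=50, zero maps to 50, a perfect score to 100,
    and half the items to 75."""
    return base + (raw / total) * (100 - base)

print(transmute(40, 50))  # 90.0
```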

or grading based on relative standards, where a student's grade reflects his or her level of achievement relative to the performance of other students in the class. In this system, the grade is assigned based on the average of test scores. Norm-Referenced Grading –

whereby the teacher assigns points or percentages to various tests and class activities, which students earn depending on their performance. The total of these points is the basis for the grade assigned to the student. Point or Percentage Grading System
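A point/percentage system reduces to a weighted sum; the components, scores, and weights below are hypothetical:

```python
# (score, weight) per component; the weights sum to 1.0
components = {
    "quizzes":       (85, 0.30),
    "exams":         (78, 0.40),
    "projects":      (92, 0.20),
    "participation": (88, 0.10),
}

grade = sum(score * weight for score, weight in components.values())
print(round(grade, 1))  # 83.9
```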

where each student agrees to work for a particular grade according to agreed-upon standards. Contract Grading System

The following points provide helpful reminders when preparing for and conducting parent-teacher conferences. 1. Make plans for the conference. Set the goals and objectives of the conference ahead of time. 2. Begin the conference in a positive manner. Starting the conference by making a positive statement about the student sets the tone for the meeting. 3. Present the student’s strong points before describing the areas needing improvement. It is helpful to present examples of the student’s work when discussing the student’s performance. 4. Encourage parents to participate and share information. Although as a teacher you are in charge of the conference, you must be willing to listen to parents and share information rather than “talk at” them. Conducting Parent-Teacher Conferences

5. Plan a course of action cooperatively. The discussion should lead to what steps can be taken by the teacher and the parent to help the student. 6. End the conference with a positive comment. At the end of the conference, thank the parents for coming and say something positive about the student, like "Lucas has a good sense of humor and I enjoy having him in class." 7. Use good human relations skills during the conference. Some of these skills can be summarized by following the do's and don'ts.

Thank You!