It is widely recognized that assessment for learning should be part of the effective planning of teaching and learning. Thus, an understanding of the what, why, and how of assessment is essential for teachers. Accordingly, this essay aims to present the differences between measurement, assessment, evaluation, and testing; discuss why assessment matters; compare assessment with evaluation and grading; discuss some key concepts in assessment, types of and approaches to assessment, the assessment of language competence, and some principles for good assessment; and take a look at testing: why and how. Finally, a conclusion is drawn at the end of the discussion. It is expected that this overview of assessment principles will guide the assessment training and professional development of teachers and administrators so that they can run more productive assessment in English Language Teaching (ELT).
Differences between Measurement, Assessment, Evaluation, and Testing
Understanding the differences between measurement, assessment, evaluation, and testing is fundamental to the knowledge base of professional teachers and to effective teaching. These words are so closely related that teachers and prospective teachers might confuse them or use them interchangeably.
As Burhan (2009) states, of the three terms evaluation, assessment, and testing, evaluation seems to have the broadest coverage. Citing TenBrink (1974), he defines evaluation as the process of obtaining information and using it to form judgments which in turn are to be used in decision making. Expanded with an initial and a final stage, the process is made up of five stages, namely, (1) preparing, (2) collecting the data, (3) making judgments, (4) making decisions, and (5) reporting.
Assessment, as Burhan (2009) states, does not include decision making and reporting. It focuses mainly on data gathering and placing a value on something. Put another way, assessment seems to cover stages 1, 2, and 3 of the evaluation process. Meanwhile, the last term, testing, is the narrowest in scope. It is one of the techniques for collecting data or scores. It can be used with other techniques such as observation and interviews. It does not deal with making judgments.
As for assessment, one connotation of the term is that it is carried out in both formal and informal situations. As a complementary effort, informal assessment is encouraged in order to obtain data in addition to the data obtained from formal testing. The term assessment is now often encountered in TEFL or TESOL publications.
Additionally, Kizlik (2010) elucidates some differences among measurement, assessment, and evaluation. He states that measurement refers to the process by which the attributes or dimensions of some physical object are determined. One exception seems to be in the use of the word measure in determining the IQ of a person. The phrase "this test measures IQ" is commonly used. Measuring such things as attitudes or preferences also applies. However, when we measure, we generally use some standard instrument to determine how big, tall, heavy, voluminous, hot, cold, fast, or straight something actually is. Standard instruments refer to instruments such as rulers, scales, thermometers, pressure gauges, etc. We measure to obtain information about what is. Such information may or may not be useful, depending on the accuracy of the instruments we use and our skill at using them. There are few such instruments in the social sciences that approach the validity and reliability of, say, a 12" ruler. We measure how big a classroom is in terms of square feet, we measure the temperature of the room by using a thermometer, and we use a multimeter to determine the voltage, amperage, and resistance in a circuit. In all of these examples, we are not assessing anything; we are simply collecting information relative to some established rule or standard.
Assessment, as he further states, is therefore quite different from measurement, and has uses that suggest very different purposes. Assessment is a process by which information is obtained relative to some known objective or goal. Assessment is a broad term that includes testing. A test is a special form of assessment. Tests are assessments made under contrived circumstances especially so that they may be administered. In other words, all tests are assessments, but not all assessments are tests. We test at the end of a lesson or unit. Whether implicit or explicit, assessment is most usefully connected to some goal or objective for which the assessment is designed. A test or assessment yields information relative to an objective or goal. In that sense, we test or assess to determine whether or not an objective or goal has been obtained. Assessment of skill attainment is rather straightforward. Either the skill exists at some acceptable level or it doesn’t. Skills are readily demonstrable. Assessment of understanding is much more difficult and complex. Skills can be practiced; understandings cannot. We can assess a person’s knowledge in a variety of ways, but there is always a leap, an inference that we make about what a person does in relation to what it signifies about what he knows.
Meanwhile, evaluation is perhaps the most complex and least understood of the terms. Inherent in the idea of evaluation is "value." When we evaluate, what we are doing is engaging in some process that is designed to provide information that will help us make a judgment about a given situation. Generally, any evaluation process requires information about the situation in question. A situation is an umbrella term that takes into account such ideas as objectives, goals, standards, procedures, and so on. When we evaluate, we are saying that the process will yield information regarding the worthiness, appropriateness, goodness, validity, legality, etc., of something for which a reliable measurement or assessment has been made. To sum up, we measure distance, we assess learning, and we evaluate results in terms of some set of criteria. These three terms are certainly connected, but it is useful to think of them as separate but connected ideas and processes.
The key concepts in assessment are also clarified by Cameron (2001). She states that evaluation is a broader notion than assessment and refers to a process of systematically collecting information in order to make a judgment. Evaluation can thus concern a whole range of issues in and beyond language education: lessons, courses, programs, and skills can all be evaluated. If we were to evaluate a course, we would need to collect many different types of information: course information, observation of lessons, interviews with pupils and teachers, course feedback questionnaires, and examination results. Analyzing and combining the different types of information would enable a judgment to be made about the success, viability, or cost-effectiveness of the course.
She further states that assessment is concerned with pupils’ learning or performance, and thus provides one type of information that might be used in evaluation. Testing is a particular form of assessment that is concerned with measuring learning through performance.
As Scanlan (2003) states, to many teachers (and students), “assessment” simply means giving students tests and assigning them grades. This conception of assessment is not only limited, but also limiting. It fails to take into account both the utility of assessment and its importance in the teaching/learning process.
In the most general sense, as he states, assessment is the process of making a judgment or measurement of the worth of an entity (e.g., a person, process, or program). Educational assessment involves gathering and evaluating data evolving from planned learning activities or programs. This form of assessment is often referred to as evaluation (see the section below on Assessment versus Evaluation). Learner assessment represents a particular type of educational assessment normally conducted by teachers and designed to serve several related purposes (Brissenden and Slater, n.d.). These purposes include motivating and directing learning, providing feedback to students on their performance, providing feedback on instruction and/or the curriculum, and ensuring that standards of progression are met.
For teachers and curriculum/course designers, carefully constructed learner assessment techniques can help determine whether or not the stated goals are being achieved. According to Brissenden and Slater (n.d.), as cited by Scanlan (2003), classroom assessment can help teachers answer the following specific questions:
To what extent are my students achieving the stated goals?
How should I allocate class time for the current topic?
Can I teach this topic in a more efficient or effective way?
What parts of this course/unit are my students finding most valuable?
How will I change this course/unit the next time I teach it?
Which grades do I assign my students?
Meanwhile, for students, learner assessment answers a different set of questions:
Do I know what my instructor thinks is most important?
Am I mastering the course content?
How can I improve the way I study in this course?
What grade am I earning in this course?
Explaining the importance of assessment, Brissenden and Slater, as cited by Scanlan (2003), state that first and foremost, assessment is important because it drives student learning. Whether we like it or not, most students tend to focus their energies on the best or most expeditious way to pass their ‘tests.’ Based on this knowledge, we can use our assessment strategies to manipulate the kinds of learning that take place. For example, assessment strategies that focus predominantly on recall of knowledge will likely promote superficial learning. On the other hand, if we choose assessment strategies that demand critical thinking or creative problem-solving, we are likely to realize a higher level of student performance or achievement. In addition, good assessment can help students become more effective self-directed learners.
As indicated above, motivating and directing learning is only one purpose of assessment. Well-designed assessment strategies also play a critical role in educational decision-making and are a vital component of ongoing quality improvement processes at the lesson, course and/or curriculum level.
Assessment versus Evaluation and Grading
In terms of why and how the measurements are made, the following table (Apple & Krumsieg, 1998), as cited by Scanlan (2003), compares and contrasts assessment and evaluation on several important dimensions, some of which were previously defined.
Dimension                                         | Assessment            | Evaluation
Timing                                            | Formative             | Summative
Focus of Measurement                              | Process-Oriented      | Product-Oriented
Relationship Between Administrator and Recipient  | Reflective            | Prescriptive
Findings and Uses                                 | Diagnostic            | Judgmental
Modifiability of Criteria, Measures               | Flexible              | Fixed
Standards of Measurement                          | Absolute (Individual) | Comparative
Relation Between Objects of A/E                   | Cooperative           | Competitive
The bottom line? Given the different meanings ascribed to these terms by some educators, it is probably best that whenever you use these terms, you make your definitions clear.
Based on the above discussion, grading could be considered a component of assessment, i.e., a formal, summative, final, and product-oriented judgment of the overall quality or worth of a student's performance or achievement in a particular educational activity, e.g., a course. Generally, grading also employs a comparative standard of measurement and sets up a competitive relationship between those receiving the grades. Most proponents of assessment, however, would argue that grading and assessment are two different things, or at least opposite poles on the evaluation spectrum. For them, assessment measures student growth and progress on an individual basis, emphasizing informal, formative, process-oriented reflective feedback and communication between student and teacher. Ultimately, which conception you support probably depends more on your teaching philosophy than anything else.
Some Key Concepts in Assessment
To support our understanding of assessment, Cameron (2001) notes some important concepts, as follows.
1. Formative and summative assessment
Formative assessment aims to inform ongoing teaching and learning by providing immediate feedback. A teacher who assesses pupils’ understanding of a listening text and uses the outcomes to change her plan and give more practice before moving on to a speaking activity is carrying out formative assessment. Ideally, formative assessment should influence both teaching and learning by giving feedback to both teacher and learner. Summative assessment, on the other hand, aims to assess learning at the end of a unit, term, year, or course, and does not feed back into the next round of teaching.
2. Diagnostic and Achievement Assessment
Many assessment activities provide both formative and summative information, but it is helpful to be clear about the primary purpose and use of an assessment, because this can affect what kind of information the activity needs to produce. A pronunciation assessment that is formative will need to tell us where pupils are having difficulty so that the teacher can decide how to give extra practice; a test that gives a list of marks will not help the teacher make such decisions, but an activity that produces a description of each child’s performance will. This example highlights the distinction between achievement assessment, which establishes what a learner can do, and diagnostic assessment, which aims to establish what a child can and cannot yet do, so that further learning opportunities can be provided.
3. Criterion-referenced and Norm-referenced Assessment
If we assess learners’ achievement, we can produce a ranking of learners which says that child X has learnt more than child Y and less than child Z; this would be norm-referenced. Alternatively, we can compare a learner’s performance, not to other learners, but to a set of criteria of expected performance or learning targets. Criterion-referenced assessment can match the child’s performance against an expected response on an item, or it may make use of a set of descriptors along a scale, on which a learner is placed.
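The contrast between the two kinds of referencing can be illustrated with a small sketch. The scores, the cut-off points, and the descriptor labels below are hypothetical values chosen only for illustration; they are not taken from Cameron or from any real scale.

```python
# Hypothetical marks for three learners (illustrative only).
scores = {"X": 72, "Y": 58, "Z": 85}

# Hypothetical descriptor scale: (minimum mark, descriptor), highest band first.
DESCRIPTORS = [
    (80, "exceeds target"),
    (60, "meets target"),
    (0, "working towards target"),
]


def norm_referenced(scores):
    """Rank each learner against the other learners (1 = highest mark)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: rank for rank, name in enumerate(ordered, start=1)}


def criterion_referenced(scores, descriptors):
    """Place each learner on the descriptor scale, ignoring other learners."""
    placement = {}
    for name, score in scores.items():
        # Take the first (highest) band whose minimum mark is reached.
        for minimum, label in descriptors:
            if score >= minimum:
                placement[name] = label
                break
    return placement


ranking = norm_referenced(scores)
placement = criterion_referenced(scores, DESCRIPTORS)
print(ranking)
print(placement)
```

In this sketch the ranking of a learner changes whenever the other learners' marks change, whereas the placement on the descriptor scale depends only on that learner's own mark.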
The concepts of validity and reliability are used to describe the technical quality of assessment practices. They are more often applied to testing, although they are also important in alternative assessment. Validity is the more important of the two, particularly in alternative assessment, and concerns how far an assessment assesses what it claims to. If a test does not measure what it claims to, then there are clearly dangers in using it.
Reliability concerns how consistently a test or assessment assesses what it claims to: would the assessment produce the same results if it were taken by the same pupils on different occasions, or if the same test or assessment were scored by different people? (Gipps and Stobart, 1993).
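One common way to put a number on the "same pupils on different occasions" question is to correlate the marks from two sittings of the same test. The sketch below assumes a Pearson correlation is used for this purpose, which is a choice made here for illustration and not something the sources above prescribe; the marks are invented.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of marks."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length (at least 2 marks)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("marks must vary for a correlation to be defined")
    return covariance / sqrt(spread_x * spread_y)


# Invented marks for the same five pupils on two occasions.
first_sitting = [12, 15, 9, 18, 14]
second_sitting = [13, 14, 10, 17, 15]

# A value close to 1 suggests the two sittings order the pupils consistently.
print(round(pearson(first_sitting, second_sitting), 2))
```

The same function could be applied to the marks given by two different scorers to the same scripts, which corresponds to the second question in the paragraph above.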
Validity and reliability can be conflicting needs for assessment techniques and procedures. The most reliable assessments will be pencil and paper tests in which each item measures only a single aspect of a skill and which give each testee a numerical mark. But the most valid assessments will be those that collect a lot of information about performance on several aspects of a skill. As validity increases, reliability tends to decrease.
Types and Approaches to Assessment
Numerous terms are used to describe different types of and approaches to learner assessment. Although somewhat arbitrary, it is useful to think of these various terms as representing dichotomous poles, as explained by Scanlan (2003).
Formative <---------------------------------> Summative
Informal <---------------------------------> Formal
Continuous <----------------------------------> Final
Process <---------------------------------> Product
Divergent <---------------------------------> Convergent
1. Formative vs. Summative Assessment
Formative assessment is designed to assist the learning process by providing feedback to the learner, which can be used to identify strengths and weaknesses and hence improve future performance. Formative assessment is most appropriate where the results are to be used internally by those involved in the learning process (students, teachers, curriculum developers).
Summative assessment is used primarily to make decisions for grading or determine readiness for progression. Typically summative assessment occurs at the end of an educational activity and is designed to judge the learner’s overall performance. In addition to providing the basis for grade assignment, summative assessment is used to communicate students’ abilities to external stakeholders, e.g., administrators and employers.
2. Informal vs. Formal Assessment
With informal assessment, the judgments are integrated with other tasks, e.g., lecturer feedback on the answer to a question or preceptor feedback provided while performing a bedside procedure. Informal assessment is most often used to provide formative feedback. As such, it tends to be less threatening and thus less stressful to the student. However, informal feedback is prone to high subjectivity or bias.
Formal assessment occurs when students are aware that the task they are doing is for assessment purposes, e.g., a written examination or OSCE. Most formal assessments are also summative in nature and thus tend to have greater motivational impact and to be associated with increased stress. Given their role in decision-making, formal assessments should be held to higher standards of reliability and validity than informal assessments.
3. Continuous vs. Final Assessment
Continuous assessment occurs throughout a learning experience (intermittent is probably a more realistic term). Continuous assessment is most appropriate when student and/or instructor knowledge of progress or achievement is needed to determine the subsequent progression or sequence of activities. Continuous assessment provides both students and teachers with the information needed to improve teaching and learning in process. Obviously, continuous assessment involves increased effort for both teacher and student. Final (or terminal) assessment is that which takes place only at the end of a learning activity. It is most appropriate when learning can only be assessed as a complete whole rather than as constituent parts. Typically, final assessment is used for summative decision-making. Obviously, due to its timing, final assessment cannot be used for formative purposes.
4. Process vs. Product Assessment
Process assessment focuses on the steps or procedures underlying a particular ability or task, e.g., the cognitive steps in performing a mathematical operation or the procedure involved in analyzing a blood sample. Because it provides more detailed information, process assessment is most useful when a student is learning a new skill and for providing formative feedback to assist in improving performance.
Product assessment focuses on evaluating the result or outcome of a process. Using the above examples, we would focus on the answer to the math computation or the accuracy of the blood test results. Product assessment is most appropriate for documenting proficiency or competency in a given skill, i.e., for summative purposes. In general, product assessments are easier to create than process assessments, requiring only a specification of the attributes of the final product.
5. Divergent vs. Convergent Assessment
Divergent assessments are those for which a range of answers or solutions might be considered correct. Examples include essay tests and solutions to the typical types of indeterminate problems posed in PBL. Divergent assessments tend to be more authentic and most appropriate in evaluating higher cognitive skills. However, these types of assessment are often time consuming to evaluate, and the resulting judgments often exhibit poor reliability. A convergent assessment has only one correct response (per item). Objective test items are the best example and demonstrate the value of this approach in assessing knowledge. Obviously, convergent assessments are easier to evaluate or score than divergent assessments. Unfortunately, this “ease of use” often leads to widespread application of this approach even when it is contrary to good assessment practice. Specifically, the familiarity and ease with which convergent assessment tools can be applied lead to two common evaluation fallacies: the Fallacy of False Quantification (the tendency to focus on what’s easiest to measure) and the Law of the Instrument Fallacy (molding the evaluation problem to fit the tool).
Assessment on Language Competence
To evaluate the learners’ ability to use a language, assessment should be geared to measuring their communicative competence or language proficiency. Omaggio (1883: 2) in Burhan (2009: 41) summarizes that language proficiency includes the mastery of these competences:
1. Grammatical Competence
Grammatical competence includes knowledge of vocabulary and of the rules of pronunciation/spelling, word formation, and sentence formation. Such competence is an important concern for the communicative approach in order for the learners to understand and express accurately the literal meaning of utterances.
2. Sociolinguistic Competence
It addresses the extent to which grammatical forms can be used or understood appropriately to communicate in various social settings.
3. Discourse Competence
Discourse competence refers to the mastery of combining sentences and ideas to achieve unified spoken and written text through cohesion in form and coherence in thought.
4. Strategic Competence
Such competence is the ability of the language users to use verbal and nonverbal communication strategies when communication is interrupted due to interference, distraction, or inadequacy in the other competences.
To put it another way, to measure the learners’ language proficiency is to assess the learners’ language acquisition. Acquiring a language means getting the ability to use the language in real communication which demands appropriate functions of language.
Fifth and Macintosh (1984: 10) in Burhan (2009: 96) state that there are two approaches to carrying out assessment, as follows.
1. The Pragmatic Approach
This approach is concerned with the actuality of the teaching-learning situation. It is an assessment of what is going on in the classroom. It is done to discriminate between the learners, and the analysis of the results is intended to ensure that the assessment is well balanced. The choice of assessment techniques depends on the opportunities presented by unexpected outcomes. The final grading is postponed until all outcomes of assessment can be properly balanced and adjusted.
2. The Predetermined Approach
The predetermined approach relies on a plan set up beforehand. In this approach, the objectives are set at the outset of the instruction and criteria are formulated to determine the level of mastery. Pretesting and posttesting of assessment material are carried out to ensure that it is appropriate to the learners and relevant to the subject being taught, and to ensure that the results are taken into account.
3. Techniques of Assessment
There is a considerable range of techniques to measure the abilities and acquired skills of the learners. The selection of a technique depends on (1) the purpose of the assessment, (2) the time and resources available, and (3) the age and ability of the learners. Several techniques can be applied, and Fifth and Macintosh (1984: 52) list them as follows: (1) written assessment, (2) practical assessment, (3) oral and aural assessment, (4) learner questionnaires, and (5) coursework (including projects and fieldwork).
4. Evaluation Criteria for Language Proficiency
Communicative competence of second or foreign language learners can be determined on the basis of the following three coexisting and interrelated hierarchies of judgmental criteria: (1) function, (2) content, and (3) accuracy. The question that must be asked is “What were they able to communicate, and how well?” The “what” refers to the topic or context. The “how well” refers to linguistic accuracy and cultural authenticity. The first and second criteria seem to be concerned mainly with the use of language and the situation in which the language is used, while the third criterion is based on form. Put more technically, criteria one and two are sociolinguistically oriented, and the third is linguistically oriented.
Principles for Good Assessment
Another important question about assessment is: which principles provide the most essential, fundamental "structure" of assessment knowledge and skills that result in effective educational practices and improved student learning? McMillan (2000) sets out the principles as follows.
1. Assessment is inherently a process of professional judgment.
2. Assessment is based on separate but related principles of measurement evidence and evaluation.
3. Assessment decision-making is influenced by a series of tensions.
4. Assessment influences student motivation and learning.
5. Assessment contains error.
6. Good assessment enhances instruction.
7. Good assessment is valid.
8. Good assessment is fair and ethical.
9. Good assessments use multiple methods.
10. Good assessment is efficient and feasible.
11. Good assessment appropriately incorporates technology.
Testing: Why and How
Testing is certainly not the only way to assess students, but there are many good reasons for including a test in our language course, as stated by Frost (2004).
1. A test can give the teacher valuable information about where the students are in their learning and can affect what the teacher will cover next. It can help a teacher decide whether her teaching has been effective and highlight what needs to be reviewed. Testing can be as much an assessment of the teaching as of the learning.
2. Tests can give students a sense of accomplishment as well as information about what they know and what they need to review.
3. Tests can also have a positive effect in that they encourage students to review material covered on the course.
However, Frost (2004) also notes why testing doesn't work. According to him, there are many arguments against using tests as a form of assessment:
1. Some students become so nervous that they can't perform and don't give a true account of their knowledge or ability.
2. Other students can do well with last-minute cramming despite not having worked throughout the course.
3. Once the test has finished, students can just forget all that they had learned.
4. Students become focused on passing tests rather than learning to improve their language skills.
Frost (2004) admits that using only tests as a basis for assessment has obvious drawbacks. They are 'one-off' events that do not necessarily give an entirely fair account of a student's proficiency. As we have already mentioned, some people are more suited to them than others. There are other alternatives that can be used instead of or alongside tests.
1. Continuous assessment
Teachers give grades for a number of assignments over a period of time. A final grade is decided on a combination of assignments.
2. Portfolio
A student collects a number of assignments and projects and presents them in a file. The file is then used as a basis for evaluation.
3. Self-assessment
The students evaluate themselves. The criteria must be carefully decided upon beforehand.
4. Teacher's assessment
The teacher gives an assessment of the learner for work done throughout the course including classroom contributions.
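For the continuous assessment option above, the idea that a final grade is decided on a combination of assignments can be sketched as a weighted average. The assignment names, marks, and weights below are hypothetical, and a weighted average is only one possible way of combining grades; Frost (2004) does not specify a formula.

```python
def final_grade(marks, weights):
    """Combine assignment marks into one grade using a weighted average."""
    if set(marks) != set(weights):
        raise ValueError("every assignment needs exactly one weight")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must add up to a positive number")
    weighted_sum = sum(marks[name] * weights[name] for name in marks)
    return weighted_sum / total_weight


# Hypothetical marks (out of 100) collected over a term.
marks = {"essay": 70, "presentation": 80, "project": 90}

# Hypothetical weights: the project counts twice as much as the others.
weights = {"essay": 1, "presentation": 1, "project": 2}

print(final_grade(marks, weights))
```

Setting all weights to the same value reduces this to a simple average, so the weights make explicit how much each assignment contributes to the final grade.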
To summarize, what is most essential about assessment is understanding how general, fundamental assessment principles and ideas can be used to enhance student learning and teacher effectiveness. This will be achieved as teachers and administrators learn about conceptual and technical assessment concepts, methods, and procedures, for both large-scale and classroom assessments, and apply these fundamentals to instruction. Finally, an understanding of the assessment principles will guide the assessment training and professional development of teachers and administrators so that they can run more productive assessment.
References
Burhan, Akhyar. 2009. Second Language Teaching and Linguistics. Palembang: Grafika Telindo Press.
Cameron, Lynne. 2001. Teaching Languages to Young Learners. UK: Cambridge
Frost, Richard. 2004. Testing and assessment. Retrieved May 27, 2010, from
Hughes, Arthur. 2000. Testing for Language Teachers. UK: Cambridge
Kizlik, Bob. 2010. Measurement, Assessment, and Evaluation in Education. Retrieved May 27, 2010, from http://www.adprima.com/measurement.htm.
McMillan, James H. 2000. Fundamental assessment principles for teachers and school administrators. Practical Assessment, Research & Evaluation, 7(8). Retrieved June 6, 2010, from http://PAREonline.net/getvn.asp?v=7&n=8.
Scanlan, Craig L. 2003. Assessment, Evaluation, Testing and Grading. Retrieved May 27, 2010, from