Table of contents
Acknowledgement
List of abbreviations
Part I: Introduction
1.Rationale
2.Aims of the study
3.Scope of the study
4.Methods of the study
5.Design of the study
Part II: Development
Chapter one: Literature review
1.1.Language testing
1.2.Communicative language tests
1.3.Testing reading skills
1.3.1.Multiple choice questions
1.3.2.Short answer questions
1.3.3.Cloze
1.3.4.Selective deletion gap filling
1.3.5.C tests
1.3.6.Cloze elide
1.3.7.Information transfer
1.3.8.Jumbled sentences
1.3.9.Matching
1.3.10.Jumbled paragraphs
1.4.Major characteristics of a good test
1.4.1.Reliability
1.4.2.Validity
1.4.2.1.Content validity
1.4.2.2.Face validity
1.4.2.3.Criterion-related validity
1.4.2.4.Construct validity
1.4.3.Practicality
1.4.4.Discrimination
1.5.Achievement tests
1.5.1.Class progress test
1.5.2.Final achievement test
Summary
Chapter two: Methodology
2.1.A quantitative study
2.2.The selection of participants
2.3.The materials
2.4.Methods of data collection and data analysis
2.5.Limitations of the research
Summary
Chapter three: Discussion
3.1-The content area of the test
3.2-The relative weights of the different parts of the test
3.3-Constructing the test
3.4-Administering the test
3.5-Marking the test
3.6-Test scores interpreting and evaluation
3.6.1.The frequency distribution
3.6.2.The central tendency
3.6.2.1.The mode
3.6.2.2.The median
3.6.2.3.The mean
3.6.3.The dispersion
3.6.3.1.The low-high
3.6.3.2.The range
3.6.3.3.The standard deviation
3.7-Test item analysis and evaluation
3.7.1.Item difficulty
3.7.2.Item discrimination
3.8.Estimating reliability
Summary
Part III: Conclusion and recommendations
References
Appendices
2.1.A quantitative study
Like qualitative research, quantitative research comes in many approaches including descriptive, correlational, exploratory, quasi-experimental, and true-experimental techniques.
As a teacher of Civil Engineering English, I designed this reading test to understand better how things were really operating in my own classroom and to describe my learners’ performance in the reading skill. After a 120-period reading course, 50 students were chosen from three different classes (XD501, XD502, XD503) to do a reading test within the given time (60 minutes). The results collected from the test papers were then described in different terms using the descriptive statistics technique. The correlational research technique was also used to find the reliability coefficient later in the study.
2.2.The selection of Participants
The students at Haiphong Private University mainly come from different towns and cities in the North of Vietnam. They are generally aged between 18 and 22, or older.
At the university, they study for eight terms over four years. The students are classified into majors and non-majors of English. The latter have to learn a foreign language, in this case English, for only two of their four years. In the first three terms, they study General English (GE), and in the fourth term English for Specific Purposes (ESP). After two years of learning English, they are required to be able to read and translate ESP materials at intermediate level. However, students often have varying English levels prior to the course because at secondary school they learned different languages, including Russian, French, and Chinese. It is therefore important for teachers to apply appropriate methods in teaching them GE as well as ESP to help them become more proficient. It is also critical that teachers give them suitable tests which meet both their needs and the requirements of society.
2.3.The Materials
During the first three terms, the CE students are required to learn all 15 units in Elementary Headway and the first 8 units in Pre-intermediate Headway. These three terms comprise 225 periods in all, 75 periods per term. In the fourth term, they study 120 periods of ESP using the 15-unit textbook on English for Civil Engineering.
2.4.Methods of data collection and data analysis
To collect data for the research, a 34-item test of Civil Engineering English reading was administered to 50 students of the Construction Department. These non-majors did the test within the given time frame (60 minutes). The test papers were then collected, marked, analysed, and interpreted. This showed how many students did the test well, how many performed badly, which scores were most frequent, how the scores ranged, how far the scores deviated from the mean, etc.
2.5.Limitations of the research
As in any other study, some limitations could not be avoided in this one. Firstly, because of limited time and ability, the author could design only one reading test, conducted on 50 students, which might not be a large number. Yet it is hoped that the results are reliable and valid enough for the researcher to make inferences and come to certain conclusions. Secondly, instead of designing different types of test, the author was able to make only one type, namely an achievement test measuring the progress her students had made in reading skills after taking the course of English for Civil Engineering in their last term of the 2004-2005 school year. From the results, the author could also measure the effectiveness of her teaching.
Summary
This chapter has given a brief account of the quantitative study, in which the author used descriptive statistics and the correlational technique to analyse the data. The selection of participants and the materials were then described. Finally, the methods of data collection and data analysis were briefly introduced, followed by the limitations of the research.
Chapter three: Discussion
This chapter first discusses the content area of the test, how the test was divided, and how it was constructed and marked. Afterwards, the overall test results and each test item are analysed and interpreted. Finally, the author evaluates the test based on the four criteria of a good test mentioned earlier.
3.1. The content area of the test
The following topic checklist of the course book indicates the content area of the reading test.
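The topic checklist of the course book

| Topic | Material | Number of unit/ page |
|---|---|---|
| Architectural composition | English of Civil Engineering | Unit 1-p.1 |
| Skeleton construction | English of Civil Engineering | Unit 2-p.6 |
| Concrete, reinforced concrete, prestressed concrete | English of Civil Engineering | Unit 3-p.10 |
| Ultimate carrying capacity and factor of safety | English of Civil Engineering | Unit 4-p.14 |
| Pre-cast products | English of Civil Engineering | Unit 5-p.20 |
| Breakwaters | English of Civil Engineering | Unit 6-p.24 |
| Conveying, placing, compacting, and curing | English of Civil Engineering | Unit 7-p.1 (Book 2) |
| Concrete and strength test | English of Civil Engineering | Unit 8-p.8 |
| Asphalt concrete | English of Civil Engineering | Unit 9-p.12 |
| Materials and properties | English of Civil Engineering | Unit 10-p.20 |
| Structure | English of Civil Engineering | Unit 11-p.26 |
| Location | English of Civil Engineering | Unit 12-p.29 |
| Actions and sequences | English of Civil Engineering | Unit 13-p.32 |
| Arch and arch beam bridges | English of Civil Engineering | Unit 14-p.35 |
| Shear forces and bending moments in beams | English of Civil Engineering | Unit 15-p.35 |
| Matrix methods in the calculation of structure | English of Civil Engineering | Unit 16-p.42 |
| The hinge | English of Civil Engineering | Unit 17-p.46 |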
3.2.The relative weights of the different parts of the test
The test is composed of 5 parts, and the weighting of each part is illustrated in the following table:
Test of reading

| Part | Input | Response/ Item type | Scores | Weighting |
|---|---|---|---|---|
| 1 | Factual text, approx. 120 words | 5 comprehension questions | 10 | 20% |
| 2 | 5 word columns | Matching to make 5 sentences | 10 | 20% |
| 3 | 10 jumbled sentences | Rearranging | 10 | 20% |
| 4 | 10 statements | True / False | 10 | 20% |
| 5 | Factual text, approx. 120 words with 5 blanks | Blank filling | 10 | 20% |
3.3.Constructing the test
To construct the reading test used in this research, the author went through the following procedures:
Statement of the problem
There was a need for this achievement test to be administered at the end of the training course in reading Civil Engineering English (the students are graduates). The test was intended to find out what progress had been made after 120 periods of study and what the greatest learning difficulties were that the students still had at the end of the course, so that future courses could give more attention to these areas. Backwash was considered important: the test should encourage the practice of the reading skills that the students need in their university study. The time allowed was one hour.
Specifications
CONTENT
Types of text: The academic texts were from the course book entitled ‘English for Civil Engineering’. One sample text is provided in Appendix 1.
Addressees: Non-native speaker university students at HPU, or more specifically non-majors of CE at HPU.
Topics: The topics were suitable for the candidates and the type of test, and the subject area was neutral.
Operation: The test had 5 tasks. The candidates had to scan to locate specific information, match words/ phrases to make correct statements, arrange words/ phrases to make complete sentences, decide whether the given statements were true or false, and finally fill blanks with the given words.
FORMAT AND TIMING
Scanning
-1 passage of about 120 words, with 5 short answer items in the order in which the relevant information appears in the text. Responses were controlled.
Time: 10 minutes.
Detailed reading
-5 columns of words. Responses were controlled.
Time: 10 minutes.
-10 jumbled sentences to be rearranged. Responses were controlled.
Time: 20 minutes.
-10 statements to be marked T or F. Responses were controlled.
Time: 10 minutes.
-1 passage of about 120 words with 5 gaps to be filled. Responses were expected.
Time: 10 minutes.
CRITICAL LEVELS OF PERFORMANCE
All test items were written such that any student completing the course successfully would be able to respond correctly to all of them. Allowing for ‘performance errors’ on the part of candidates, a critical level of 80 percent was set. The students reaching this level would be the ones succeeding in terms of the course’s objectives.
SCORING PROCEDURES
There was a detailed key and the scoring was completely objective.
SAMPLING
The texts were chosen from a variety of topics in the course book. Draft items were written before the test was officially used.
ITEM WRITING AND MODERATION
All the items in the test were based on a consideration of what a competent non-major would be able to obtain from the texts. Considerable time was set aside for moderation and rewriting of items.
KEY
There was a detailed key for the test results. The key is provided in Appendix 2.
After the above procedures had been followed, the test was designed as follows:
Haiphong Private University Achievement test
Testee's full name:..................................................... Skill: Reading
Mark:
Level: Intermediate
Time allowed: 60 minutes
Question 1: Read the following passage then answer the questions given below
Conveying devices may be wheelbarrow, bottom dump bucket, dump truck. If necessary concrete may be pumped through hoses and steel pipelines. The mode of transport depends on the quality of concrete to be placed, the equipment available and other factors. The method employed must prevent the separation of the materials, called segregation, and insure that concrete of good quality is deposited in the form.
The forms are made of timber or metal of a size and shape suitable for the finished work. They must be of sufficient strength and rigidity to support the wet material and allow it to be properly compacted. They are so constructed that they may be easily removed when the concrete has hardened. The interior of the forms must be oiled or soaped to prevent the concrete from adhering to the forms.
1-What are the conveying devices mentioned in the passage?
...................................................................................................................................
2-What does the mode of transport depend on?
...................................................................................................................................
3-What are the forms made of?
...................................................................................................................................
4-Why must the forms be of sufficient strength and rigidity?
...................................................................................................................................
5-What must we do with the interior of the forms before placing?
...................................................................................................................................
Question 2: Use the words/ phrases given below to make sentences describing the properties of materials.
Steel
Stone
Glass wool
Brick
has the property of
high tensile strength
good sound isolation
good thermal isolation
high compressive strength
This means
it can resist high compressive forces
it can resist high tensile forces
it does not transmit heat easily
it does not transmit sound easily
6-...............................................................................................................................
...............................................................................................................................
7-...............................................................................................................................
...............................................................................................................................
8-...............................................................................................................................
...............................................................................................................................
9-...............................................................................................................................
...............................................................................................................................
Question 3: Arrange the following words/ phrases to make complete sentences
10-fire/ weather/ has/ high/ concrete/ resistance/ and.
...............................................................................................................................
11-low cost/ durable/ is/ and/ concrete/ pre-cast/ at.
...............................................................................................................................
12-made/ different/ materials/ from/ concrete/ is.
...............................................................................................................................
13-solid/ reinforcement/ is/ widely/ in/ spaced/ slabs.
...............................................................................................................................
14-aggregate/ 20mm/ 40mm/ coarse/ in/ ranges/ to/ size/ from.
...............................................................................................................................
15-during/ segregate/ conveying/ may/ concrete.
...............................................................................................................................
16-be/ spread/ shall/ mixture/ by/ asphalt/ paver.
...............................................................................................................................
17-elastic/ clay/ rubber/ is/ plastic/ but/ is.
...............................................................................................................................
18-a/ done/ is/ concrete/ mixer/ mixing/ in.
...............................................................................................................................
19-vibrators/ driven/ be/ by/ electricity/ air/ or/ compressed/ may.
...............................................................................................................................
Question 4: Use your knowledge of the subject to decide whether the following statements are true or false. (Write T or F)
20-Glass wool is a heavy material.
21-Rubber cannot be stretched or compressed.
22-Concrete is a light material so it is easy to lift.
23-We can burn wood because it is combustible.
24-Mild steel can resist corrosion.
25-Rubber is plastic while clay is elastic.
26-Because copper is a good conductor of heat, heat can be easily transferred through it.
27-We can easily scratch glass because it is soft.
28-Concrete cannot be burnt because it is non-combustible.
29-Stainless steel is corrosion-resistant.
Question 5: Fill each blank below with ONE of the given words.
multi-story minimum timber
architecture maximum possible
impossible low steel
The modern skeleton structure is the result of rational use of steel and concrete in building. Among its characteristic features are the reduction of all load-carrying members to (30) ................. sizes and clear division between structural and non-structural elements. The skeleton is composed of rigidly connected beams and columns. It is a particularly suitable form for (31) ................. buildings. The great strength of modern building materials makes it (32) ................. to build higher and higher, to meet today’s ever-increasing demands. The pattern of our large cities is being determined by skeleton structures of steel and concrete just as decisively as the pattern of medieval cities was determined by the (33) ................. frame. Widespread use has made the modern skeleton structure a central theme of contemporary (34) .................
3.4.Administering the test
In order to accomplish the two purposes of administering this reading test (collecting feedback to assess the usefulness of the reading course, and making inferences about the test takers’ language ability), it was necessary to have some control over the procedures for administering it. These involved guiding the test takers through the following process of taking the test:
Preparing the testing environment
The first step in the test administration was preparing the testing environment to be consistent with the specifications in the test blueprint. This involved arranging the place of testing (rooms C101 and C102), the materials (50 test papers) and equipment (fans, tables and desks for the students, chairs for the examiners, lights), personnel (2 examiners), time of testing (60 minutes), and the physical conditions under which the test was administered. The weather at that time was quite good for the students to do the test.
Communicating the instructions
‘The second step in administering the test was to give the instructions in such a way that they would be understood by all the test takers. When administering the test it is essential that the test takers receive the full benefit of the instructions’ (Bachman and Palmer, 1981: 233). This included the obvious steps of providing suitable conditions (time, lighting, lack of distraction) for reading written instructions with the help of the two examiners.
Maintaining a supportive environment
The next step was maintaining a supportive testing environment throughout the test. This included avoiding distractions due to temperature, noise, excessive movement, etc.
Collecting the tests
The final step in the test administration was collecting the tests. The test papers were collected by the examiners after the allowed time in each testing room was over. While they were being collected, the test takers left at their own pace.
3.5. Marking the test
After they were collected, the test papers were marked according to the band scores on the 0-10 scale officially approved by the HPU board of examiners.
3.6. Test scores interpreting and evaluation
The results of language tests are most often reported as numbers or scores, and it is these scores, ultimately, that test users will make use of. The test scores of the 50 student participants were interpreted and analysed. This analysis provides a summary of how the students did on the test, a check on the test’s reliability, and some idea of how dependable the test scores were. The following steps outline how such an analysis can be conducted.
3.6.1.The frequency distribution:
A frequency distribution is a record of testees’ scores ranging from the lowest to the highest marks in a test. Raw marks are marks awarded by counting the number of correct answers on a test. The frequency distribution of the reading test that the author conducted is presented in the diagram below:
(It is essential to remember that the total score of the test is 50; however, after marking, the total score each student got was divided by 5 to suit the 0-10 scale previously approved by the board of examiners.)
The diagram above can be seen as self-explanatory: the vertical dimension indicates the number of candidates scoring within a particular range of scores; the horizontal dimension shows what these ranges are.
Looking at the diagram, it can be clearly seen that the students got different marks ranging from 1 to 9, i.e. the lowest score was 1 and the highest was 9. The chart also shows that the scores were distributed quite unevenly: for example, no student got marks of 1.5, 2.5, or 8.5, and the most frequent score was 5.5. It also shows the outcome of the test clearly: the students who got marks of 5, 5.5, 6, 6.5, 7, 7.5, 8, and 9 passed, and those with marks under 5 failed.
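As a rough illustration of how such a distribution can be inspected, the following Python sketch prints a text histogram and counts passes and fails. The frequencies are copied from the table in section 3.6.2.3; the names `FREQUENCIES`, `PASS_MARK`, `text_histogram`, and `pass_fail` are hypothetical and are not part of the study itself.

```python
# Number of students at each mark on the 0-10 scale, copied from the
# frequency table in section 3.6.2.3.
FREQUENCIES = {
    1.0: 1, 1.5: 0, 2.0: 2, 2.5: 0, 3.0: 1, 3.5: 2, 4.0: 4, 4.5: 2, 5.0: 7,
    5.5: 12, 6.0: 4, 6.5: 1, 7.0: 6, 7.5: 4, 8.0: 3, 8.5: 0, 9.0: 1,
}
PASS_MARK = 5.0  # marks of 5 and above pass, as stated in the text


def text_histogram(freq):
    """Return one line per mark: the mark, one '#' per student, and the count."""
    return [
        f"{mark:>4} | {'#' * count} ({count})"
        for mark, count in sorted(freq.items())
    ]


def pass_fail(freq, pass_mark=PASS_MARK):
    """Count how many students reached the pass mark and how many did not."""
    passed = sum(count for mark, count in freq.items() if mark >= pass_mark)
    return passed, sum(freq.values()) - passed


if __name__ == "__main__":
    print("\n".join(text_histogram(FREQUENCIES)))
    print("passed, failed:", pass_fail(FREQUENCIES))
```

Keeping the data as a mark-to-frequency mapping mirrors the way the scores are tabulated later in the chapter, so the same dictionary can be reused for the other statistics.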
3.6.2.The central tendency
A convenient way of summarizing data is to find a single statistic, called the CENTRAL TENDENCY, which represents an entire set of numbers. Central tendency can be defined as ‘the propensity of a set of numbers to cluster around a particular value’ (Brown and Rodgers, 2002: 128). Three statistics are often used to find central tendency: the mode, the median, and the mean.
3.6.2.1.The mode
The MODE is the value in a set of numbers that occurs most frequently. In a way, the mode is the simplest of the three central tendency statistics discussed here because it requires no computation. In this case the mode is 5.5 because it is the most frequent value (12 of the 50 students obtained it).
3.6.2.2.The median
The MEDIAN is the point in the distribution below which 50% of the values lie and above which 50% lie. To find the median for this case, first place the 50 marks in order from low to high, then find the value in the middle of the ordered list. Here the 25th and 26th marks are both 5.5, so the median is 5.5.
3.6.2.3.The mean
The most widely used measure of central tendency is the MEAN, which is more commonly called the AVERAGE. The mean is the sum of all the values in a distribution divided by the total number of values (50).
The formula for the mean is:
$$M = \frac{\sum fx}{N}$$
where: M = mean
Σ = sum of (or add up)
N = the number of the scores
x = the raw score
f = the frequency with which a score occurs
Using the formula above we have:
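| x | f | xf |
|---|---|---|
| 1 | 1 | 1 |
| 1.5 | 0 | 0 |
| 2 | 2 | 4 |
| 2.5 | 0 | 0 |
| 3 | 1 | 3 |
| 3.5 | 2 | 7 |
| 4 | 4 | 16 |
| 4.5 | 2 | 9 |
| 5 | 7 | 35 |
| 5.5 | 12 | 66 |
| 6 | 4 | 24 |
| 6.5 | 1 | 6.5 |
| 7 | 6 | 42 |
| 7.5 | 4 | 30 |
| 8 | 3 | 24 |
| 8.5 | 0 | 0 |
| 9 | 1 | 9 |

Σxf = 276.5, so M = 276.5 / 50 = 5.53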
From the above analysis, the mean is approximately 5.5 and the median is 5.5, so the mean and the median correspond closely. Compared with the results the students got in previous terms, this is acceptable: when studying General English, their exam scores were a little higher (the median and the mean generally ranged from 6 to 7). There are two reasons for this. Firstly, they had a longer time to get in touch with General English (at least 225 periods). Secondly, General English was not so hard. Therefore, with a mean and a median of about 5.5, the test results are quite satisfactory.
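The three measures of central tendency can be cross-checked with a short Python sketch that uses the standard `statistics` module. The frequencies are those tabulated in this section; the helper names `expand` and `central_tendency` are hypothetical, and this is only one possible way of doing the calculation.

```python
from statistics import mean, median, multimode

# Number of students at each mark, copied from the frequency table above.
FREQUENCIES = {
    1.0: 1, 1.5: 0, 2.0: 2, 2.5: 0, 3.0: 1, 3.5: 2, 4.0: 4, 4.5: 2, 5.0: 7,
    5.5: 12, 6.0: 4, 6.5: 1, 7.0: 6, 7.5: 4, 8.0: 3, 8.5: 0, 9.0: 1,
}


def expand(freq):
    """Turn the frequency table back into the ordered list of individual marks."""
    return [mark for mark, count in sorted(freq.items()) for _ in range(count)]


def central_tendency(freq):
    """Return the mode(s), median, and mean of the marks in a frequency table."""
    marks = expand(freq)
    return {
        "modes": multimode(marks),  # every mark that shares the top frequency
        "median": median(marks),    # middle of the ordered marks
        "mean": mean(marks),        # sum of all marks divided by their number
    }


if __name__ == "__main__":
    print(central_tendency(FREQUENCIES))
```

Expanding the table into individual marks keeps the sketch close to the verbal definitions (order the values, find the middle, add them up), at the cost of a little extra memory.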
3.6.3.The dispersion
Knowing about the central tendency of a set of numbers is a highly helpful way of characterizing the most typical behavior in a group. It doesn’t, however, tell us anything about the way the numbers spread out around that central or typical behavior. To know such a thing we need to find out the dispersion, which can be defined as ‘the degree to which the individual numbers vary away from the central tendency’ (Brown and Rodgers, 2002: 130). There are three primary ways of examining dispersion: the low-high, the range, and the standard deviation.
3.6.3.1.The low-high
The LOW-HIGH involves finding the lowest value and the highest value in a set of numbers. When looking at the marks the testees got and by putting the numbers in order from high to low, we can see immediately that the lowest value was 1 and the highest value was 9. Thus, the low-high is 1-9.
3.6.3.2.The range
The RANGE is the difference between the highest and the lowest scores, i.e. it is the highest value minus the lowest. The formula for the range of the reading test results is written as follows:
$$\text{Range} = H - L$$
where: H = highest value
L = lowest value
The range of the test results = 9 - 1 = 8.
Such a big range shows that there was a wide range of abilities among the testees.
3.6.3.3.The standard deviation (SD)
The best overall indicator of the dispersion of the reading test scores is the STANDARD DEVIATION. It is the degree to which the group of scores deviates from the mean. Brown (1988: 69) defined it as ‘a sort of average of the differences of all scores from the mean’. The standard deviation is ‘a sort of average’ because you are averaging some values by adding them up and dividing by the number of values, just as you did in calculating the mean. So the calculation starts with adding up the squared differences between each value and the mean (5.5) and dividing by the number of test takers (50); the standard deviation is the square root of the result:
$$SD = \sqrt{\frac{\sum (X - M)^2}{N}}$$
where: SD: standard deviation
X: values
M: the mean of the values
N: the number of the values
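Because several students obtained the same mark, each squared difference is multiplied by the frequency f of that mark (taken from the frequency table in 3.6.2.3) before summing, so that all 50 scores are counted:

| Value (X) | f | Difference (D = X - 5.5) | Squared difference (D²) | f × D² |
|---|---|---|---|---|
| 1 | 1 | -4.5 | 20.25 | 20.25 |
| 1.5 | 0 | -4 | 16 | 0 |
| 2 | 2 | -3.5 | 12.25 | 24.5 |
| 2.5 | 0 | -3 | 9 | 0 |
| 3 | 1 | -2.5 | 6.25 | 6.25 |
| 3.5 | 2 | -2 | 4 | 8 |
| 4 | 4 | -1.5 | 2.25 | 9 |
| 4.5 | 2 | -1 | 1 | 2 |
| 5 | 7 | -0.5 | 0.25 | 1.75 |
| 5.5 | 12 | 0 | 0 | 0 |
| 6 | 4 | 0.5 | 0.25 | 1 |
| 6.5 | 1 | 1 | 1 | 1 |
| 7 | 6 | 1.5 | 2.25 | 13.5 |
| 7.5 | 4 | 2 | 4 | 16 |
| 8 | 3 | 2.5 | 6.25 | 18.75 |
| 8.5 | 0 | 3 | 9 | 0 |
| 9 | 1 | 3.5 | 12.25 | 12.25 |

ΣfD² = 134.25
As seen above, the standard deviation is the square root of the variance: SD = √(134.25 / 50) ≈ 1.64. The standard deviation (SD) is a very powerful measure of ‘dispersion’. In this case we have a large standard deviation (1.64), which shows the following:
-the score distribution of the test was wide.
-the test spread the students out.
-there was a wide range of ability among the testees.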
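The three dispersion measures can be sketched in Python as follows. This is a minimal illustration, assuming the population form of the standard deviation (dividing by N, as in the description above) and weighting each mark by its frequency from the table in 3.6.2.3 so that all 50 scores are counted. It uses the unrounded mean rather than 5.5. The function name `dispersion` is hypothetical.

```python
from math import sqrt

# Number of students at each mark, copied from the frequency table in 3.6.2.3.
FREQUENCIES = {
    1.0: 1, 1.5: 0, 2.0: 2, 2.5: 0, 3.0: 1, 3.5: 2, 4.0: 4, 4.5: 2, 5.0: 7,
    5.5: 12, 6.0: 4, 6.5: 1, 7.0: 6, 7.5: 4, 8.0: 3, 8.5: 0, 9.0: 1,
}


def dispersion(freq):
    """Return the low-high, the range, and the standard deviation of the marks."""
    observed = [mark for mark, count in freq.items() if count > 0]
    low, high = min(observed), max(observed)
    n = sum(freq.values())
    mean = sum(mark * count for mark, count in freq.items()) / n
    # Each squared difference is counted once per student who got that mark.
    variance = sum(count * (mark - mean) ** 2 for mark, count in freq.items()) / n
    return {"low": low, "high": high, "range": high - low, "sd": sqrt(variance)}


if __name__ == "__main__":
    print(dispersion(FREQUENCIES))
```

Marks with a frequency of zero are excluded when finding the lowest and highest values, since no student actually obtained them.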
3.7.Test item analysis and interpretation
The results obtained from the test can be used to provide valuable information concerning:
+ the performance of the students as a group,
+ the performance of individual students,
+ the performance of each of the items comprising the test, i.e. the difficulty level and the level of discrimination.
Therefore all the 34 items of the reading test were analysed in terms of item difficulty and item discrimination as follows:
3.7.1.The item difficulty
The item difficulty (the index of difficulty or facility value = FV) of an item ‘shows how easy or difficult the particular item proved in the test’ (Heaton, 1988: 175).
The formula of item difficulty (FV) is:
$$FV = \frac{R}{N}$$
where: R: the number of correct answers
N: the number of the testees
i.e. Level of difficulty=proportion of students getting it right= the average score on this item.
*Note: the FV value does not tell us who got it right. It tells us nothing about discrimination.
The scales for item difficulty are:
ve (very easy) with FV = 0.81 to 1 (i.e. 81 to 100% of the students got it right)
e (easy) with FV = 0.61 to 0.8 (i.e. 61 to 80% of the students got it right)
ok with FV = 0.41 to 0.6 (i.e. 41 to 60% of the students got it right)
d (difficult) with FV = 0.21 to 0.4 (i.e. 21 to 40% of the students got it right)
vd (very difficult) with FV = 0 to 0.2 (i.e. 0 to 20% of the students got it right)
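The calculation and the scale can be sketched in Python as below. The function names `facility_value` and `difficulty_band` are hypothetical. The sketch assumes that the upper bound of each band is inclusive, so a value between two listed bands, such as 0.805, falls into the lower one.

```python
def facility_value(correct_answers, testees):
    """FV = R / N: the proportion of testees who answered the item correctly."""
    return correct_answers / testees


def difficulty_band(fv):
    """Map a facility value onto the five-band scale listed above."""
    if fv > 0.8:
        return "ve"  # very easy
    if fv > 0.6:
        return "e"   # easy
    if fv > 0.4:
        return "ok"
    if fv > 0.2:
        return "d"   # difficult
    return "vd"      # very difficult


if __name__ == "__main__":
    # Example: an item answered correctly by 30 of the 50 testees.
    fv = facility_value(30, 50)
    print(fv, difficulty_band(fv))
```

With 50 testees every facility value is a multiple of 0.02, so no item can fall in the gaps between the listed bands and the boundary assumption does not affect the classification.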
The calculation for the item difficulty is presented in the table below:
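Item difficulty

| Item | R | FV | Conclusion |
|---|---|---|---|
| 1 | 42 | 0.84 | ve |
| 2 | 32 | 0.64 | e |
| 3 | 44 | 0.88 | ve |
| 4 | 34 | 0.68 | e |
| 5 | 32 | 0.64 | e |
| 6 | 45 | 0.90 | ve |
| 7 | 25 | 0.50 | ok |
| 8 | 41 | 0.82 | ve |
| 9 | 22 | 0.44 | ok |
| 10 | 33 | 0.66 | e |
| 11 | 23 | 0.46 | ok |
| 12 | 32 | 0.64 | e |
| 13 | 20 | 0.40 | d |
| 14 | 32 | 0.64 | e |
| 15 | 49 | 0.98 | ve |
| 16 | 31 | 0.62 | e |
| 17 | 30 | 0.60 | ok |
| 18 | 24 | 0.48 | ok |
| 19 | 15 | 0.30 | d |
| 20 | 41 | 0.82 | ve |
| 21 | 16 | 0.32 | d |
| 22 | 46 | 0.92 | ve |
| 23 | 32 | 0.64 | e |
| 24 | 20 | 0.40 | d |
| 25 | 24 | 0.48 | ok |
| 26 | 16 | 0.32 | d |
| 27 | 31 | 0.62 | e |
| 28 | 8 | 0.16 | vd |
| 29 | 19 | 0.38 | d |
| 30 | 7 | 0.14 | vd |
| 31 | 7 | 0.14 | vd |
| 32 | 6 | 0.12 | vd |
| 33 | 12 | 0.24 | d |
| 34 | 7 | 0.14 | vd |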
From the results in the table above, it is clearly seen that items 1, 3, 6, 8, 15, 20, and 22 were very easy since their index of difficulty was more than 0.8, i.e. at least 81% of the students taking the test answered them correctly. Items 2, 4, 5, 10, 12, 14, 16, 23, and 27 could be seen as easy since their index of difficulty ranged from 0.61 to 0.8. With FV ranging from 0.41 to 0.6, items 7, 9, 11, 17, 18, and 25 were all right for the students. A few items were difficult (items 13, 19, 21, 24, 26, 29, and 33, with FV ranging from 0.21 to 0.4) or very difficult (items 28, 30, 31, 32, and 34, with FV ranging from 0 to 0.2).
3.7.2. The item discrimination
Item discrimination (D) indicates the extent to which the item discriminates between the testees, separating the more able testees from the less able.
The formula of item discrimination is:
$$D = \frac{CU - CL}{N}$$
where CU: the number of the correct answers of the upper half
CL: the number of the correct answers of the lower half
N: the number of the testees (50)
The scales for item discrimination are:
gd (good discrimination) with D = 0.6 to 1
md (medium discrimination) with D = 0.3 to 0.59
bd (bad discrimination) with D = 0 to 0.29
bi (bad item) with D < 0
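The index and its scale can be sketched in Python as below. The sketch assumes D = (CU - CL) / N with N the total number of testees, which is the form consistent with the values tabulated in this section. The function names `discrimination_index` and `discrimination_band` are hypothetical.

```python
def discrimination_index(correct_upper, correct_lower, testees):
    """D = (CU - CL) / N, assuming N is the total number of testees."""
    return (correct_upper - correct_lower) / testees


def discrimination_band(d):
    """Map a discrimination index onto the four-band scale listed above."""
    if d < 0:
        return "bi"  # bad item: the lower half outperformed the upper half
    if d >= 0.6:
        return "gd"  # good discrimination
    if d >= 0.3:
        return "md"  # medium discrimination
    return "bd"      # bad discrimination


if __name__ == "__main__":
    # Example: CU = 34 and CL = 10 among 50 testees.
    d = discrimination_index(34, 10, 50)
    print(round(d, 2), discrimination_band(d))
```

Checking the negative case first keeps the "bad item" band separate from the "bad discrimination" band, which starts at zero.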
The conclusions about the item discrimination of the test are shown in the table below:
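Item discrimination

| Item | CU | CL | D | Conclusion |
|---|---|---|---|---|
| 1 | 15 | 27 | -0.24 | bi |
| 2 | 16 | 16 | 0 | bd |
| 3 | 34 | 10 | 0.48 | md |
| 4 | 19 | 15 | 0.08 | bd |
| 5 | 17 | 15 | 0.04 | bd |
| 6 | 29 | 16 | 0.26 | bd |
| 7 | 23 | 19 | 0.08 | bd |
| 8 | 19 | 22 | -0.06 | bi |
| 9 | 14 | 8 | 0.12 | bd |
| 10 | 17 | 16 | 0.02 | bd |
| 11 | 13 | 10 | 0.06 | bd |
| 12 | 24 | 8 | 0.32 | md |
| 13 | 11 | 9 | 0.04 | bd |
| 14 | 20 | 12 | 0.16 | bd |
| 15 | 25 | 14 | 0.22 | bd |
| 16 | 10 | 21 | -0.22 | bi |
| 17 | 22 | 8 | 0.28 | bd |
| 18 | 15 | 9 | 0.12 | bd |
| 19 | 8 | 7 | 0.02 | bd |
| 20 | 34 | 7 | 0.54 | md |
| 21 | 11 | 5 | 0.12 | bd |
| 22 | 21 | 15 | 0.12 | bd |
| 23 | 21 | 11 | 0.20 | bd |
| 24 | 13 | 7 | 0.12 | bd |
| 25 | 17 | 7 | 0.20 | bd |
| 26 | 13 | 3 | 0.20 | bd |
| 27 | 19 | 12 | 0.14 | bd |
| 28 | 3 | 5 | -0.04 | bi |
| 29 | 15 | 4 | 0.22 | bd |
| 30 | 6 | 1 | 0.10 | bd |
| 31 | 4 | 3 | 0.02 | bd |
| 32 | 5 | 1 | 0.08 | bd |
| 33 | 8 | 4 | 0.08 | bd |
| 34 | 5 | 2 | 0.06 | bd |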
D can range from +1 to -1. A D of +1 shows perfect correlation with the testees’ results on the whole test, whereas a D of -1 shows that the item discriminates in entirely the wrong way.
A high discrimination index shows that the item is at the right level of difficulty and discriminates well. A low discrimination index shows that the item discriminates poorly, for example because it is too difficult for everyone, both ‘good’ and ‘bad’. The results in the table above therefore show that most items had a low level of discrimination, except items 3, 12, and 20, which had a medium level of discrimination, and items 1, 8, 16, and 28, which are considered bad items since they had a negative index of discrimination, which means that they failed to tell which students were good or bad.
3.8. Estimating reliability
There are a number of ways to estimate a test’s reliability. Only the split-half method is presented here. The first step is to divide the test very carefully into two equivalent halves. Then, for each student, two scores are calculated: the score on the upper half and the score on the lower half. The more similar the two sets of scores are, the more reliable the test is, i.e. the ideal reliability coefficient is 1. A test with a reliability coefficient of 1 is one which would give precisely the same results for a particular set of candidates regardless of when it is administered. A test with a reliability coefficient of 0 would give sets of results quite unconnected with each other, and would fail to be a reliable one.
To find the reliability coefficient of the test used in this study, the author begins with the Spearman coefficient because it is conceptually the easiest to understand. The Spearman coefficient is often simply called Spearman rho and is symbolized by the Greek letter ρ. The equation for Spearman rho is as follows:
$$\rho = 1 - \frac{6 \sum D^2}{N(N^2 - 1)}$$
where: ρ = Spearman rho correlation coefficient
D = the difference between the ranks
N = the number of cases
The scales for reliability are:
0.8 → 1.0: strong correlation (i.e. high reliability)
0.6 → 0.8: medium correlation (i.e. medium reliability)
0.4 → 0.6: weak correlation (i.e. low reliability)
0.2 → 0.4: very weak correlation (i.e. very low reliability)
To find ρ we have the following table, in which:
S: Student
SU: Score on the upper half
SL: Score on the lower half
D: The difference between SU and SL
D²: Squared difference
S | SU | SL | D | D²
1 | 4 | 1 | 3 | 9
2 | 6 | 4 | 2 | 4
3 | 7 | 3.5 | 3.5 | 12.25
4 | 10 | 5 | 5 | 25
5 | 9.5 | 8 | 1.5 | 2.25
6 | 10 | 7.5 | 2.5 | 6.25
7 | 10 | 9.5 | 0.5 | 0.25
8 | 12 | 7.5 | 4.5 | 20.25
9 | 8.5 | 12 | -3.5 | 12.25
10 | 10.5 | 10 | 0.5 | 0.25
11 | 15 | 8 | 7 | 49
12 | 14.5 | 8 | 6.5 | 42.25
13 | 16 | 10 | 6 | 36
14 | 16 | 9 | 7 | 49
15 | 8.5 | 7 | 1.5 | 2.25
16 | 12.5 | 12 | 0.5 | 0.25
17 | 16.5 | 8 | 8.5 | 72.25
18 | 15.5 | 9 | 6.5 | 42.25
19 | 13.5 | 11 | 2.5 | 6.25
20 | 17 | 10 | 7 | 49
21 | 14 | 11 | 3 | 9
22 | 13 | 12 | 1 | 1
23 | 16 | 11 | 5 | 25
24 | 12 | 15 | -3 | 9
25 | 16 | 11 | 5 | 25
26 | 15 | 13 | 2 | 4
27 | 10 | 18 | -8 | 64
28 | 14 | 14 | 0 | 0
29 | 20 | 8 | 12 | 144
30 | 15 | 13 | 2 | 4
31 | 13 | 15 | -2 | 4
32 | 15 | 15 | 0 | 0
33 | 20 | 10 | 10 | 100
34 | 17 | 13 | 4 | 16
35 | 22 | 8 | 4 | 16
36 | 24 | 8 | 6 | 36
37 | 14 | 18 | -4 | 16
38 | 15 | 17 | -2 | 4
39 | 12 | 20 | -8 | 64
40 | 26 | 8 | 18 | 324
41 | 20 | 14 | 6 | 36
42 | 21 | 14 | 7 | 49
43 | 21 | 17 | 4 | 16
44 | 28 | 18 | 2 | 4
45 | 17 | 11 | 6 | 36
46 | 18 | 10 | 8 | 64
47 | 28 | 12 | 16 | 256
48 | 25 | 15 | 10 | 100
49 | 27 | 13 | 14 | 196
50 | 28 | 16 | 12 | 144
Σ D² = 2206.5
(The total score of the test is 50. It is not essential for each student’s score to be divided by 5)
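The calculation of the coefficient from the total of the squared differences can be sketched as follows. This is only an illustrative check: it assumes the standard form of Spearman rho, ρ = 1 − 6ΣD² / (N(N² − 1)), and uses the total ΣD² = 2206.5 and N = 50 students from the table. The function and variable names are hypothetical.

```python
# Minimal sketch: Spearman rho from the sum of squared differences.
# Assumption: the standard formula rho = 1 - 6 * sum(D^2) / (N * (N^2 - 1)).

def spearman_rho(sum_d_squared: float, n: int) -> float:
    """Return Spearman rho for n cases given the sum of squared differences."""
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))

SUM_D_SQUARED = 2206.5  # total of the D² column in the table
N_STUDENTS = 50         # number of students (cases)

rho = spearman_rho(SUM_D_SQUARED, N_STUDENTS)
print(round(rho, 3))  # 0.894
```

Rounded to one decimal place, this gives a split-half coefficient of 0.9.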
The strength of the relationship between the two sets of scores of the reading test is given by the correlation coefficient. When calculated with the equation above, this turns out to be 0.9. This coefficient relates to the two half tests. But the full test is of course twice as long as either half, and we know that the longer the test is, the greater its reliability will be. So the full reading test should be more reliable than the coefficient of 0.9 indicates. By means of a formula, it is possible to estimate the reliability of the whole test as follows:
$$\text{Reliability of whole test} = \frac{2 \times \text{coefficient for split half}}{1 + \text{coefficient for split half}}$$
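The correction for test length stated above, commonly known as the Spearman-Brown formula, can be checked with a few lines of code. This is a minimal sketch; the function name is hypothetical and the input is the split-half coefficient of 0.9 reported in the text.

```python
# Minimal sketch: estimating whole-test reliability from the split-half
# coefficient, using reliability = 2r / (1 + r).

def whole_test_reliability(split_half_coefficient: float) -> float:
    """Return the estimated reliability of the full-length test."""
    return (2 * split_half_coefficient) / (1 + split_half_coefficient)

full_reliability = whole_test_reliability(0.9)
print(round(full_reliability, 2))  # 0.95
```

Because the formula always returns a value at least as large as a positive split-half coefficient, the whole test is estimated to be more reliable than either half taken alone.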
Using the formula, we obtain a figure of 0.95, which indicates that the test was highly reliable. In conclusion, to judge whether the above reading test was a good one or not, the author will come back to the four criteria: reliability, validity, discrimination and practicality.
Once again, reliability and validity are critical for any test and are referred to as essential measurement qualities. There is a relationship between reliability and validity. On the one hand, a test may be reliable without being valid. On the other hand, if the test is not reliable, it cannot be valid at all. The above calculation of the reliability coefficient showed that the test was reliable, so it meets the necessary condition for validity, and it is now possible to regard the test as valid as well.
The next quality concerned is discrimination, that is, the capacity of the test to distinguish between students and so reflect differences in the performance of individuals in a group. The test items covered a wide range of difficulty, from ‘very easy’ to ‘very difficult’; that is, the test as a whole was neither too easy nor too difficult, so it could fulfil its purpose of discriminating between candidates.
Practicality was also considered throughout the development of the test and in preparing for its administration. Everything that had to be prepared, for example seating and marking, was available, and the test did not cost much or take much time.
From all the above discussion it is possible to say that this is a fairly good test though it is far from being a perfect one. For designing a better test in the future, some suggestions will be presented later at the end of the research.
Summary
In this chapter a reading test for the non-majors of CE at HPU was designed after the relative weights of the different parts of the test were clearly pointed out. Afterwards, the author presented the information regarding the administration and marking of the test. The test results were then interpreted in terms of dispersion, frequency distribution and central tendency. The reliability coefficient was also established so that the test could be regarded as a reliable one. Finally, each item of the test was analysed in terms of level of difficulty and level of discrimination, so that it is easier to tell whether each item was very easy, easy, average, difficult or very difficult for the testees and whether the item discriminated between students well or not.
Part III: Conclusion and recommendations
This research is aimed at designing and evaluating an appropriate English reading test for the non-majors of Civil Engineering at Haiphong Public University. It is composed of three parts: part I, part II, and part III.
Perhaps chapter three of part II is the most important one, since this chapter presents the information regarding the construction of a reading test. Moreover, the author also gave an account of all the necessary steps in the administration and marking of the test. After collecting the test results, the author interpreted and evaluated them in terms of frequency distribution, central tendency and dispersion. In more detail, she also analysed and interpreted each test item in terms of level of difficulty and discriminating power, which were presented in two separate tables. This was useful for deciding whether the test was suitable for the students or not. Additionally, the reliability coefficient was calculated to check whether this reading test satisfied all four criteria: reliability, validity, discrimination, and practicality, though it might not be a perfect test just yet. Once the test is considered appropriate, it can be used especially at Haiphong Public University, where English for Civil Engineering is still regarded as a new subject and few tests for it exist.
Like any other research, this one cannot avoid some limitations. Firstly, the test was limited to testing learners’ reading ability. Secondly, most of the test items did not show good discrimination, so the test may not be a perfect one. However, the research was carried out with great effort over a long time, and it has therefore been truly worthwhile for me. Besides many other components such as teaching materials and learning activities, tests are clearly part of educational programs and always serve pedagogical purposes: they give any teacher the chance to look back at his teaching, they can promote student learning, etc. Testing is certainly important; therefore, when designing a test we must pay attention to its usefulness by considering important qualities such as reliability, validity, practicality and discrimination with respect to specific tests, and not solely in terms of abstract theories and statistical formulae. Moreover, we must consider these qualities from the very beginning of the test planning and development process.
For a better test form I have some suggestions as follows:
Firstly, although the test can be kept at its present length or made shorter or longer depending on the time allowed, some test items should be changed. For instance, the number of very easy and very difficult items, such as items 1, 3, 6, 8, 15, 22, 28, 30, 31, 32 and 34, can be reduced to leave space for other item types.
Secondly, this achievement test was based on the syllabus-content approach since the subject matter was rather new and difficult for the students. The test was designed on the basis of what the students had already learnt in the course book. If the test had been syllabus-objective based, the students would have faced problems such as being tested on what they had not learnt or prepared for. However, it would be more favorable for a final achievement test to be based on the syllabus-objective approach. This is because it provides accurate information about individual and group achievement, and it is likely to promote a more beneficial backwash effect on teaching. There are two reasons for this: firstly, the tester must at least be clear about the course objectives when constructing the test, which enables the test to follow students’ achievement against those objectives; secondly, such a test can help to work against poor teaching practice, which syllabus-content tests fail to do.
Hopefully this research will be useful for those who are interested in designing their own tests, especially the reading ones.
References
Alderson, J. C., Clapham, C. and Wall, D. (1995). Language Test Construction and Evaluation. Cambridge: Cambridge University Press.
Bachman, L.F (1990). Fundamental Considerations in Language Testing. Oxford: Oxford University Press.
Bachman, L.F., and Palmer, A.S., (1981). Language Testing in Practice: Designing and Developing Useful Language Test. Oxford University Press.
Brown, J.D., and Rodger, T.S., (2002). Doing Second Language Research. Oxford University Press.
Canale, M. and Swain, M. (1980). Approaches to Communicative Competence. Singapore: Seameo Regional Language Center. Occasional Papers, 14, April.
Davies, A. (1990). Principles of Language Testing. Oxford, UK; Cambridge, Mass, USA: Blackwell Publishers.
Harrision, A. (1987, 1991). A Language Testing Handbook. Macmillan Publishers.
Heaton, J.B., (1990). Classroom Testing. Longman Group UK Limited.
Heaton, J.B., (1988). Writing English Language Tests. Longman Group UK Limited.
Henning, G. (1987). A Guide to Language Testing. Cambridge: Newbury House Publishers.
Hughes, A. (1989). Testing for Language Teachers. Cambridge: Cambridge University Press.
Littlewood, W. T. (1981). Communicative Language Teaching. Cambridge University Press.
McNamara, T.F. (2000). Language Testing. Oxford University Press.
Nunan, D. (1991). Language Teaching Methodology. UK: Prentice-Hall International.
Richards, J.C. and Rodgers, T. S. (1986). Approaches and Methods in Language Teaching. Cambridge University Press.
Viete, R. (1992). Running the Gauntlet: English Language Testing and Support for NESB Applicants to Post-primary Teacher Training Course in Victoria. Melbourne: Monash University: Unpublished M. Ed. thesis.
Weir, C.J. (1990). Communicative Language Testing. Prentice Hall International (UK) Ltd.
Weir, C.J. (1993). Understanding and Developing Language Tests. New York: Prentice Hall International.
Appendix 1
A sample text
Unit 10: Materials and properties
I-Look at these pictures and translate into Vietnamese
1. A man can easily lift a large roll of glass wool but not a concrete beam.
Glass wool is light but concrete is heavy.
2. A man can bend a rubber tile but not a concrete tile.
Rubber is flexible but concrete is rigid.
3. Wood can burn but concrete cannot burn.
Wood is combustible but concrete is non-combustible.
4. Water vapor can pass through stone but not through bitumen.
Stone is permeable but bitumen is impermeable.
5. You can see through glass but not through wood.
Glass is transparent but wood is opaque.
6. Stainless steel can resist corrosion but mild steel cannot.
Stainless steel is corrosion-resistant but mild steel is not corrosion-resistant.
7. Heat can be easily transferred through copper but not through wood.
Copper is a good conductor of heat but wood is a poor conductor of heat.
8. Rubber can be stretched or compressed and will then return to its original shape but clay cannot.
Rubber is elastic but clay is plastic.
9. Bitumen can be dented or scratched easily but glass cannot.
Bitumen is soft but glass is hard.
II-Look at these diagrams. Match the letters A-H in the diagrams with the sentences below:
[Diagrams A to H]
Now complete these sentences with properties:
a-The polythene membrane can prevent moisture from rising into the concrete floor. This means that polythene is ................
b-The T-shape aluminium section can resist chemical action. This means that aluminium is ................
c-The stone block cannot be lifted without using a crane. This means that stone is ................
d-The corrugated iron roof cannot prevent the sun from heating up the house. This means that iron is ................
e-Glass wool can help to keep a house warm in winter and cool in summer. This means that glass wool is ................
f-The ceramic tiles on the floor cannot be scratched easily by people walking on them. This means that ceramic tiles are ................
g-Asbestos sheeting can be used to fireproof doors. In other words asbestos is ................
h-Black cloth blinds can be used to keep the light out of a room. This means that cloth is ................
III-Look at the picture below and answer the questions
[Pictures a to f]
a-Why is glass used for window panes?
Because glass is ............................................................................................
b-Why is glass wool used to keep the heat in hot-water tanks?
Because glass wool has the property of ..........................................................
c-Why is some steel covered with a thin layer of zinc?
Because zinc is ..............................................................................................
d-Why are some fire doors covered with asbestos sheets?
Because asbestos is ........................................................................................
e-Why are some metal sheets formed into a corrugated shape?
Because the corrugated shape makes the sheet..............................................
f-Why is concrete used for the columns of a building structure?
Because ..........................................................................................................
Reading
1.Look at these diagrams and read the passage
Building materials are used in two basic ways. In the first way they are used to support the loads on the building and in the second way they are used to divide the space in a building. Building components are made from building materials and the form of a component is related to the way in which it is used. We can see how this works by considering three different types of construction:
In one kind of construction, blocks of materials such as brick, stone, or concrete are put together to form solid walls. These materials are heavy, however, they can support the structural loads because they have the property of high compressive strength. Walls made up of blocks both support the building and divide the space in the building.
In another type of construction, sheet materials are used to form walls which act as both space-dividers and structural support. Timber, concrete and some plastics can be made into large rigid sheets and fixed together to form a building. These buildings are lighter and faster to construct than buildings made up of blocks.
Rod materials, on the other hand, can be used for structural support but not for dividing spaces. Timber, steel, and concrete can be formed into rods and used as columns. Rod materials with high tensile and compressive strength can be fixed together to form frame structures. The spaces between the rods can be filled with light sheet materials which act as dividers but do not support structural loads.
2.Now say which paragraph discusses:
a-Planar construction
b-Frame construction
c-Mass construction
3.Complete this table by putting ticks (√) in the boxes to show the functions of the components:
Function of components
Form of material | Structural support only | Space dividing only | Both structural support and space dividing
Blocks | | |
Sheets | | |
Rods | | |
4.Now say whether these statements are true or false. Correct the false statements.
a-Rod materials can be used for both dividing space and supporting the building.
b-Concrete can be used as a block material, a sheet material and a rod material.
c-Steel is used for frame construction because it has high tensile strength and low compressive strength.
d-The sheet materials, which act as space dividers in a frame construction building, can be very light because they do not support structural loads.
e-Mass construction buildings are light whereas planar construction buildings are heavy.
Appendix 2
Detailed key for the test results
Question 1 (10 marks, 2 marks for each correct answer)
1. wheelbarrow, bottom dump bucket, dump truck, hoses and steel pipelines.
2. It depends on the quality of concrete to be placed, the equipment available and other factors.
3. They are made of timber or metal.
4. To support the wet material and allow it to be properly compacted.
5. To prevent the concrete from adhering to the forms.
Question 2 (10 marks, 2.5 marks for each correct answer)
6. Steel has the property of high tensile strength. This means it can resist high tensile forces.
7. Stone has the property of good sound insulation. This means it does not transmit sound easily.
8. Glass wool has the property of good thermal insulation. This means it does not transmit heat easily.
9. Brick has the property of high compressive strength. This means it can resist high compressive forces.
Question 3 (10 marks, 1 mark for each correct answer)
10. Concrete has high fire and weather resistance.
11. Precast concrete is durable and at low cost.
12. Concrete is made from different materials.
13. In solid slabs reinforcement is widely spaced.
14. Coarse aggregate ranges in size from 20 mm to 40 mm.
15. Concrete may segregate during conveying.
16. Asphalt mixture shall be spread by paver.
17. Rubber is elastic but clay is plastic.
18. Mixing concrete is done in a mixer.
19. Vibrators can be driven by electricity or compressed air.
Question 4 (10 marks, 1 mark for each correct answer)
20. F 25. F
21. F 26. T
22. F 27. F
23. T 28. T
24. F 29. T
Question 5 (10 marks, 2 marks for each correct answer)
30. minimum
31. multi-story
32. possible
33. timber
34. architecture