Analysis of English Language Test Tasks for Fifth and Sixth Graders in Serbia According to Bloom’s Taxonomy

Critical thinking (CT) is a vital academic and life skill. Its development begins early in life, but it needs to be cultivated both during and after one’s education. In school, CT can be taught either within the domain of different subjects or as a separate skill. For it to be properly taught, CT needs to be assessed. With that in mind, this paper investigates whether or not English language teachers in Serbia incorporate tasks at different levels of cognitive capacity in their tests so as to monitor and improve their students’ domain-specific CT skills. The authors gathered 28 English language tests constructed by 14 teachers and classified the tasks according to the six levels of Bloom’s Taxonomy. The analysis revealed that the tests for the fifth grade include mostly tasks at the lowest level of the taxonomy, whereas those for the sixth grade are predominantly at the levels of understanding and application. Tasks requiring complex cognitive reasoning were shown to be rather scarce, which indicates that teachers do not assess students’ free and creative use of the foreign language, i.e., complex reasoning skills. It is advisable that English language teachers be trained in the very concept of CT and its successful teaching and testing principles.


Introduction
Critical thinking (CT) is an important skill both for academic success and for thriving in today's world. The globalized society and Information Age we live in have created a demand for people who are skilled at managing and manipulating large pools of information. This calls for people's ability to discern between important and unimportant content or true and false data, to synthesize information, evaluate the trustworthiness of sources, make tenable decisions, and even to create something new and unique out of what is available. Critical thinking has become an educational priority at all levels of education in many countries, including Serbia, and its rulebooks on the curricula for different elementary and secondary school grades rightfully list this ability as one of the goals of education. Even though the ability to think critically emerges, and begins to be cultivated, before formal education commences, school must have the function of further developing and honing this skill in its students. More often than not, however, teacher education programs do not evolve at the same pace at which the world changes, so many of them still do not presuppose equipping future teachers with the knowledge and skills that are necessary for teaching CT. When they are employed, novice teachers seldom have an opportunity to gain relevant knowledge of CT through professional development programs. Without being instructed in how to instill the needed CT skills in their students and monitor their development, English language teachers remain unprepared to test those skills properly and direct their further progress.
The aim of this paper is to investigate the English language teachers' practice of testing cognitive reasoning within the domain of their subject. More precisely, the aim is to discover whether those teachers incorporate tasks requiring different levels of cognitive capacity from their students in the tests they design. The results are expected to indicate the teachers' familiarity with the concept of CT, their awareness of the importance of testing it, their knowledge of the test design and, finally, their students' foreign language ability.

Fostering critical thinking in a school setting
The educational objectives of any school system should comprise the development of students' affective, psychomotor, and cognitive domains (Bloom et al., 1956) in a stepwise fashion, progressing from simple to more complex behaviors. The affective domain, as described by Bloom et al. (1956: 7), includes such objectives that "describe changes in interest, attitudes, and values, and the development of appreciations and adequate adjustments." The authors admit that the objectives related to this domain were difficult for them to define, let alone for teachers to achieve. The psychomotor domain relates to the manipulative and motor-skill area, while the cognitive domain refers to the development of intellectual abilities and skills. More often than not, an educational system neglects the achievement of at least one of these objectives, yet all are of vital importance for ensuring sound, comprehensive education that has positive, long-term benefits for its students.
It is the cognitive domain that is the centerpiece of this paper. Speaking from the perspective of education, it presupposes different mental behaviors or cognitive processes that students perform when and after learning either to understand or memorize the content or to utilize the acquired knowledge in different situations and for various purposes. These behaviors encompass simple intellectual actions, such as: knowing, or memorizing, things, facts, rules, paradigms, etc.; understanding, or being able to transform, interpret, paraphrase, etc.; and applying, or putting into use in novel situations something a person has learned. More complex cognitive processes (analysis, synthesis, and evaluation) comprise critical thinking. Each higher level is built on a solid basis of all the preceding levels of cognitive reasoning and reflects one's independent thought. Analysis, for example, relates to one's ability to break a whole into its constituent parts for the purpose of examining them. Synthesis is the ability to utilize the knowledge one has gathered to create something new, while evaluation is reflected in making purposeful judgments and presenting them.
As mentioned earlier, CT is needed both for academic success and for thriving in today's world. Given this importance, it remains an unresolved issue whether it needs to be taught as a separate school subject or within different subjects as a content-specific skill (Morais et al., 2019: 224). In this paper, CT is analyzed as a content-specific skill viewed within the confines of the English language classroom. As such, it can be utilized to instigate the learning of the foreign language and prompt its independent and creative use, which is the ultimate goal of language learning and, at the same time, represents the highest levels of Bloom's Taxonomy. Furthermore, CT ensures one's personal and professional success as it requires an individual's ability to approach information critically and to manipulate it successfully, independently, and creatively. Such an ability needs to be accompanied by certain dispositions or habits of mind (Facione, 1990), including inquisitiveness, fair-mindedness, flexibility, etc. The same ability also demands a number of personal traits, exemplified by tolerance, systematicity, social activism, responsibility, etc. (Mirkov & Stokanić, 2015: 26). This clearly indicates that as one develops the ability to think critically, one also develops as a person.
Foreign language learning lends itself well to teaching and improving cognitive reasoning. It is organized in a stepwise fashion and typically begins with the acquisition of isolated words, phrases, rules, and paradigms (knowledge), based on which a learner can understand another person's speech or writing (understanding), and only then be able to put a few memorized words or phrases and rules into practice (application). With the acquisition of knowledge, the learner becomes aware of differences between various linguistic options and their functions in different contexts (analysis), becomes capable of producing unique communication (synthesis) (Bloom et al., 1956: 163, 169), which is the ultimate goal of foreign language learning, and develops the ability to perform different evaluations in accordance with either external or internal criteria or standards (evaluation). Moreover, CT is commonly associated with creative, analytic, and heuristic thinking, as well as with problem solving (Wattles, 2016: 6; Mirkov & Stokanić, 2015: 26). Not only is the thinking protocol teachable at the macro level (when considering the general process of foreign language learning), but it is also applicable in everyday classroom situations. For instance, when teaching grammar or vocabulary, the teacher may prompt different levels of cognitive capacity of his/her students depending on the activity assigned. A case in point is a vocabulary exercise given in the form of a story from which some words have been omitted. For each of the gaps the student is offered a few possible solutions and he/she needs to select the most appropriate one. That activity is exemplary of the stage of understanding as students display the ability to comprehend the story and complete it by selecting appropriate words. The same activity can be done in such a way that, instead of being offered possible answers, students need to provide their own solutions to complete the story. Such an activity is typical of the stage of application since students are required to use all their relevant linguistic knowledge acquired up to that moment and apply it in novel situations. Moreover, in a foreign language classroom students meet different cultures and lifestyles and are hence given a chance to break possible stereotypes and become tolerant of differences, to accept various opinions, etc., all of which contribute to the development of important personal traits and dispositions that pave the way to successful CT.
As suggested by Glušac, Pilipović, and Marčićev (2019: 39), developing different levels of cognitive processing in the foreign language classroom is beneficial for a number of reasons: "It leads to the gradual acquisition of knowledge, which is more easily subsumed into the existing knowledge base; it is retained far longer than material learned through rote learning; it increases the general critical thinking capacity of students as they can transfer the critical thinking pattern to other domains; it can boost students' motivation as they are active participants and their opinions are valued; it provides better chances for the application of the acquired knowledge; it resembles real-life situations and thus equips students with those abilities and skills they will need in their everyday living." The need to foster different levels of cognitive processing and, finally, improve students' capacities to think critically has been recognized by teachers in Serbia and abroad alike. In a comprehensive study conducted in Serbia that included 1,441 primary school teachers (Mirkov & Stokanić, 2015), the teachers were found to be aware of the need to promote students' CT and to be willing to do it. However, when their attitudes towards teaching CT were correlated with their actual classroom activities, it became evident that they did not implement activities that promoted CT as much as they believed they should have. Regardless of teachers' readiness to teach CT, in a study reported by Mirkov and Gutvajn (2014), 856 eighth graders from Serbia expressed their dissatisfaction with opportunities to foster their CT skills in school. They reported a lack of opportunities to ask questions, participate in discussions or express their opinion. Similar results were obtained in a Portuguese study (Morais et al., 2019) in which university teachers expressed their willingness to promote CT within their own courses. However, the findings revealed that the teachers did not possess a complete understanding of the CT concept, though they did strive to teach it using a variety of activities and learning materials. Also, the study showed that teachers encountered a number of obstacles, ranging from organizational (lack of time, group sizes, etc.) to institutional (lack of institutional culture and agreement on core principles/terms). Viewed solely within the context of English language teaching, a study conducted by Glušac and Pilipović (2016) showed that primary and secondary school teachers in Serbia attempt to improve their students' CT by engaging them in Socratic questioning, a teaching/learning technique that requires students to investigate the nature and rationale of their thinking. The authors emphasize that the technique is beneficial in that "students are active participants in the teaching/learning process, as well as that they are responsible for constructing their own knowledge" (Glušac & Pilipović, 2016: 412). However, even though Socratic questioning is applied at the primary and secondary level alike, its function remains doubtful and it is evident that some types of questions are used more than others (Glušac & Pilipović, 2016: 413). It seems then that familiarizing teachers with the notion of CT and its teaching principles should be a global necessity, so as to maximize its teaching potential. Needless to say, institutional support and adequate resources are crucial as well. Even more so, teachers need to be shown how to conduct assessment and learn whether they instigate CT, as the results of such investigations would point to areas that require improvement in terms of teaching and learning alike.

Participants
For the purpose of this research, in 2017 the researchers contacted English language teachers across Vojvodina and asked them to share their self-created tests with the researchers. Fourteen teachers from 10 towns consented to share their tests for assessing their students' knowledge of English. Altogether, 14 tests were gathered for each of the two grades analyzed in this paper. The participant teachers were aged 30-45, had between 2 and 23 years of teaching experience, and taught both in the fifth and the sixth grade.

Procedure
Upon receiving the tests, the researchers took on the arduous task of individually determining the level of each task in all the tests according to Bloom's Taxonomy, thus ensuring researcher triangulation. When classifying the tasks according to the level of cognitive capacity required from the student for completing the tasks, the researchers closely followed the definitions and examples of the six levels of cognitive processing put forward by Bloom et al. (1956: 62-197), as well as the guidance for the classification of test tasks proposed by Bloom et al. (1956: 45-59). Moreover, the following factors were considered in the course of classification: number of years of students' learning of the foreign language, age, task instructions, learning context, and prescribed learning objectives.
When analyzing the tasks and assigning corresponding levels of the taxonomy, the researchers carefully studied the instructions for the tasks, assuming that the content had previously been covered in class. An example task for the sixth grade reads as follows: Correct the mistakes in the following sentences: 1. Last night, Samantha have pizza for supper. 2. My pet lizard was died last month. 3. Yesterday, I spend two hours cleaning my living room. 4. This morning before coming to class, Jack eats two bowls of cereal. 5. What was happened to your leg?
Since the instruction does not specify the type of mistakes students should look for, in executing this task they need to use all their linguistic knowledge gathered up to that point to analyze the sentences and recognize which of their parts contain a mistake so that they may correct it. Hence, we classified this task as analysis. If the instruction specified that students should correct mistakes related to the verb, it would fall in the category of application, since students would need to apply all their language knowledge to conclude what was wrong with the verb form. On the other hand, if the instruction specified that the mistakes were related to the Past Simple Tense, the task would be classified as a knowledge task, as a large part of the answer would be made obvious to them.
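The reasoning in this example can be written down as a small lookup, given here only as an illustrative sketch in Python. The scope labels ("unspecified", "word_class", "specific_form") and the function name are hypothetical conveniences introduced for this illustration; the researchers classified the tasks by hand, taking further factors into account.

```python
from enum import Enum


class Level(Enum):
    """The six levels of Bloom's Taxonomy, from simplest to most complex."""
    KNOWLEDGE = 1
    UNDERSTANDING = 2
    APPLICATION = 3
    ANALYSIS = 4
    SYNTHESIS = 5
    EVALUATION = 6


# Hypothetical labels for how much the instruction reveals about the mistakes.
# They encode only the three cases discussed for the error-correction task.
ERROR_CORRECTION_RULES = {
    "unspecified": Level.ANALYSIS,       # "Correct the mistakes"
    "word_class": Level.APPLICATION,     # "Correct the mistakes related to the verb"
    "specific_form": Level.KNOWLEDGE,    # "Correct the mistakes in the Past Simple Tense"
}


def classify_error_correction(instruction_scope: str) -> Level:
    """Return the taxonomy level for an error-correction task,
    given how narrowly its instruction points to the mistake."""
    return ERROR_CORRECTION_RULES[instruction_scope]


print(classify_error_correction("unspecified").name)
```

The point of the sketch is that the same sentences can sit at three different levels of the taxonomy depending solely on the wording of the instruction.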
Besides instructions, the researchers also took into consideration the age of the learners when performing task classification. Depending on the syllabi and prescribed learning objectives for different grades, tasks can be classified differently. For instance, a task in the fifth grade asking students to write dates in words was classified as application, since this required them to use a completely new rule of saying dates and apply it in novel situations. If this task were given to older students, already well acquainted with pronouncing and writing numbers and dates, it would be classified as knowledge.
The analysis of tasks also revealed that the teachers designed certain tasks for whose execution students needed to perform at two different levels of cognitive reasoning. Such tasks will be listed as a separate category.
When classification was completed, the researchers compared their ratings. In situations where discrepancies emerged, the researchers consulted the definitions and examples again to determine classification together. Once the researchers agreed on the classification of all the tasks contained in the 28 analyzed tests, they simply counted the number of tasks included in each of the six categories of cognitive processing of Bloom's Taxonomy for each school grade.
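The comparison and counting steps described above can be sketched as follows. This is a minimal illustration in Python under stated assumptions: two raters are used for simplicity, and the task identifiers, ratings, and reconciled outcome are invented for the example rather than taken from the study.

```python
from collections import Counter

LEVELS = ["knowledge", "understanding", "application",
          "analysis", "synthesis", "evaluation"]


def find_discrepancies(rater_a: dict, rater_b: dict) -> list:
    """Return the ids of tasks that the two raters classified differently."""
    return sorted(task for task in rater_a if rater_a[task] != rater_b.get(task))


def tally(final_ratings: dict) -> dict:
    """Count how many tasks fall into each of the six levels."""
    counts = Counter(final_ratings.values())
    return {level: counts.get(level, 0) for level in LEVELS}


# Hypothetical independent ratings of three tasks.
rater_a = {"t1": "knowledge", "t2": "application", "t3": "analysis"}
rater_b = {"t1": "knowledge", "t2": "understanding", "t3": "analysis"}

# Tasks that need a joint re-reading of the definitions and examples.
to_discuss = find_discrepancies(rater_a, rater_b)

# Hypothetical outcome of that discussion, followed by the simple count.
final = dict(rater_a)
final["t2"] = "application"
counts = tally(final)
print(to_discuss, counts)
```

Keeping the discrepancy list separate from the final tally mirrors the order of the procedure: individual rating first, joint resolution second, counting last.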

Results
Altogether, 14 tests comprising 59 one-level and 7 two-level tasks for the fifth grade were analyzed. The results presented in Table 1 reveal that the prevailing level of cognitive processing at which the tasks operated in this grade was knowledge (29 tasks), followed by understanding (16 tasks), and application (13 tasks). There was only one task requiring higher-order thinking (synthesis). Also, the results show that there were seven tasks whose execution required two levels of cognitive processing.

Table 1 (fifth grade)

| Level | Number of tasks | Example tasks |
| --- | --- | --- |
| Knowledge | 29 | Match the words on the left with the words from the box. Make adverbs out of these adjectives. Write the Past Simple for these verbs. |
| Understanding | 16 | Circle the correct option in each sentence. Write SOME and ANY to complete the sentences. Write the words in the right order to get sentences. |
| Application | 13 | Complete the sentence by putting the adjectives in brackets in the comparative or superlative form. Make questions with the words given. Write the following dates in words. |
| Analysis | 0 | |
| Synthesis | 1 | Describe the interior of your home. |
| Evaluation | 0 | |
| Two-level tasks | 7 | Look at the picture and complete the words that indicate items of furniture (knowledge). Then write a few sentences to describe where those items are (application). Fill the gaps with the appropriate forms of the verb TO BE (knowledge) and then make those sentences negative and interrogative (application). Write questions with the words given (application) and their short answers (knowledge). |
| Total | 66 | |

When the results pertaining to individual teacher tests were analyzed (Table 2), it was obvious that they all included the lowest level tasks, while the majority involved tasks at the subsequent two levels. Higher-order thinking skills had been left out completely, with the exception of test 13, which contained a task classified as synthesis. Moreover, the results reveal that most tests contained several tasks at the first two levels of the taxonomy, whereas in those tests containing tasks that required the operation of application there was typically only one such task per test (with the exception of tests 6, 12, and 14). Also, there were quite a few tasks requiring the use of cognitive reasoning at two levels, most commonly at the knowledge and application levels.
The analyzed tests for the sixth grade show a somewhat different picture (Table 3). Out of 66 tasks included in the analyzed tests, most were either classified as understanding (23 tasks) or application (21 tasks), followed by those at the first level (knowledge) of Bloom's Taxonomy (17 tasks). These tests also included five tasks at the higher levels of cognitive processing (analysis: 3 tasks, synthesis: 1 task, and evaluation: 1 task). The tests for this grade did not include tasks whose performance required the use of two different levels of reasoning.

Table 3 (sixth grade)

| Level | Number of tasks | Example tasks |
| --- | --- | --- |
| Knowledge | 17 | |
| Understanding | 23 | Match the expressions with the pictures. Put the words in the correct order to make sentences. Complete the dialogue with the words offered. |
| Application | 21 | Write advice for the following situations using SHOULD and SHOULDN'T. Complete the sentences with the passive voice in a suitable tense. Kim did a lot of things yesterday morning. Write a sentence for each picture. |
| Analysis | 3 | Complete the questions and answers. The verbs in the sentences can be in different tenses. Study the following pairs of sentences and decide which one is grammatically correct. Correct the mistakes in the following sentences. |
| Synthesis | 1 | Make true sentences about you using the following verbs and ideas. |
| Evaluation | 1 | Write down a thing you are not allowed to do and a thing you can do. |
| Two-level tasks | 0 | |
| Total | 66 | |

When the distribution of tasks at different levels of Bloom's Taxonomy is analyzed from the perspective of individual teacher tests (Table 4), it can be noticed that only three teachers did not include tasks at the lowest level of cognitive complexity in their tests (see tests 2, 9, and 14). Also, it is evident from the results that the tasks requiring complex cognitive processing (analysis, synthesis, and evaluation) were few and far between. As is evident from the table, most teachers included tasks at the first three levels of the Taxonomy (e.g., see tests 1, 3, 5, etc.).
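To put the reported counts for the two grades side by side, the share of higher-order tasks can be worked out with a few lines of Python. This is only an illustrative calculation based on the task counts reported for each grade; the percentages it prints are derived here and are not figures stated by the authors.

```python
# Task counts per level as reported for the fifth and sixth grade.
fifth_grade = {"knowledge": 29, "understanding": 16, "application": 13,
               "analysis": 0, "synthesis": 1, "evaluation": 0, "two-level": 7}
sixth_grade = {"knowledge": 17, "understanding": 23, "application": 21,
               "analysis": 3, "synthesis": 1, "evaluation": 1, "two-level": 0}

# The three levels that comprise critical thinking.
HIGHER_ORDER = ("analysis", "synthesis", "evaluation")


def higher_order_share(counts: dict) -> tuple:
    """Return (higher-order tasks, all tasks, percentage rounded to one decimal)."""
    total = sum(counts.values())
    higher = sum(counts[level] for level in HIGHER_ORDER)
    return higher, total, round(100 * higher / total, 1)


for grade, counts in (("fifth", fifth_grade), ("sixth", sixth_grade)):
    higher, total, percent = higher_order_share(counts)
    print(f"{grade} grade: {higher} of {total} tasks ({percent}%) are higher-order")
```

The two-level tasks in the fifth grade combine knowledge and application, so they are counted in the total but not among the higher-order tasks.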

Discussion
Generally speaking, the presented results are quite unsettling since the vast majority of the tasks included in the analyzed English language tests do not fulfill the scope of the three levels comprising CT. In other words, the analyzed tests do not help students improve their domain-relevant CT skills, which implies that they are not given a chance to use the acquired language freely and creatively, but are only asked to reproduce it. In Table 2, for example, only one task (see test 13) from the fifth-grade tests is at a higher level of the taxonomy, which would require students to use the language for self-expression. In the sixth grade, the picture is only slightly better, as is evident in Table 4 (see tests 2, 6, 11, 12, and 13), where five tasks are shown that would prompt an individual to creatively use his/her gathered knowledge. All the other tasks for both grades require only the application of lower-order thinking skills.
Along the same lines, it is further unsettling that the tasks included in the tests for the fifth grade belong to a great degree to the lowest level of the taxonomy, asking students simply to remember/recall/regurgitate stored information. Even though the authors do acknowledge that cognitive reasoning is cumulative in nature (Bloom et al., 1956: 18), i.e., that for the performance of cognitive activities at all levels of complexity the person needs to know the rules, definitions, and paradigms, and that there are justifications for the teaching of knowledge, as pointed out by Bloom et al. (1956: 32-36), the teaching and testing of a foreign language should not be solely based on separate language items. Students need to be exposed to a variety of situations in which they would use the acquired knowledge for communicative purposes. In the analyzed tests there was found only one such task (see Table 1, level of synthesis) asking students to describe the interior of their home. However, the analysis of the tasks for the same grade reveals that teachers do combine two levels of cognitive reasoning in certain tasks (see Table 1, two-level tasks), most commonly the knowledge and application levels. On the one hand, such tasks are useful for both students and teachers as they require the application of knowledge students have previously shown they possess. On the other hand, such tasks might not be in concert with the recommended test construction practice. Namely, when discussing multiple choice constructions, Dimitrijević (1999: 95) warns against those questions whose execution directly impacts the execution of subsequent tasks. The same warning might apply to other test techniques as well since, if students make a mistake or fail to do one test item, they inevitably fail to do the following one(s).
The teachers' insistence on declarative knowledge in the analyzed tests is also contrary to what is prescribed by the Rulebook on the Syllabus for the Second Cycle of Primary School Education and Curriculum for the Fifth Grade of Primary School. This document clearly indicates that students need to possess both receptive and productive types of language knowledge and to be able to communicate both in written and oral form. However, the analyzed tests show a clear inclination towards receptive knowledge despite the fact that the students for whom the tests had been designed had been learning English for at least 4 or 4.5 years at the moment of testing and, supposedly, possessed enough language knowledge to be able to use it freely and creatively, at least to some extent.
Since the analyzed tests for the two grades were constructed by the same teachers, the analysis of the results of the levels of cognitive capacity required in English language tests presented in Tables 1 and 3 reveals that the participant teachers implement tasks at different levels of the taxonomy for the two grades. More precisely, the majority of the tasks found in the tests for sixth graders require understanding and application, whereas for the fifth grade the tasks were shown to operate at the first two levels. Such a finding is encouraging as it indicates the teachers' awareness of increased cognitive capacities of their older students. Also, the fact that a greater number of tasks fall within the scope of understanding is aligned with the claim of Bloom and his associates (1956: 89) and Wattles (2016: 159) that understanding is the most prevalent intellectual level both in school and college. On the other hand, the same results for the sixth grade are discouraging since only 5 out of 66 tasks in the analyzed tests are at levels which presume the free and creative use of the language. If the test design applied in the analyzed tests is indeed a mirror reflection of the teachers' general approach to testing, then this finding most probably indicates that the participant teachers employ such teaching and testing techniques that focus almost exclusively on separate items of the language system, rather than integrating those individual items into some form of cohesive whole. Such a practice is then contrary to what is prescribed by the Rulebook on the Syllabus for the Second Cycle of Primary School Education and Curriculum for the Sixth Grade of Primary School, which clearly emphasizes students' use of the language and prescribes that operative tasks should be more complex than for the previous grade.
Moreover, the analysis of the results of individual teacher tests presented in Table 4 shows that the teachers most commonly combine tasks on the second and the third level of the taxonomy and that they sometimes also include tasks that require remembering (knowledge). In instances where the tasks operating at higher levels are included (see Table 4, tests 2, 6, 11, 12, and 13), they are always combined with understanding and application tasks (except for test 6) and there is always only one such higher-order thinking task per test.
When these results are compared with those obtained by Glušac, Pilipović, and Marčićev (2019), who investigated the levels of cognitive capacity required in English language tests for seventh and eighth graders, it is obvious that there is a tendency among English language teachers in Serbia to design tests that are comprised predominantly of low-level thinking tasks. In the analyzed tests for the seventh and the eighth grade there was also a paucity of tasks requiring higher levels of cognitive operation. In both of those grades the tasks at the second level of the taxonomy were most dominant, closely followed by those at the level of knowledge (eighth grade), or equally by those operating at the level of knowledge and application (seventh grade). When compared to the findings presented in this paper, it is evident that the situation is slightly better in the tests for the sixth grade, in which the majority of the tasks are at the levels of understanding and application, whereas it is least favorable in the fifth grade, in which the majority of the tasks are at the first level of the taxonomy. All in all, what becomes evident from these examinations is that instead of an increase in levels of cognitive complexity with age there appears to be a rather random selection of levels, which is not aligned with the students' cognitive maturity or linguistic proficiency. As pointed out by Glušac, Pilipović, and Marčićev (2019), this might be a result of English language teachers' unfamiliarity with the concept of CT and a predominantly structuralist approach to language teaching and testing. Further investigation into the origin of this situation would be beneficial and could reveal whether or not such results might also be attributed to a mismatch between the teachers' teaching and testing practice, which they might be unaware of and which, as pointed out by Anderson et al. (2001), could be detrimental to successful test performance.

Conclusion
The results of the research presented in this paper show that English language tests in Serbia at the fifth and sixth grade levels mainly include tasks at the three lowest levels of Bloom's Taxonomy, which do not call upon or develop CT skills. This implies that the predominant testing approach in the analyzed tests is structuralist, favoring discrete-point testing, instead of integrative, which presupposes the communicative function of the language. However, the gathered tests analyzed for the purposes of this paper may only constitute one measuring instrument contained in a battery of tests assessing different types of knowledge and skills. Hence, we must not jump to the conclusion that the participant teachers never require, or offer opportunities to, their students to use the language for communicative purposes. Still, these results, as well as those obtained by Glušac, Pilipović, and Marčićev (2019) relating to a nearly identical analysis of the subsequent two grades, clearly indicate that there is a tendency among English language teachers in Serbia towards a structuralist approach. The results necessitate familiarizing Serbian English language teachers with the notion, teachability, and testing principles of CT, as well as informing them about the benefits and pitfalls of the predominant testing approach they seem to have adopted in order to ensure quality teaching of CT and quality foreign language testing.