Teaching and Learning Forum 99

Anonymous peer review for classroom use: Results of a pilot project in a large science unit

Jennifer M. Robinson
School of Environmental Science
Murdoch University
It is expensive to give students enough writing practice or enough feedback on their writing. In 1998, I tried to address these problems with a peer review system in which both authors and reviewers were anonymous, and both papers and reviews were assessed.

Students were surveyed after the assessment was complete. They showed an overwhelming preference for anonymity, and strongly supported the statement that reviewing other students' work taught some useful lessons. Most students regarded the approach as fair, and the class was more or less evenly divided on whether the feedback received from peer review was more helpful than that received from conventional assessment.

The system roughly halved paid marking time, and assessors found it much less stressful than conventional marking. The biggest problems were the uneven quality of reviews, which meant that a large number of students got little useful feedback, and the paperwork burden, which consumed most of the time savings.

The system could be improved by increasing the number of reviews a student gives and gets, and by providing a trial run. An online system is being developed to facilitate these activities and manage the paperwork.


Introduction

In the English department at Harvard, my writing style was severely criticized and I was receiving grades of C or C+ on my papers. At eighteen, I was vain about my writing and felt it was Harvard, and not I, that was in error, so I decided to . . . experiment. The next assignment was a paper on Gulliver's Travels, and I remembered an essay by George Orwell that might fit. With some hesitation, I retyped Orwell's essay and submitted it as my own. I hesitated because if I were caught for plagiarism I would be expelled; but I was pretty sure that my instructor was not only wrong about writing styles, but poorly read as well. . . George Orwell got a B- at Harvard, which convinced me that the English department was too difficult for me.
(Crichton 1988, p. 4)

In university, assessing a paper usually means marking up errors in spelling, grammar and usage, jotting a few telegraphic comments ('good', 'unclear', 'wordy', 'reference', 'paragraph structure', 'avoid platitudes', '?', etc.) in the margins and adding a mark. It is common to mark up a few pages of the paper and leave the rest.

The system works poorly. A survey of assessment in Australian tertiary institutions found that the single most common student complaint was insufficient feedback on written work (Candy, Crebert, and O'Leary 1994). The malaise is justified: a recent survey of employer satisfaction with university graduate skills in Australia found that "up to half of graduates who are achieving adequate academic results, can be rejected by employers after testing for literacy and numeracy" (A C Neilson Research Services 1998, Sec. 5.3).

Providing fair assessment of and adequate feedback on writing requires time and skill. The problem tends to be worst in large classes. When hundreds of papers are dispersed to a group of markers (often postgraduates and casual employees), standards tend to become vague and assessment uneven.

An alternative

Journal submissions, proposals, and documents related to hiring, firing, and promotions are often assessed by multiple assessors using an anonymous peer review system. In theory, this provides both careful scrutiny of texts and safeguards against arbitrary judgements.

Might anonymous peer review, with staff doing the editing and students doing the reviews, work in a university setting? Numerous sources (Boud 1995; Boud and Holmes 1995; Campbell 1996; Haaga 1993; Horgan and Barnett 1991; Jackson 1996; Jenkinson 1988; Marcoulides and Simkin 1995; Sims 1989; Vatalaro 1990; Zariski 1996) report good results from self- and peer review, but I can find no reports of anonymous reviewers in a tertiary classroom. This paper reports on a first attempt.

Context

N213 is a large (150+ students), required second-year atmospheric science unit which has, for many years, included a paper on a topic of the student's choice. The normal problems of essay marking in large classes (marker burnout, disparities between markers, and insufficient resources to pay markers for a thorough job) were evident. There was talk of scrapping the essay altogether. Anonymous peer review was adopted as an alternative.

Preparation

To make the paper more amenable to peer assessment, its objective was rephrased: "Assume your reader is someone like yourself. Your goal is to teach her or him something he or she does not know about something related to atmospheric science, and to make the result a good read." Text was provided to explain how anonymous peer review works in the professional world, and to set up a game plan for approximating the peer review process in the classroom.

George Orwell's (1946) essay, Politics and the English language [1], and notes on common errors in student writing were added to the Study Guide in an attempt to reduce the incidence of mindless and illiterate writing habits. Notes on "How to assess quality" and text describing the attributes expected of papers marked HD, D, C, P and NP were inserted, along with threatening words on the necessity of handing papers in on time and sticking to the correct form (ie, include your student number but not your name).

Implementation

The plan was as follows: each student reviews two papers and does a self-review; in turn, each student's paper is reviewed by two other students. There were 20 possible points for the paper and 2.5 possible points for each review. The 'editor' (in practice a paid tutor) gave points based on the three reviews. If the three agreed, the mark was based on the consensus position. If the reviews disagreed, the editor set the mark based on reading all or parts of the paper, heeding the reviewers' comments as appropriate. In addition to setting marks on papers, the editor marked the reviews. Students could rewrite their essays based on reviewers' comments and proceed to a second round of reviews. The final mark was the average of the marks for the original paper and the rewrite.
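The marking rule described above can be sketched in a few lines of Python. This is an illustration only, not the procedure the tutors followed by hand. The paper does not say how close three reviews must be to count as agreeing, nor how a consensus position was computed, so the threshold AGREEMENT_SPREAD and the use of the mean are assumptions, as are the function names.

```python
from statistics import mean

PAPER_POINTS = 20.0   # from the text: 20 possible points for the paper
REVIEW_POINTS = 2.5   # from the text: 2.5 possible points for each review

# Assumption: the three reviews "agree" when their marks (in %) lie
# within this many percentage points of each other.
AGREEMENT_SPREAD = 10.0


def consensus_mark(review_marks, spread=AGREEMENT_SPREAD):
    """Return the mean of the three review marks if they agree, else None.

    None signals that the editor must read the paper and set the mark.
    Using the mean as the consensus position is an assumption.
    """
    if len(review_marks) != 3:
        raise ValueError("expected two peer reviews and one self-review")
    if max(review_marks) - min(review_marks) <= spread:
        return mean(review_marks)
    return None


def final_mark(original, rewrite=None):
    """Average the original and rewrite marks, as the text describes."""
    if rewrite is None:
        return original
    return (original + rewrite) / 2


print(consensus_mark([70, 75, 80]))  # reviews agree: consensus mark
print(consensus_mark([55, 70, 85]))  # reviews disagree: editor decides
print(final_mark(40, 70))            # original mark averaged with rewrite
```

A threshold of this kind makes the editor's workload explicit: the looser the definition of agreement, the fewer papers the editor must read in full.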

It took two people most of a day to sort papers by subject, to check that all essays were submitted in proper form, and to fix errors in submission (such as including names where anonymity was desired and failing to submit in triplicate). Once papers were grouped by subject, a simple pass-down procedure was used to associate authors with reviewers within each subject cluster. Thus students who wrote on cyclones typically received papers on cyclones or storms, and students who wrote on animal responses to weather typically received papers on animals, etc. Two papers for review, plus a sheet giving last-minute notes on review and three marking sheets, were then bundled into an envelope labeled with the name and student number of the reviewer.
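A pass-down procedure of this kind is easy to automate. The sketch below is one possible reading of it, not a record of how the sorting was done by hand: it assumes that, within each subject cluster, every paper is handed to the next two students in the list, wrapping around at the end. The function name assign_reviews and the data layout are hypothetical.

```python
from collections import defaultdict


def assign_reviews(papers, reviews_per_paper=2):
    """Map each reviewer's student number to the authors they review.

    papers: list of (student_number, subject) tuples.
    Assumption: within a subject cluster, each paper is passed down to
    the next `reviews_per_paper` students, wrapping around at the end.
    """
    clusters = defaultdict(list)
    for student, subject in papers:
        clusters[subject].append(student)

    assignments = defaultdict(list)
    for students in clusters.values():
        n = len(students)
        if n <= reviews_per_paper:
            # A cluster this small cannot supply distinct reviewers.
            raise ValueError("cluster too small for distinct reviewers")
        for i, author in enumerate(students):
            for step in range(1, reviews_per_paper + 1):
                reviewer = students[(i + step) % n]
                assignments[reviewer].append(author)
    return dict(assignments)


papers = [
    ("s1", "cyclones"), ("s2", "cyclones"), ("s3", "cyclones"),
    ("s4", "animals"), ("s5", "animals"), ("s6", "animals"),
]
for reviewer, authors in sorted(assign_reviews(papers).items()):
    print(reviewer, "reviews papers by", authors)
```

With this rotation every student gives two reviews and gets two, and nobody receives their own paper. Clusters with too few papers would need to be merged with a related subject first, much as cyclone papers were paired with storm papers.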

Incoming reviewed papers and self-assessment sheets were sorted by student number, which automatically re-united papers by the same author. Two tutors did the bulk of the marking. They started with the three review sheets (two peer reviews and one self-review) and then consulted the manuscript to resolve differences between reviewers. In most cases they made brief comments on one of the two reviewed manuscripts. The two sets of reviewers' marks were returned to the authors in the same envelope in which they received and handed in their papers for review. Students were encouraged to appeal, in writing, if they felt their marks were unfair. Students desiring to rewrite were given 10 days to do so, and were asked to include a description of how they had responded to reviewers' comments in their rewrites.

A unit survey was conducted shortly after students got their papers and marks back from the anonymous peer review. This included seven questions related to the peer review exercise. These are listed in full in the caption to Figure 1.

Results

Many students provided conscientious and thoughtful reviews. Some even went to the library to check facts and references. Most were gracious and positive, even about seriously deficient work. But some were frank, blunt, or strongly critical. In conversation, several students said they found the papers they were reviewing interesting. Others expressed shock at the quality of the writing. Eight students reported plagiarism or borderline plagiarism. Quite a few reviews lacked substance or were inappropriate for the paper they addressed.

The editors were struck by the inconsistencies. It was not unusual for a paper to receive a mark of 55% from one reviewer and a mark of 85% from the next. Some students gave top marks for work that contained numerous technical errors and did not show serious research effort (eg, reference list dominated by textbooks and encyclopedia entries). Self-reviews ranged from modest to brash. Occasionally, students criticised things that teachers reward: a few reviewers criticised the use of common scientific terms ('too much jargon'), and one marked down a paper that cited eight or so journal articles for having "too many references".

Because inconsistency was rampant, editors ended up reviewing most of the manuscripts and often overruled reviewers' opinions, sometimes writing things such as "ignore silly remarks by reviewer 2" on the paper. The eight students found to have committed plagiarism or borderline plagiarism were given marks of 40% and a note encouraging them to fix the problem and resubmit for a higher mark.

Editors' greatest output of 'red ink' went to issues of references and research. It was quite clear, in looking over marked papers, that a substantial fraction of our students avoid professional journals and other technical sources, and do not stick to the rules of referencing. Spelling, grammar, and usage errors and technical errors, such as failure to provide figure captions, also attracted much comment from reviewers and editors.

Ten students appealed the marks given by the first peer review. In two cases, appeals resulted in higher marks. The remaining eight got written explanations of why the mark given was deemed appropriate. Twenty students opted to rewrite, including all who received plagiarism warnings, and all of those passed the second time. On average, rewrites registered a gain in marks of 25%.

The process was complicated. Nonetheless, the editors found it faster and less stressful than conventional marking. They averaged 12 minutes per paper to review and mark reviews, decide on a grade, provide summary comments to the author and record the results. As much time was spent on paper shuffling as on marking. The total time required was close to that required in previous years using conventional marking.

Survey results

At the end of semester, students were surveyed about the class. The following questions were inserted with respect to the review (parentheses denote headers in Figure 1).

Figure 1: Results of questionnaire following peer review exercise. N=113. Undecided answers omitted from the basis of calculation. X-axis labels relate to questions listed in text.

As shown, almost all students favoured anonymity, and most students thought that adequate instructions were given, and felt that they had learned useful lessons from having to review other students' work. There was fairly strong disagreement with the statement: "The mark I got from peer review was unfair." Over a third of the class agreed that "The Orwell article changed the way that I look at my own writing." The class was more or less evenly divided on whether they hoped to encounter anonymous peer review again, and whether the feedback received from peer review was more helpful than that received from conventional assessment.
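The caption to Figure 1 notes that undecided answers were omitted from the basis of calculation. A minimal sketch of that calculation follows, assuming each answer is recorded as "agree", "disagree" or "undecided". The response labels, the function name and the sample answers are hypothetical and are not the survey data.

```python
def percent_agree(responses):
    """Percentage of decided respondents who agree.

    Undecided answers are dropped before the percentage is taken,
    so they affect neither the numerator nor the denominator.
    """
    decided = [r for r in responses if r != "undecided"]
    if not decided:
        raise ValueError("no decided answers")
    agreeing = sum(1 for r in decided if r == "agree")
    return 100.0 * agreeing / len(decided)


# Made-up answers to one question, for illustration only.
sample = ["agree"] * 3 + ["disagree"] + ["undecided"] * 2
print(percent_agree(sample))  # 3 of 4 decided answers agree
```

Because undecided answers are excluded, a question that drew many undecided answers rests on fewer respondents than the N=113 given in the caption.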

Conclusions

The outcomes of the exercise were complex. All students were exposed to anonymous peer review and got an opportunity to try on the shoes of the assessor, and all received two reviews of the same paper. Because reviews were uneven, this resulted in many combinations of thoughtful, articulate criticism, ill-founded criticism, praise with an insecure tone, cursory skimming, etc. Learning tends to be rapid at the bottom of the learning curve; thus even students who found the exercise uncomfortable were positioned to learn from the experience. Students react as individuals. Some liked the experience, and some did not. The administrative side of the exercise was smoother sailing than expected. Although anonymous peer review was more time consuming than conventional assessment in terms of paper handling, it roughly halved the professional time required for assessment, and at the same time reduced the stress associated with essay marking.

Discussion

The greatest shortcomings of the exercise were the uneven quality of reviews and the fact that the luck of the draw resulted in some students getting two unhelpful reviews. Its greatest strength is in its potential for formative learning and reduced costs in the marking of complex student work. A further bonus lies in the fact that the approach is new and highly amenable to improvement. Many of the areas in which it cries for improvement (eg, more reviews of each paper and running of a practice round of reviews before the final round) will be facilitated by constructing the system with a computer network at its front-end and a database at its back-end.

Endnotes

  1. A much-reprinted classic that stresses the relationship between clear thinking and good writing, and that analyses and mocks pretentious prose.

References

A C Neilson Research Services, Government and Social Research Team (1998). Research on Employer Satisfaction with Graduate Skills: Interim Report. Canberra: DEETYA, Evaluations and Investigations Programme, Higher Education Division.

Boud, David (1995). Enhancing Learning through Self Assessment. London: Kogan Page.

Boud, David, and Harvey Holmes (1995). Self and peer marking in a large technical subject. In Enhancing Learning through Self Assessment, edited by D. Boud. London: Kogan Page.

Campbell, Enid (1996). Case study 32: Research assignment. In Assessing Learning in Universities, edited by P. Nightingale, I. T. Wiata, S. Toohey, G. Ryan, C. Hughes and D. Magin. Sydney: Professional Development Ctr, UNSW.

Candy, P.C., G. Crebert, and J. O'Leary (1994). Developing Lifelong Learners through Undergraduate Education. Canberra: National Board of Employment, Education and Training.

Crichton, M. (1988). Travels. London: Pan Books.

Haaga, D.A.F. (1993). Peer review of term papers in graduate psychology courses. Teaching of Psychology, 20 (1):28-31.

Horgan, Dianne, and Loretta Barnett (1991). Peer review: It works. Paper read at Annual Meeting: American Educational Research Association, April 3-7, 1991, at Chicago.

Jackson, Michael (1996). Case study 53: Peer reading and self evaluation. In Assessing Learning in Universities, edited by P. Nightingale, I. T. Wiata, S. Toohey, G. Ryan, C. Hughes and D. Magin. Sydney: Professional Development Ctr, UNSW.

Jenkinson, Edward B. (1988). Learning to write / writing to learn. Phi Delta Kappan, 69(10):712-17.

Marcoulides, George A., and Mark G. Simkin (1995). The consistency of peer review in student writing projects. Journal of Education for Business, 70(4):220-223.

Orwell, G. (1946). Politics and the English language. Horizon, 76. Reprinted many times.

Sims, Gerald K. (1989). Peer review in the classroom: A teaching and grading tool. Journal of Agronomic Education, 18(2):105-108.

Vatalaro, Paul (1990). Putting students in charge of peer review. Journal of Teaching Writing, 9:21-29.

Zariski, A. (1996). Student peer assessment in tertiary education: Promise, perils and practice. In Abbott, J. and Willcoxson, L. (Eds), Teaching and Learning Within and Across Disciplines, p189-200. Proceedings of the 5th Annual Teaching Learning Forum, Murdoch University, February 1996. Perth: Murdoch University. http://cleo.murdoch.edu.au/asu/pubs/tlf/tlf96/zaris189.html

Please cite as: Robinson, J. M. (1999). Anonymous peer review for classroom use: Results of a pilot project in a large science unit. In K. Martin, N. Stanley and N. Davison (Eds), Teaching in the Disciplines/ Learning in Context, 348-353. Proceedings of the 8th Annual Teaching Learning Forum, The University of Western Australia, February 1999. Perth: UWA. http://lsn.curtin.edu.au/tlf/tlf1999/robinson-j.html


Last revision: 1 Mar 2002. The University of Western Australia
Previous URL 22 Jan 1999 to 1 Mar 2002 http://cleo.murdoch.edu.au/asu/pubs/tlf/tlf99/ns/robinson-j.html