Ratings Myths and Research Evidence
Myths about student ratings are more widely accepted than the evidence.
Misconceptions about student ratings of instruction are so widespread that
the myths associated with the practice have, for the most part, outstripped
the research knowledge we have on the subject.
What follows is an attempt to answer some of the commonly asked questions
about student ratings.
Are students qualified to rate their instructors?
The myth says "No," but, generally speaking, the research says "Yes."
Students spend a full term in the course. They observe the instructor in
class and in interactions with other students and, importantly, students can
judge how much they've learned in a course.
Students can report on the frequencies of teacher behaviors, the amount of
work required, the difficulty of the material, the quality of lectures, the
value of readings and assignments, the clarity of the instructor's
explanations, the instructor's availability and helpfulness, and many other
aspects of the teaching and learning process.
Peer and administrator visits to the class usually occur once or twice per
term. Such visits are valuable, but they can't represent the range of events
upon which students base their opinions.
But this exclusive knowledge base doesn't mean students are qualified to
report on all issues. For example, beginning students do not have sufficient
depth of understanding to accurately rate the instructor's knowledge of the
subject. They might estimate knowledge based on the instructor's ability to
respond to questions, but this estimate is probably less valuable than a
colleague's rating if the purpose is to assess the depth and breadth of the
instructor's knowledge.
Students are qualified to express their satisfaction or dissatisfaction
with their own experience in a course. Their opinions on these matters are
not direct measures of the performance of the teacher, but they are
legitimate indicators of student satisfaction. And there is a substantial
research base linking student satisfaction to effective teaching.
But it should be clear that if an evaluation of teaching is to produce
useful data, the student rating instrument must ask questions students can
legitimately answer. And if the purpose of evaluation is to make decisions
about teaching, those making such decisions need sources of data in addition
to the information that comes from the students.
Are student ratings based solely on popularity?
The myth implies that a popular teacher is not a good teacher. But there is
no basis for this argument and no research to substantiate it. On the other
hand, there's plenty of research evidence showing the overall validity of
student ratings.
Implied in the popularity-equals-good ratings myth is the idea that
learning should somehow be unpleasant. The "popularity" statement
is usually accompanied by an anecdote noting that "The best teachers I
had were the ones I hated the most."
The assumption that popularity means a lack of substance or knowledge or
challenge has no evidence to back it up. Nor is there any evidence that
students learn more from the "feared and hated" faculty members.
Are ratings related to learning?
The evidence indicates consistently high correlations between student
ratings of the "amount learned" in a course and their overall ratings of
teacher and course.
Even more telling, in studies in multi-section courses that use a common
final exam, the students who gave the highest ratings to their instructors
were the ones who performed best on their exams. Quite simply, those who
learned more gave their teachers higher ratings.
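The multi-section comparison can be sketched as a simple correlation between
each section's mean rating and its mean score on the common final exam. The
numbers below are invented purely for illustration and come from no study;
the function and variable names are likewise hypothetical.

```python
import math

# Hypothetical data: one entry per section of the same course, all sections
# taking the same final exam. These values are invented for illustration.
section_mean_rating = [3.1, 3.6, 3.9, 4.2, 4.5]    # 1-5 rating scale
section_mean_exam = [68.0, 71.5, 74.0, 73.0, 80.5]  # percent correct


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(section_mean_rating, section_mean_exam)
print(f"rating/exam correlation across sections: {r:.2f}")
```

A coefficient near +1 would mean that sections with higher ratings also had
higher exam scores. A real analysis would need many more sections than this
toy example uses.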
Can students make accurate judgements while still in school?
The myth says that students can discern real quality only after years of
experience in the workforce. No research supports this claim.
But there are several studies comparing ratings in class to ratings by the
same students in the next term, the next year, immediately after graduation,
and several years later.
All these studies report the same result: student opinions change very
little over time. Teachers rated highly in class are rated highly later on,
and those with poor ratings in class continue to get poor ratings later on.
Are student ratings reliable?
This question is more technical than the others, since "reliability" is a
psychometric term and can be measured. The myth says "No." The research
says "Yes."
Whether reliability is measured within classes, across classes, over time,
or in other ways, student ratings are remarkably consistent by all accepted
standards of statistical measurement.
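One way to see why class averages can be consistent even when individual
students disagree is the Spearman-Brown formula from classical test theory,
which estimates the reliability of an average of k raters from the
reliability of a single rater. The sketch below is an illustration only:
the single-rater value of 0.20 is an assumed figure, not a research finding,
and the function name is hypothetical.

```python
def spearman_brown(single_rater_reliability, n_raters):
    """Estimate the reliability of the mean of n_raters ratings.

    Uses the Spearman-Brown formula: k*r / (1 + (k - 1)*r).
    """
    if n_raters < 1:
        raise ValueError("n_raters must be at least 1")
    r = single_rater_reliability
    return n_raters * r / (1 + (n_raters - 1) * r)


# Assumed single-rater reliability, chosen only to illustrate the trend.
ASSUMED_SINGLE_RATER_RELIABILITY = 0.20

for class_size in (5, 10, 25, 50):
    estimate = spearman_brown(ASSUMED_SINGLE_RATER_RELIABILITY, class_size)
    print(f"{class_size:>3} raters -> estimated reliability {estimate:.2f}")
```

Under that assumption, the estimate rises steadily with the number of
students responding, which is one reason ratings from very small classes
call for more caution.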
Does gender make a difference in how students rate their instructors?
Reviews of gender studies conclude there is no strong or regular pattern of
gender-based bias in student ratings. That is, students do not favor
instructors on the basis of gender alone.
But studies do suggest there is gender bias in other aspects of higher
education, and there are implications for ratings practice in these studies.
For example, one study found that female instructors in one department
were largely assigned entry-level, required, large-enrollment courses while
males disproportionately taught upper-level and graduate seminars.
Since the research suggests ratings in the first group of courses will
usually be lower, the disproportion of course assignments put the female
instructors at risk of lower student ratings.
If those interpreting student ratings results simply average scores by
gender, without taking into account the imbalance in course assignments,
women will have lower scores.
These lower scores would constitute an unfair evaluation of female
faculty. The scores would reflect the differences in teaching situations,
not that female instructors are less competent and not that students are
biased against them.
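The effect of unequal course assignments can be sketched with a small
made-up data set. In the example below, ratings depend only on course type,
yet a simple average by gender shows a gap because the two groups teach
different mixes of courses. All records, labels, and function names are
hypothetical and chosen only to illustrate the arithmetic.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (instructor_gender, course_type, mean_course_rating).
# Ratings here differ by course type only, never by gender within a type.
records = [
    ("F", "intro_required", 3.6),
    ("F", "intro_required", 3.7),
    ("F", "intro_required", 3.5),
    ("F", "upper_seminar", 4.4),
    ("M", "intro_required", 3.6),
    ("M", "upper_seminar", 4.4),
    ("M", "upper_seminar", 4.3),
    ("M", "upper_seminar", 4.5),
]


def pooled_means(rows):
    """Average ratings by gender, ignoring what kind of course was taught."""
    by_gender = defaultdict(list)
    for gender, _course_type, rating in rows:
        by_gender[gender].append(rating)
    return {gender: mean(values) for gender, values in by_gender.items()}


def stratified_means(rows):
    """Average ratings by gender separately within each course type."""
    by_cell = defaultdict(list)
    for gender, course_type, rating in rows:
        by_cell[(course_type, gender)].append(rating)
    return {cell: mean(values) for cell, values in by_cell.items()}


pooled = pooled_means(records)
stratified = stratified_means(records)

print("Pooled by gender:", {g: round(v, 2) for g, v in pooled.items()})
for (course_type, gender), value in sorted(stratified.items()):
    print(f"{course_type:>15} {gender}: {value:.2f}")
```

The pooled averages differ by gender, while the averages within each course
type are the same. Comparing like with like removes a gap that comes from
the teaching situation, not from the instructors.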
Are ratings affected by situational variables?
The research says that ratings are robust and not greatly affected by such
variables. But we must keep in mind that generalizations are not absolute
statements.
For example, we know that required, large-enrollment, out-of-major courses
in the physical sciences get lower average ratings than elective,
upper-level, in-major courses in literally all disciplines.
Does this mean that teaching quality varies? Not necessarily. What it does
show is that effective teaching and learning may be harder to achieve under
certain sets of conditions.
Comparisons of faculty teaching based on student ratings should use
sufficient amounts of data from similar situations. It would be grossly
unfair to compare the ratings of someone teaching a graduate seminar with 10
students to the ratings of someone teaching an entry-level, required course
with an enrollment of 200.
Common sense, research, and ethical practice all demand correct
interpretation and use of evaluation data.
Do students rate teachers on the basis of expected or given grades?
This is currently the most contentious question in ratings research. There
is consistent evidence of a positive correlation between grades and ratings.
But researchers conclude there should be a relationship between ratings
and grades. Quite simply, good teaching leads to effective learning, which
leads to student achievement and satisfaction. Ratings simply reflect this
sequence.
Some recent studies claim that, all else being controlled, giving higher
grades can raise ratings.
The question at this point becomes an ethical one: "Is giving higher
grades in order to get higher ratings a problem with ratings or a problem
with ethics?"
In the final analysis, the research overwhelmingly supports the legitimacy
of student ratings, but--and this proviso is essential--the ratings must be
used correctly to be valid. Their correct use is the issue we consider
next.