Grade inflation: why weren’t the instructors all giving all A’s already??

There’s been some discussion lately about grade inflation. Here’s a graph from Stuart Rojstaczer (link from Nathan Yau):

Rojstaczer writes:

In the 1930s, the average GPA at American colleges and universities was about 2.35, a number that corresponds with data compiled by W. Perry in 1943. By the 1950s, the average GPA was about 2.52. GPAs took off in the 1960s with grades at private schools rising faster than public schools, lulled in the 1970s, and began to rise again in the 1980s at a rate of about 0.10 to 0.15 increase in GPA per decade. The grade inflation that began in the 1980s has yet to end. . . . These trends may help explain why private school students are disproportionately represented in Ph.D. study in science and engineering and why they tend to dominate admission into the most prestigious professional schools.

People have discussed why the grades have been going up and whether this is a bad thing.

I have a slightly different take on all this. As a teacher who, like many others, assigns grades in an unregulated environment (that is, we have no standardized tests and no rules on how we should grade), all the incentives point toward giving only A’s: When I give A’s, students are happier and complain less, I get to feel like a nice person, and I give my own students (whom I generally have somewhat warm feelings toward) a benefit in their future lives. Back when I used to organize a class with several different section leaders, each instructor wanted to give his or her students higher grades. We had common assignments and a common final exam; even so, each instructor had a reason why his or her students deserved some exemption from the grading cutoffs.

So the real question is, why have grades been going up so slowly? I assume that back in the 1940s, a prof couldn’t really just give all A’s to his or her classes: someone would probably notice and say something. But now we really can, and it’s been that way for a while.

The fact that profs don’t give all A’s, even though they can, is interesting to me. My explanation for this behavior is as follows: college professors typically got high grades themselves in college. Getting high grades is part of how we defined ourselves when we were students. So, now that we’re giving out the grades, we don’t want to devalue this currency. It’s not a matter of self-interest–if I give out a bunch of A’s to my students, it’s not going to retroactively tarnish my college grade-point average. Rather, I think it’s just that profs see grades as important in themselves. Sort of like rich people who don’t want to debase the currency, just as a matter of principle.

I remember looking at grading records for undergraduate classes back when I taught at Berkeley in the early 1990s. There was lots of variation in average grades by instructor, even for different sections of the same class. I didn’t do a formal study, but I remember when flipping through the sheets that average grade seemed to be correlated with niceness. The profs who were generally pleasant people tended to give lots of A’s, while the jerks were giving lower grades. Again, no standardized tests so no way to judge whether the average grades were informative, but I doubt it.

At the institutional level, these problems with grades would be fixed using standardized tests or with some sort of statistical correction such as proposed by statistician Val Johnson, who writes:

There are two approaches that might be taken in reforming our grading system. The first is to encourage faculty to modify their grading practices and adhere to a “common” grading standard. The second is to make post-hoc adjustments to assigned grades to account for differences in faculty grading policies.

The beauty of Val’s approach is that it does three things:

1. By statistically correcting for grading practices, Val’s method produces adjusted grades that are more informative measures of student ability.

2. Since students know their grades will be adjusted, they can choose and evaluate their classes based on what they expect to learn and how they expect to perform; they don’t have to worry about the extraneous factor of how easy the grading is.

3. Since instructors know the grades will be adjusted, they can assign grades for accuracy and not have to worry about the average grade. (They can still give all A’s but this will no longer be a benefit to the individual students after the course is over.)
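
To make the idea of a post-hoc adjustment concrete, here is a minimal sketch in Python. It is not Val's actual method; it swaps in a much simpler additive model, in which each grade is treated as a student's ability plus a course's leniency, fit by alternating averages. The function name `fit_additive_model`, the student and course names, and the grade values are all made up for illustration.

```python
from collections import defaultdict


def fit_additive_model(records, n_iter=200):
    """Fit grade ~ ability[student] + leniency[course] by alternating averages.

    records: list of (student, course, grade_points) tuples.
    This is an illustrative stand-in, NOT Val Johnson's published method.
    """
    students = {s for s, _, _ in records}
    courses = {c for _, c, _ in records}
    ability = {s: 0.0 for s in students}
    leniency = {c: 0.0 for c in courses}

    for _ in range(n_iter):
        # Course leniency: average grade left over after removing ability.
        sums, counts = defaultdict(float), defaultdict(int)
        for s, c, g in records:
            sums[c] += g - ability[s]
            counts[c] += 1
        leniency = {c: sums[c] / counts[c] for c in courses}

        # Center leniency at zero so ability stays on the grade-point scale.
        mean_len = sum(leniency.values()) / len(leniency)
        leniency = {c: v - mean_len for c, v in leniency.items()}

        # Student ability: average grade after removing course leniency.
        sums, counts = defaultdict(float), defaultdict(int)
        for s, c, g in records:
            sums[s] += g - leniency[c]
            counts[s] += 1
        ability = {s: sums[s] / counts[s] for s in students}

    return ability, leniency


# Hypothetical data: "easy" grades about 0.8 points higher than "hard".
records = [
    ("ann", "easy", 3.9),
    ("ann", "hard", 3.1),  # ann takes both courses and links them
    ("bob", "easy", 3.4),
    ("cat", "easy", 2.9),
    ("dan", "hard", 2.6),
]

ability, leniency = fit_additive_model(records)

# Adjusted grade = raw grade minus the estimated leniency of its course.
adjusted = [(s, c, g - leniency[c]) for s, c, g in records]
```

In this toy data, bob took only the lenient course and dan took only the strict one, so their raw grades (3.4 and 2.6) look very different, yet both end up with an estimated ability of about 3.0, because ann took both courses and links them. Without that kind of overlap between courses, a model like this has no way to separate leniency from ability.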

33 Comments

  1. Frank Kroell says:

    A potential issue with post-hoc adjustments: How to differentiate between a class with all really excellent students and a class with all average students who just got their A's because the professor is a really nice guy? Just imagine your scenario where students "choose and evaluate their classes based on what they expect to learn" and all the really smart and eager students pick the class which requires a lot of work (and do well in it), while the average student prefers a class that's easier on their weekend schedule?

    • Alex Reutter says:

      So long as there is some mixing of “better” students and average students in a few classes that don’t give everyone A’s, this shouldn’t be a problem using Val’s methodology. It’s not clear to me how big an “if” that is.

      (full disclosure: I worked with Val on this model at Duke for a little while after he first proposed it and published on it)

  2. C Ryan King says:

    I think the deepest problem is the “ability adjustment”. Even under the assumption of a scalar ability, you aren’t at all guaranteed to get a hold on selection bias just by using a regression strategy. When you think of “ability” as vector valued it becomes intractable.

    • andrewgelman says:

      Sure–but I still think a reasonable adjustment would be better than none at all.

      • C Ryan King says:

        1) I certainly think it’s worth an experiment. I agree it’s unlikely to be worse (if thoughtfully designed) for most applications.

        2) As long as the policy makers take it with the right amount of seriousness. Similar to many other settings, there’s a reasonable fear that once it’s “fixed” with a not-so-complete system administrators will believe it to be perfect.

    • Alex Reutter says:

      IIRC, I believe that Val and his student Stephen Ponisciak did some work on the selection bias problem between ’98 and ’02, though I don’t believe they looked at vector-valued “ability”.

  3. Jeremy Miles says:

    For the benefit of the non-Americans, can someone explain how a letter grade is turned into a GPA? Does an A count 3 (or 4?)?

  4. Phil says:

    A friend taught a class at a public university last semester. She gave everybody an A, except one person, who got a B+. She insists that all of them did A-level work.

  5. Todd Trautman says:

    As a college professor at several different institutions over the course of my career, I was surprised at the lack of transparency in the grading process. I had little ability to compare myself to others. Am I grading harder or easier than my peers? Does the size of the class matter? I have found that in classes with only 10 students, I had a much more personal relationship with the students and the grading was higher than in much larger classes of 200+ students. Without any data, even at the institutional level, I had little ability to adjust.

    When asking for this information from Deans and Chairs, I would receive vague answers such as "the typical GPA is about a 3.2." Rather than try to develop complex adjustment schemes, this minimal feedback could be a useful first step.

  6. fhk says:

    Thing that sticks out for me is the climb during the Vietnam war. Not sure how draft deferment worked. Maybe you could split out women's colleges. Spending 30 seconds looking at the list of schools in the pdf, it's not obvious there would be enough of them.

  7. Anonymous says:

    I’m sorry, but this post is just plain chock-full of nahrishkite (ask your grandmother the meaning of that word if you don’t know). You wonder why more A’s weren’t given in the 1940’s (or now for that matter)? Really? Again, ask your grandmother.

    The whole conflating of grades with niceness (and your jab at those who don’t grade as generously as you, calling them jerks) is a mess. Grading isn’t about being nice and making students happy. You want to make your students happy? Buy them a pizza. You’ll feel good. They’ll feel good. You can get all the feel good stuff out of the way. Then maybe you can grade honestly and reserve A’s for excellent performance. Your grandmother would approve. If you want to keep her happy, though, you should call more often. Once a week at least.

    But honestly, it seems you don’t really like grades and grading. That’s OK. If you don’t want to make distinctions between excellent and mediocre don’t. Just make all your classes pass/fail so you don’t screw up employers and graduate schools who want to find the best and the brightest. And please don’t expect Val’s methodology to do the work of identifying the best and the brightest for everyone. Ranking systems are blind alleys, especially…

  8. I’m sorry, but this post is just plain chock-full of nahrishkite (ask your grandmother the meaning of that word if you don’t know). You wonder why more A’s weren’t given in the 1940’s (or now for that matter)? Really? Again, ask your grandmother.

    The whole conflating of grades with niceness (and your jab at those who don’t grade as generously as you, calling them jerks) that you’ve written about here is a mess. Grading isn’t about being nice and making students happy. You want to make your students happy? Buy them a pizza. You’ll feel good. They’ll feel good. You can get all the feel good stuff out of the way. Then maybe you can grade honestly and reserve A’s for excellent performance. Your grandmother would approve. If you want to keep her happy, though, you should call more often. Once a week at least.

    Besides, you’re not being nice if you give some schlub who barely does the work required an A. You’re being a fool (nahr, see I’m being nice and giving you a translation). You’re giving explicit approval to slacker behavior. And ask the group of students who are working hard in your class if they approve of you giving that A to the schlub. They won’t like it. They won’t be happy to be lumped with the slacker. See, you will have made people unhappy by being nice to someone who really deserves a kick in the touchas.

    But honestly, it seems you don’t really like grades and grading. That’s OK. If you don’t want to make distinctions between excellent and mediocre, don’t. Just make all your classes pass/fail so you don’t screw up employers and graduate schools who want to find the best and brightest and might otherwise assume that the happy face A’s you’re giving out are an indicator of real performance. Finally, please don’t expect Val’s methodology to do the work of identifying the best and brightest. Ranking systems are blind alleys, especially when grades tend to be all bunched up in the B+ to A range, as they are at elite schools.

    I am glad to hear you want to make people happy. Make me happy! Don’t give out phony A’s. Make your excellent students happy by not lumping them with the slackers. Make your grandmother happy and take her out to dinner (take your parents out, too). And stop asking such stupid questions about why everyone didn’t get A’s in the 1940’s.

    • andrewgelman says:

      Stuart:

      1. My grandparents are no longer around (see here [link fixed]).

      2. I’m not jabbing at people who give low grades. I’m commenting that when I looked at the grades that were given out several years ago, I saw a correlation between avg grade and personality. It’s just something I noticed among this small group of Berkeley statistics instructors.

      3. I don’t know why you think I don’t really like grades and grading. As we’ve discussed several times on the blog, it’s notoriously difficult to convey intonation in typed speech. Nowhere above did I say that I don’t like grades and grading; rather, I endorsed Val Johnson’s approach which, I believe, can make grades more meaningful.

      4. Before you start slamming Val’s system, or for that matter calling my questions stupid, you might want to read some of what Val has written on the topic of grading. In particular, see the last paragraph in my blog above. The idea is that if profs know their grades will be adjusted, they will have less incentive to give only high grades.

      In general, I think it makes sense to try to understand people’s incentives and work with that. I’m with Steve Levitt on that one.

      • Sorry to hear about your grandparents. Your link didn't work, but I hope they lived a long and happy life. Two of mine did. The other two were not as fortunate.

        Val was down the hall from me for many years. I have read his work. I would have voted for his grading system (I wasn't on the faculty council at the time) because I felt it would partially alleviate the science student penalty of about 0.4 GPA relative to the humanities.

        As far as changing grading habits, though, that's wishful thinking to the extreme. There will still be incentives to give phony A's. There will still be student shopping for easy classes. An invisible hand that sorts is an abstract thing that influences ranking student performance after a class is over. A paper that has A written on it is immediate gratification. Plus, given the noise level inherent in grading, and given how grades are bunched up in the B+ to A range in most classes at elite schools, there is no longer enough information content in an ensemble of 36 or so grades to make robust ranking possible at elite schools. The signal is too small. The noise is too great. There's a nice paper on that topic in a comp. sci. journal that I've referenced in our 2011 paper.

  9. Alex Cook says:

    At my university, we do grading "on the curve", meaning there are guidelines for how many As we should be giving out, how many students we can fail and so on. Here are some of the incentives it creates:

    1] Students become less likely to co-operate with each other in mutual learning, because if I help my classmate, she might get a better mark in the exam, and that might shift my grade down, because there's a fixed number of good grades. When I see students here "studying together", it basically means they are sitting quietly next to each other, studying independently but at the same time. They honestly think that's studying together.

    2] Students shop around for easier modules, or, should I say, modules with fewer good students reading them. (The more good students in the class, the lower my chances of getting a good grade; better to read a module that a preponderance of the weaker students in my cohort appear to be reading.) Rather than read modules that are important or useful, they are encouraged to select based on perceived competitiveness of the class.

    3] Students become averse to reading around and taking modules from other departments, because they will be comparatively at a disadvantage. The university had to create an even more complex system (students can nominate some modules to be pass/fail only!) in an attempt to compensate for this.

    4] Staff also play the system by trying to encourage weaker students to drop out so we can get the class size down below the threshold at which the grade applies (you'll be amused to know that my university believes the CLT kicks in at n=30… [ok not really the CLT, as it's the distn of grades not the mean]). That way you have abler and keener students and can reward them with lots of good grades and get good student evaluations in return.

    • derek says:

      I think grading on *a* curve is a good idea, but not a curve that includes the student or any of the student's contemporaries. A curve constructed of the last two years' classes, for instance, would remove some of the problems you cite: students would not mind collaborating, because collaboration would not affect the shape of the curve they were being graded on. Teachers would not reduce class sizes, because they could not get out of the curve by so doing (two classes, even small ones, would be big enough to construct a curve with, so the teacher would have to teach two really minuscule classes before the grade stopped applying).

      I realize this idea has problems of its own.

  10. Eli Rabett says:

    Eli can't go back to the 40s, but at least in the 60s, there was a pretty clear line between the upper and lower college courses, with the latter being the survey courses and the former the major courses. Most of the hard majors had a gateway course where if you got through you were allowed to continue in your major, and after which, if you didn't go to sleep in class there was nothing lower than a B and mostly As.

    In any case, as soon as you get out of the Ivies, grades in survey courses today are normally distributed, and worse if you account for Ws.

    O tempora o mores.

  11. kweed says:

    I think you may be overlooking a key incentive for faculty NOT to give all A's: it reduces the quality of the students in subsequent years. This is true in universities that publish grade statistics by course, but also in universities that don't. Whether through formal or informal means, word gets out quickly about which courses are "easy As" and which are not.

    If my choice as a professor is between (a) some unhappy students in year 1 and smarter students, on average, in year 2 through n; or (b) more happy people in year 1 and real duds in years 2 through n, I'll choose (a) every time. It just takes one or two duds to ruin classroom dynamics and rapport, and turn a course that used to be fun to teach into a miserable experience for all concerned. Although you can get such students no matter what your grade distribution, they are attracted to "easy A" courses like moths to a flame.

    Note, too, the assumption that everyone is happier if everyone gets As. I don't think this is true, because some hardworking students will feel cheated out of a grade that they think they earned and others did not. If these students are then less likely to recommend the course to their friends (who are, thanks to social network homophily, typically going to have a similar orientation toward learning), the future penalty for giving out As is even greater. Not only will you get more grade-chasers in subsequent years, you will also get fewer of the best students.

    Of course, if all courses gave out all As, this wouldn't be an issue. But the "first mover" penalty is quite large.

  12. […] Grade inflation: why weren’t the instructors all giving A’s already?? by Andrew Gelman. See my old post on grade inflation here. […]

  13. kdk says:

    The public/private digression is interesting, and seems to support another point – the _institutional_ incentive to give out As. By giving out As, an institution makes its students more successful, which returns more money to the school (and makes students more willing to pay exorbitant tuition). The more rapid acceleration in private schools supports this argument. This would only hurt if GPA affected an institution's reputation, which it clearly does not – that's affected by standardized test scores of incoming students and by faculty (often, publication). Harvard is notorious for grade inflation…

    Before you can fix the instructor level incentive problems, I suspect you'll need to fix the institutional level problems. Good luck, sincerely.

  14. andrew mack says:

    One other incentive for giving "As". In many universities faculty promotion (and thus MONEY) is determined in part on student assessments. Tough markers aren't popular with students ergo there is an incentive for teachers to give good marks to produce happy students.

    It would be interesting to see if the kick-up in GPAs that seems to have started around the '60s coincided with the progressive introduction of student assessments of faculty…

    One other negative. One faculty in my university is notorious for giving a disproportionately high number of "As". But GPAs are a major determinant of scholarships and prizes… The faculty in question, which requires only low high school grades to get into, apparently gets a disproportionate number of prizes.

    In Australia 30 years ago I sat on a committee that oversaw student grades. Much to the irritation of the economists and psychologists (the latter of the hard-nosed rat-torturing variety) the sociology marks always had more "As" than any other discipline. The economists regularly asked why students who rated "Cs" in economics were getting "As" in sociology. The sociologists' invariable smug answer — "We are better teachers."

    • fhk says:

      a plot of gpa vs incoming act/sat score broken out by major would then presumably show the departments with the better teachers /snark. Anybody have that graph lying around?

  15. paul gronke says:

    Andrew, I feel old! Val’s system was first proposed while he was at Duke (I believe) and was voted down (I was on the faculty at the time). Stu was on the faculty then and was part of the debate (I don’t remember his position on Val’s system). Enough nostalgia; I have to second a few of Stu’s comments.

    I don’t equate a good grade with “being nice.” Do you feel the same way about peer review? It does make me feel good to write a strong review of a strong article for a top journal, but that’s because I feel like I’m contributing to the scholar’s career and to my own discipline. I don’t mince words if an article is weak, and it does the scholar no good if I do so. But I’m not MEAN, as some reviewers are. A well-written, accurate review is productive.

    What makes me feel good is when a student starts out weakly and then improves because of the quality of my teaching and their own motivation. This MAY be reflected in an A, or a B, or whatever. One of my fondest memories in graduate school was teaching a class with three football players who were obviously there hoping for an easy class. Two of the students barely showed up, but the third student did not do well at the start, came to my office hours, wrote additional drafts, and ended up with what I considered a good grade (not an A by the way). Perhaps it was only coincidental that this was the player who made it to the NFL.

    I supported Val’s system, by the way, but now teach in a college where such a system is not needed. So let’s not kid ourselves: some part of the blame for a grade inflation culture lies with faculty, and their desire to avoid conflict, “be nice”, and just move students through the system.

    • Andrew says:

      Paul:

      I’m not saying that it’s nice to give good grades or that giving good grades is a nice thing to do. Nor am I saying that I’m a nice person.

      I’m just saying that in my casual observation of the data I noticed a correlation between niceness and giving high grades.

  16. I teach as an adjunct professor at a law school, and think that this post applies to all teachers instead of just statistics professors. So I’ll go ahead and chime in by saying something that might be rather controversial: I don’t like grades or grading, and wonder if we should have anything beyond a pass/fail system. I’d like to know your thoughts about the following issues which bother me greatly when it comes to grading:

    1. Service. When educating students, we are providing them with a service, which they pay for, often at great expense. It strikes me as odd that we determine at the end of our performance how good a job they did in consuming our service. I would never grade one of my legal services clients, why should I grade my students? If the answer is that grades are of great use in making future evaluations of students, please consider the additional points below.

    2. What Gets Evaluated. Often times, the method we use to grade students suffers from either varying too much or varying too little.
    It varies too much in the sense that different professors in the same area of study may use wildly different systems to measure the success of students in the course. Some might use a paper, some might use a short answer essay test, some might use multiple choice, and some might use true/false. Some tell the students ahead of time what will be on the exam, some do not (and everything is fair game). This variety injects a good deal of luck into the exam taking process and associated results.
    The grading method varies too little when we only test certain kinds of intelligence. Some of the most talented attorneys I went to law school with received poor grades. Often this was due to the omnipresence of the time limited short answer essay exam in the law school world, which privileges the ability to read between the lines, surmise exactly what the professor is hinting at in a fact pattern, and write a halfway decent first draft, all under extreme time limitations. Since this time limited, test taking skill is usually of little value in the profession, it raises the issue of whether the skills being awarded excellent grades are:
    (a) skills anyone should particularly care about, and
    (b) of much value to evaluations by others in awarding jobs and prizes (judges, law review journals, law firms, professors and scholarship providers).

    3. Arbitrary Sensibility. Among law students, I would guess that at least 50% of students feel that their grades are awarded in an arbitrary manner. In my experience, this is not true, but the issues above greatly contribute to this misconception, which itself does great damage to the perceived integrity of the entire grading system.

    4. Standardized Tests. Since, in many cases, students need to take standardized tests which provide evaluations over and above their grading results, how much incremental value is provided by routine grades?

    5. Embracing Easiness. Given that many people exhibit varying degrees of ability across different dimensions of intelligence, and that grades often reflect much narrower abilities, do we embrace typical grading because it is simply much easier to assign a number to someone and make a judgment, instead of going through the more rigorous process of holistically considering whether a person deserves a particular opportunity? In my own hiring recommendations over the years, I was much more interested in a law student with average grades, success in moot court competitions, publication of a thoughtful law review article, and industrious work experience, than a student from a top rated law school with excellent grades but little else. I found that considering the broader range of a candidate’s abilities often seemed to result in hiring a much better performer (though I lack data to support this point), but, of course, this portfolio approach took much more work. Additionally, it certainly makes evaluative decisions much murkier and harder to justify in a clear, convincing fashion to others (a price I was willing to pay).

    Of course, my thoughts are greatly influenced by the nature of the legal academic world (When I was an undergraduate, I never noticed very many students complaining about the arbitrariness of grades).

    What do you think?

  17. Site visitor says:

    “all the incentives point toward giving only A’s: When I give A’s, students are happier and complain less, I get to feel like a nice person, and I give my own students (whom I generally have somewhat warm feelings toward) a benefit in their future lives.”

    Tests should be randomly and anonymously assigned to another teacher for grading. That is to say, the one teaching the class should not be the one grading the papers. Ideally, the graders should not know whose test they are grading.

    The grader would be more likely to grade based on the quality of the work, rather than the relationship with the student.

  18. Grant says:

    How about the professor just giving the student the grade he/she deserves and earned? Then to compensate for teachers who give all their students A’s, why not adjust the grades at the end of each semester based on the teacher’s reputation, skill level, and overall competence?

    • Andrew says:

      Take a look at Val Johnson’s approach (see link above). You can do the grade adjustment without needing to estimate teacher’s reputation, skill level, or overall competence.

  19. I wonder whether the “bias” towards good grades in the humanities is a symptom of grades being harder to define in these subject areas. Science, particularly at the undergrad level, seems to have a higher proportion of exams where it is fairly easy to determine right, wrong, and if wrong, how wrong. This translates into a larger spread in grades. In “softer” topics, on the other hand, there’s usually something that’s good, or people “meant” the right thing, etc. In such cases, how much emphasis should be placed on clarity of expression versus being vague but meaning the right thing? I would expect that it’s difficult to capture that in a single number, yet that is what most grading systems expect.

    In a sense, this echoes what Stu has said about there being too little signal to amplify, and it also suggests that having an incentive to spread grades more may not be enough, if you have trouble doing so in a principled manner.

  20. April Galyardt says:

    There’s a serious confound hiding in this graph. Public universities have been struggling for a while now with seriously under-prepared students. Estimates indicate that about 1/3 of university students and 3/4 of community college students require at least one remedial class. And most of the students who require a remedial course require more than one.

    So here’s my question, what does the graph look like after you control for the number of students at an institution requiring remediation? My guess is that this would make the difference between public and private institutions dramatically smaller even if it doesn’t disappear entirely.