Grade inflation: Why hasn’t it already reached its terminal stage?

Paul Alper writes:

I think it is your duty to write something about this. Why? For one thing, it does not involve Columbia. For another, I presume you and your family will return to NYC and someone in your family in the future will seek medical care in hopes that the physician understands organic chemistry and what it implies for proper medical care. However, if you do not want to be recorded on this, how about getting one of your bloggers who has a medical degree or has some sort of connection with NYU? To be extra safe, have the column written in French or a dialect thereof. Another idea: relate this to students taking an undergraduate statistics course where the failure rate is high.

P.S. True story: When I was an undergrad at CCNY in engineering, the administration was up-front loud and proud about the attrition rate from first to second year in engineering because that proved it was an academically good institution.

“This” is a news article entitled “At NYU, Students Were Failing Organic Chemistry. Whose Fault Was It?”, which continues, “Maitland Jones Jr., a respected professor, defended his standards. But students started a petition, and the university dismissed him.” It seems that he was giving grades that were too low.

Fitting this into the big picture: this particular instructor was a retired professor who was teaching as an adjunct, and, as most readers are aware, adjuncts have very little in the way of workplace rights.

The real question, though, which I asked ten years ago, is: Why haven’t instructors been giving all A’s already? All the incentives go in that direction. People talk lots about grade inflation, but the interesting thing to me is that it hasn’t already reached its terminal stage.

65 thoughts on “Grade inflation: Why hasn’t it already reached its terminal stage?”

  1. An intuitive answer is that institutions of higher education aim at prestige, and starting out by giving all As wouldn’t be prestigious (why bother grading at all in that case?). But once their prestigious status is locked in and the “best” students want to go there, they can give them all As without suffering a hit to their status. However, that answer is at odds with the fact that the most prestigious universities like Harvard had locked in their status long before letter-grades became popular.

  2. In fact, at many elite institutions the average grade is not that far from A (perhaps someone can cite the evidence – I can’t recall where I saw it, but I thought I saw an average GPA of around 3.5 at Ivy League institutions – please correct me if I am wrong). If this is correct, then I’d rephrase your question as “why is the current equilibrium not a corner solution?” A theoretical prediction might be that all professors would give all A grades. But then we must consider that some professors have different goals, that tenure protects enough of them to keep them from awarding all A grades, that the long-run equilibrium must consider factors such as reputation (as Wonks suggests), etc.

    A number of empirical tests would shed light on this question. Do adjunct professors give higher grades (adjusted for class level, field, etc.) than tenured professors? Do untenured tenure-track professors give higher grades than tenured professors? Why are graduate level grades higher than undergraduate grades? How and why do grades vary across disciplines? Are grades higher in fields with lower enrollments (my casual observation is that this is true)? Are we at the equilibrium or are grades still trending higher?

    • I have always been surprised at the lack of outrage over graduate-level grades, which are generally much higher than undergraduate grades. Why not start tackling “the problem” there, where it is so much worse?

      The answer is that graduate students are supposed to be really good and so high grades are appropriate. But it is not clear why that answer doesn’t apply to advanced classes at a very selective undergraduate institution.

      • Grad school grades are largely just a formality; almost no one is going to look for the grad student’s GPA on their transcript in the same way they might for an undergrad. It’s the writing and research output that matters, ergo grades are typically kept high enough to ensure funding isn’t pulled.

        • This is the correct answer. In my opinion, if you are not operating on the apprentice system, then you are not a real grad program. You are a repackaged undergraduate program that is being sold to people who don’t know the difference.

        • “Grad school grades are largely just a formality;”

          That’s not my experience. Grades below B were uncommon but they happened, and A’s weren’t just given, they had to be earned. My grad courses were well taught and challenging with only a couple of exceptions. That’s my experience at two institutions.

          People are talking about Berkeley, Stanford and some about Harvard. What about Caltech, MIT, Rensselaer, Virginia Tech or Georgia Tech? From my experience tech schools are a different world. Higher standard.

    • What’s interesting is that Ivy schools are *not* heavily dependent on adjunct teaching (for the most part), and so it’s interesting that they are nonetheless the leaders in grade inflation (if they are — has someone shown that? Aside from news stories that will *always* focus on the Ivies).

      • There are multiple factors. The Ivies have less adjunct teaching, but are also filled with incredibly hard-working students who basically ace every test put in front of them. It may be that at some schools inflation is driven by the former, while in other schools it is driven by the latter.

        • It’s possible that the difference in grade inflation between institutions is driven by increasing selectivity at elite schools. Standardized test scores of admitted students can support that. However, when I was at Berkeley, the conventional wisdom among students at both Berkeley and Stanford was that getting As was much more difficult at Berkeley. The only students who had experience of both, grad students who moved between the two institutions, generally corroborated that.

          The attitude from Stanford students was generally that Berkeley professors were a bunch of theory-headed elitists who made things pointlessly difficult and did not bother to prepare their students for jobs and applications, mixed with a hint of guilt. The attitude from Berkeley students was overt jealousy, mixed with some secret pride and a general inferiority complex about not having gone to Stanford. Fascinating stuff, egos inflamed on all sides.

        • somebody –

          So from your experience, outside of attitudes, do you have any thoughts about the impact of the different sense of, or reality of, grading difficulty on the educational experience at the different schools? Did it affect the quality of education in such a way that, in your opinion, grade inflation or the lack thereof showed up as a signal in the quality of, or attributes of, the educational outcomes?

      • The hardest part may very well be getting in, but I wonder about the effect of course loads and term lengths.
        At some highly selective schools that are on a semester (as opposed to quarter) system – e.g., Duke, Princeton, Brown, Harvard, Swarthmore, Haverford, Amherst, Oberlin, Smith, Williams – the standard load is a total of 32 courses over 8 semesters, or an average of 4 per semester. At Cornell, the standard is 4.25 (34 courses) and at Yale 4.5 (36 courses); MIT, Columbia and Penn are between 4 and 5.
        At state universities on the semester system, what’s typical is 5 courses per semester over 8 semesters, or 40 courses.
        Schools with a standard load of 4 give 4 credits per course; schools with a standard load of 5 give 3 credits per course. But does this really reflect a significant difference in the work either expected or actually done by students?
        At Duke, courses typically meet for 150 minutes weekly for 15 weeks, and the standard course load is 4. At the nearby NC State, where I taught, and at nearby UNC/Chapel Hill, courses typically meet for 150 minutes weekly for 15 weeks, and the standard course load is 5.
        Between schools, the number of weeks in a semester can also vary and there is also variation among ‘reading periods’ before the final exam periods with some as short as a day or two.
        I’m not sure how to compare with schools on quarter systems.
        I’ve not found any published research addressing this question about workload and time in classes.

    • I feel like there’s a Keynesian beauty contest element where professors are incentivized to be more lax than other professors, but not necessarily to give all A’s.

  3. It’s something I’ve thought about for a long time, and after a lot of research the conclusion I came to about letter grades is that their primary purpose for existing is as an administrative tool to administer (a) scholarships and (b) graduate school admissions. Financial resources in universities are too few, so the way we have collectively decided to divvy up the vanishingly small number of rewards we have is “academic merit.” There are not enough financial resources and labour to evaluate each potential student on merit, so letter grades and GPA exist as an administrative sorting tool that can be used to make quick decisions with little labour.

    So, the reason not everyone gives A’s is to prop up this reward system, which would entirely collapse if the majority of professors gave As for everything. Institutions usually do step in to some extent to prevent the ceiling from being hit, keeping the precarious balance of ranking students but keeping them happy as paying customers. It is similar to the reason that people still review journal articles and grants even though there is no financial or professional benefit for doing so: It props up the merit-based system of academic careers, which would collapse if people stopped doing it.

    This isn’t an endorsement of meritocracy, but it is nonetheless the guiding ethos of most academic institutions whether I like it or not. I am occasionally frustrated that my colleagues think that grades are essential to the learning process, when in reality, I think they are mostly an administrative tool for ranking students to dole out limited rewards. At a certain level, I resent having to stratify my students!

  4. There are certain classes where I routinely give all As. In an advanced music composition class at my very selective university, the students are all going to follow directions and work very hard. The prerequisites will weed out those who don’t have a fair amount of experience.

    The differences among them often come down to their past experience and innate talent. (Even among those who meet the prerequisites, some will have parents who are professional musicians, for example.) Grading on those factors tends to create perverse incentives: the whole point of a liberal arts education is to encourage, say, the future bankers to take a range of classes.

    These are students who are incredibly motivated and diligent, and often plan to apply to fellowships and jobs that will pay close attention to their grades. I could give everyone Bs, except for the 1-2 who I think might actually be able to become professionals, but that doesn’t really get the students to produce their best work, in my experience.

    So it is essentially a pass/fail grade with the letter “P” being replaced by “A.” Not being a letter fetishist I don’t mind using a different set of letters.

    • > The differences among them often come down to their past experience and innate talent. (Even among those who meet the prerequisites, some will have parents who are professional musicians, for example.) Grading on those factors tends to create perverse incentives: the whole point of a liberal arts education is to encourage, say, the future bankers to take a range of classes.

      +1

    • Dmitri, you indicated you use this approach in only one class, so perhaps my criticisms are a bit overwrought with respect to you as an individual. Just the same, as Joshua’s “+1” indicates, seems like your approach in that one class is consistent with broader sentiment among many academics, so consider my criticisms in that spirit.

      “the differences among them often come down to their past experience and innate talent”.

      I’m not sure what to make of this statement. Is it not stupidly obvious that people who work hard learn more? That’s part of their “experience”, isn’t it? Furthermore, that knowledge and skill are a cumulative result of effort? It’s also stupidly obvious that a student who has a “talent” for something will excel at that thing. That’s great. If they have more knowledge than other students, they *should* get a higher grade. Educational testing and grades are there to reflect a student’s ability. More ability = higher grade. Can you spell out why that is a problem for you? I prefer a doctor that is talented over one that isn’t.

      Your general view surprises me for another reason: if there are no concepts or skills that require objective mastery, why are you teaching a class? What are you teaching in the class? Your approach *could* make sense for an independent study course, in which you offer only modest guidance and advice. But if you’re teaching a class listed in a course catalog, then presumably you should have some knowledge to impart, and how much of that knowledge the students actually acquire is measurable.

      The idea that if students don’t all “get” A’s it is a “perverse incentive” is odd. When I got poor grades in primary school, my parents used that to correctly infer that I didn’t work very hard. They didn’t go complain that I was getting “perverse incentives”. As a college student, when I got a C in a class, I used that as an indication that I had to put more effort into that subject.

      What happened to the idea that students earn the grade themselves? They don’t “get” a grade. You don’t “give” them a grade. They earn it. They earn it by demonstrating they understand the material. That’s the point of education. We don’t have a multitrillion dollar education system to buff up student self confidence. We have it for people to acquire beneficial knowledge.

      What if we apply that same concept to the NBA? Shouldn’t players always get points for effort when they take a shot? Why are we having these “perverse incentives” that the shot actually has to go through the hoop to count? Once they reach that level haven’t the pre-reqs already screened out the chaff? Why do we even have fouls! At that level, they’re all trying their best! Giving fouls is a perverse incentive!

      Whatever the case, if we keep going down the path that the job of universities is to do nothing more than encourage students to try hard – and never even measure the outcome of that! – it will at least save the public a lot of cash because we can just eliminate the university altogether and just hand out the degrees! There’s a positive incentive!

      • Hi Chip, a few points.

        First, my classes are filled with students who are for the most part not going to be pros. So imagine I am teaching basketball at a school that has never sent anyone to the NBA. I could easily grade students by a future-professional standard but there’s no point. There would be no As.

        I sometimes use a weightlifting analogy. I am teaching weight training to a range of students with a range of abilities and body types. I could grade them all by a future-Olympian standard: Fs for everyone. Or I can try to tailor a program to each student, to try to get them to make the most progress they can. Some are going to be stronger than others, but that is largely outside their control at this point. If they follow the program and make good progress they get a good grade. If they all follow the program and make good progress? As for everyone. What’s wrong with that? The standard is less a comparison of one student to another than something internal to each student.

        Second, and relatedly. You write “Is it not stupidly obvious that people who work hard learn more.” My experience is that my students all work tremendously hard. Some of them are really gifted, some of them are less gifted. Imagine you have a string quartet, and every player practices 10 hours a week now. But some practiced 30 hours a week all their lives; others practiced 7 hours a week all their lives. The 30-hour-a-week students play better now. Should I give them the higher grades? First, that means I’m grading on factors that are already fixed by the time they sign up. Second, that means the 7-hour students tend not to want to take the class. An A grade just isn’t really in the cards no matter what they do: their rhythm needs work, they play out of tune. This is particularly problematic if the student is ambitious, gets As in their more academic classes, and is gunning for a prestigious fellowship (some of which have minimum GPA requirements).

        OK, now imagine it is intro Spanish and some students are fluent Italian speakers. Etc.

        Third point: grading is really different in different fields. Grading in physics or math is really different from grading in music or poetry or whatever. A lot of damage has been done by naively extrapolating across fields.

        • It seems like the split here is that you believe a student’s grade should be a measure of the effort and time they’ve put into the course, while others believe the grade should be a reflection of a student’s abilities and mastery of the material. Two of your students with very different levels of ability in composition would be indistinguishable based on the grade they received in your course. Plus, what’s so wrong with getting a B?

  5. To quote:

    “To understand grade compression, we first need to understand grade inflation. Looking at a graph of student GPAs since 1889 is sort of like looking at a graph of Harvard’s endowment: It only goes up. In 1950, when Harvey Mansfield was but a freshman at Harvard, the average GPA was estimated at 2.55.”

    See https://www.thecrimson.com/article/2022/10/3/barton-grade-inflation/.

    The story goes on to say that the average grade at Harvard today is 3.8. It also notes that the data may be unreliable.

    Bob76

  6. Sometimes it becomes terminal and then starts over. For my first post-PhD teaching assignment, I taught Psychology 1 at McMaster University. I was told that normally the class had about 200 students but it had expanded to 2,000 for that year, 1/4 of all the undergrads, because, the year before, the instructor thought he did a bad job teaching it, and, to apologize, gave all A’s. (I think something is exaggerated here, possibly from my memory over 52 years.) Since no lecture hall was big enough, 4 of us would teach the class by recording lectures on TV. (This is not exaggerated.) We did not give all A’s. The next year things were back to normal.

  7. As always, objective data should help in clarifying this issue. The service academies report class rankings. Someone should give us a look at the eventual career achievements tied to the class rankings. Highest rank achieved related to the ranking would be an obvious measure. Decorations for valor would be another; does high class rank imply greater likelihood of attaining the Medal of Honor?
    No patient ever asked me about my grades. We tried to assess patient satisfaction and found this very difficult. The humanitarian qualities that make a good doctor are not reflected in their GPA. GPA reflects in-classroom achievement, which is partly, but only partly, related to real-life goals. One of my highest-GPA colleagues was often pursuing treatments that were “cutting edge” but later shown to be wrong. A less studious doctor sticking to middle-of-the-road regimens would have gotten better results. We had trouble getting the smart guy to see that.

      • Interesting study – any chance you can provide the data? It looks like gender had even more impact than the test scores – but I suspect that gender is related to both test scores (I’m not sure about that, but women generally out-perform men on most standardized tests) and field of practice (I’m not sure how they interact, but I believe that sanctions vary across fields and I know that the gender breakdown does vary across fields). Did your results account for these interactions?

        • Dale:

          No, for reasons of confidentiality or security or something or another, the data are not available. I don’t recall the details of the analysis, so you’ll just have to go by whatever’s in the paper!

  8. I took two writing classes at Michigan State and was told there “C is for courtesy; if you show up to every class and hand in something for each assignment, you will get no worse than a C.” All my grades in math and science courses were based on numerically graded exams. Of course there probably was a broader range for A’s than other grades, but you could get D’s and F’s (no E’s though). All of that seemed reasonable to me. In math and science, there are (mostly) right and wrong answers, in liberal arts it is not as definite.

    Some of the best engineering work at GE was done by “specialists” and “technicians” (people with a two-year degree from a technical or community college). One told me that he had done so-so in high school, but at community college he was told that if he didn’t do the work and learn the material there were plenty of other applicants they could take, so he buckled down.

    Those were the good old days (for engineering) of course, before Welch decided that how much it cost to make things was more important than how well they were made.

  9. The Maitland Jones saga at NYU has a few wrinkles which separate it from the usual situation of student displeasure with grading. Jones was not the typical adjunct desperately careening from one academic institution to another because he could not find full-time employment which would have given him some sort of health insurance until Medicare kicks in at 65. In fact, according to

    https://www.dailyprincetonian.com/article/2022/10/maitland-jones-nyu-princeton-professor-terminated

    Jones is now in his mid-80s and, before his retirement, taught at Princeton University: “At Princeton, Jones served as the David B. Jones Professor of Chemistry and was responsible for several landmark discoveries in the field of chemistry.”

    Nor, unlike the more recent situation at Hamline University in St. Paul, MN, was he teaching a course which might offend religious beliefs. His specialty was organic chemistry, which is often believed to be a “gatekeeper” course for those aspiring to get into med school. I fully admit that I know virtually nothing–make that absolutely zero–about organic chemistry, but there seems to be disagreement in the medical world as to whether knowledge of organic chemistry plays, or should play, a vital role in medical school admissions/education.

    • The thing about organic chemistry is you can try to memorize everything as disconnected facts, which is very difficult.* Pre-med students typically try to memorize everything for some reason.

      Or you can figure out the relatively few basic principles, then deduce what the answer is likely to be when confronted with a new problem. This is much easier, but requires a bit more upfront effort.

      * This is very similar to “evidence-based medicine”

      • It is like writing an algorithm that optimizes for CPU because you have plenty of RAM. It works great but doesn’t scale. Eventually you input a huge dataset and run out of memory. Either you refactor to optimize for memory, or get stuck swapping and progress slows to a crawl.

        Organic chemistry weeds out the students who need to refactor but don’t do it. Some don’t need to, of course, because either they have a better memory or are already using a memory-efficient learning algorithm.

        Really people should be taught to deduce things from first principles much earlier in education.

  10. This doesn’t fully answer the question. But some institutions have guidelines for instructors about the distribution of grades in large classes. If grades are outside of a given range, there can be additional scrutiny and a justification from the instructor is required. So maybe then the incentives are for grade inflation right up to the upper end of that range.

    • That type of thing does prepare you for “real life” though. If the FDA approves many more/fewer drugs than the prior year, Congress gets involved and starts asking questions. There is always some fatal flaw in any publication/submission, so if you can get yours up for approval in a slow year, it greatly increases your chances over a glut year.

    • +1. Institutional “controls” are one reason it can’t get to the terminal state. Princeton did an experiment for 10 years stipulating a kind of loose quota on the proportion of As but the current president gave up this “deflation” policy as his first act in office. Even without explicit controls, there is subtle pressure to not give 100% As. Maybe 90% is ok, but not everyone.

      Another barrier against the terminal state is the curmudgeon: Maitland Jones is one of them. There are some professors who just refuse to accept this phenomenon and set an example of resisting it. It’s the individual variation that conspires against the terminal state! (Jones’s mistake is he forgot that at NYU, he wasn’t tenured.)

    • My father got into a bit of trouble at his college when he gave all his students As one year. He said there was a lot of pressure to bell curve his marks but he stuck to his guns and defended his decision as all of his class that year was exceptional (in his opinion).

  11. “Why haven’t instructors been giving all A’s already? All the incentives go in that direction.” The same reason you don’t lavish every PPNAS article with praise, even though the (near term) incentives go in that direction. Honest assessment is, in fact, a good thing!

  12. When I was an undergrad student at Berkeley a few years back, it was the policy of a few departments to grade to a curve, which definitely created some perverse incentives. One professor was notorious for having much more difficult exams than the other professors for that subject, far beyond the standards of the textbooks or the entry requirements of later courses. When asked why, he stated that with a curve, but without difficult exams, students tend to saturate towards the top. This had the net effect of rewarding students who were more careful and made fewer mistakes over students who had a deeper understanding. The availability of online answer keys also became a problem. In classes where homework was important to one’s grade, it became a requirement that you use an answer key for your homework because the class average was nearly 100%. Furthermore, “professional fraternities” would build vast repositories of answer keys, former exams, and former homework assignments that they distribute to their members, so it was in one’s best interest that they be connected to one. One savvy professor therefore vastly diminished the homework contribution to students’ grades, announcing that “in recent years, homework has taken on an odd transactional quality. I send you out with all these questions and you all find the answers somewhere and bring them back to me. You don’t have to do that! I assure you I already know these answers!”

    In other cases, a runaway feedback loop had a depressive effect on students’ apparent performance. An acquaintance of mine had already gotten into graduate school, so he calculated the precise number of homework assignments he had to do to stay in the C- range and not have a report go out to the admitting institution. At one point, I also took a class I found myself dramatically overqualified for and, in the interest of not wasting my own time, did precisely the number of assignments required to keep me in the 89th percentile “A range” of the course. The class’s format was a single, 400+ person, 3 hour lecture a week on Wednesday nights with no smaller groups, so I felt the standards that semester were a little low. As it turns out, the instructor agreed, and eschewed departmental standards, grading to a C- median, effectively failing just under half the class and giving me a B. A bunch of students complained, but nothing happened. I thought it was a little precious to complain, but it was a bit annoying that this guy could transparently put in the literal minimum required work into teaching and then punish the class for not doing well.

    Professors’ opinions on the curve varied. Some were very proud of the academic rigor; others lamented that in some introductory classes students’ performance had been transparently increasing over time but they had to keep the percentages low to keep the numbers in later classes manageable. Professors in the pure math department had the most latitude to grade as they pleased. The typical response in this day and age was indeed a preponderance of As, but a couple professors were notorious for having absurd standards and declaring that “a C is a good grade, you should be happy with it.” So math students were at the whims and mercy of some of the strangest personalities, but with their smaller class sizes a curve would have simply been unworkable.

  13. As I have often claimed, grade inflation in the U.S. began because of the G.I. Bill. Once the Federal Government promised to pay tuition for veterans of WW II, universities raised tuition knowing that the demand for placement would keep rising despite the unconscionable increases in price. Flunking students was no longer an option for institutions seeking to take part in the largess, so GPA also rose precipitously. However, unlike tuition, there is a ceiling on GPA so that the wiggle room separating one student from the next has become smaller and smaller.

  14. Grading on a curve is a terrible idea; it should be banned, possibly made a felony!

    It provides the learner with no interpretable feedback on how much he/she has mastered the material–only on how he/she has done relative to others in the same class–but the learner has no usable information about the latter.

    It provides misleading information to admission committees and employers by implying that a student who received a given grade this year is more or less equivalent to one who received the same grade the previous year.

    It should be apparent that the distribution of mastery in a course will vary from class to class both due to specific selection factors and just the luck of the draw. So there is no reason to assign the same grades to the same percentiles in each class.

    Also, grading on a curve makes it possible to improve one’s grade by degrading the performance of others in the class. That is, it provides perverse incentives for students to refuse to help each other out with learning during the course, and even, when circumstances permit, to sabotage others. (Sad to say, the latter actually does happen!)

    Grading should be based on a reliable and valid assessment of mastery, and nothing else. That means that everybody getting an A, or everybody getting an F and everything in between should be a possible outcome of the grading. Which is not to exclude the possibility that if everybody fails an investigation of the competence of the teaching is warranted, or that if everybody routinely gets an A a more demanding course might be appropriate for the population being taught.

    • Clyde –

      +1

      But I’d go a step further.

      Back to your beginning comment:

      > It provides the learner with no interpretable feedback on how much he/she has mastered the material–only on how he/she has done relative to others in the same class–but the learner has no usable information about the latter.

      I think that applies to grading more generally. To be really effective, grading should be explicitly criterion referenced within a scope and sequence of the material to be mastered.

      I once started teaching at a new school that was intended to be a leading flagship model for educational reform (The Coalition of Essential Schools). I will never forget in the first day or so of meetings prior to the first day of classes, when grading came up and it became clear that reforming traditional grading was a bridge too far to cross within the early goals of the reform. I was very disappointed. Grading is the engine that drives the mechanism cart of education. If it isn’t reformed, you’re most likely just tinkering imo. I think that grading should be truly and clearly criterion referenced, where students can actually use it to glean information about where they are in a scope and sequence, and where they need to go next, and to provide a vocabulary for teachers and students to discuss those issues. Just telling students how well they’re doing relative to other students is pretty meaningless unless you view schooling as a social sorting mechanism. And usually, anyway, it only confirms what people already know (for the most part): students who grade well will get good grades and students that don’t, won’t.

      • I’ll note that’s likely particularly true for students who generally get poorer grades. There’s evidence that such students as a general tendency tend to attribute it to “luck” or an easy test if they grade well, not a function of their actions. They might tend to lack a sense of their own agency as a learner and then develop habits that reinforce that lack of agency. Students who do well on grading might tend to think that learning largely means doing well at doing what you’re told to do. If you’ve run across students who got good grades in high school and maybe undergraduate school who find in graduate school that getting good grades required more skills than just doing what you’re told to do well, that could be the dynamic in play. I have found it to be a tough adjustment for many – largely because there’s no one around to help them to understand the shift.

        • “They might tend to lack a sense of their own agency as a learner”

          Thank you. It’s because they don’t know how to study.

        • > Thank you. It’s because they don’t know how to study.

          Sure, that’s a part of it but I think the underlying problem is broader than simply “not knowing how to study.” Methinks you were reading to confirm a bias?

          Getting them to “know how to study” would be one part of resolving the underlying problem but certainly not sufficient.

          Also, I think that a problem exists on the other side also. Often getting good grades means “knowing how to study” in the sense of thinking it means being good at cramming or being effective at focusing efforts on “what’s on the test.”

          That’s “knowing how to study” when the focus is on getting a good grade, but not so much knowing how to learn; it’s not terribly effective for learning outside of a contrived paradigm such as exists in the dominant model for schools.

          Check out what Shravan linked below.

        • You’re making it really easy to confirm my biases.

          You say:
          “being effective at focusing efforts on ‘what’s on the test.’”

          That’s exactly what they should be doing. The test represents the material in the class. They’re supposed to know it.

          The idea that successful students have some ESP for what’s going to be “on the test” is BS. Successful students *know all the material*. They don’t have to guess. They do that routinely. If you’re communicating otherwise to students you’re the one who’s undermining their confidence and ability. You’re the one telling them it’s impossible to succeed. Most of my courses had 2-3 mid terms and a final. That’s 10-15 lectures for each mid term. Hardly impossible to cover.

          Cramming for tests is an important part of building knowledge. Unfortunately, most students have already screwed themselves before it’s time to cram. When they cram for the test it’s the first time they’ve reviewed the material. They pull out their notes two days before the test and start cramming. Their memory is heavily fogged because the lectures were 3-5 weeks ago and they never reviewed. Their notes are shitty, they missed two lectures, it’s Friday, the test is on Monday, there are no office hours. They’re forced to guess to fill in the copious blanks. The not-surprising result: they do poorly – Bs for the more capable, Ds for the less capable.

          Shravan’s / David McKay’s glaring mistake is the assumption that it’s all “A-quality” input. This is demonstrably wrong and badly so.

        • Chipmunk lives in a world where students that are smarter and work hard earn higher grades on tests that truly test their knowledge (is knowledge the same as ability to learn and find answers?), consumers are well-informed and businesses compete via better and cheaper products to earn their dollars, and where merit and quality are easily measured and it is government, regulation, and anti-market practices that are the real obstacles to this meritocracy. Ayn Rand’s world. Not the one I live in.

          I don’t mean to dismiss those views, as I actually share many of them. But the world is more nuanced and overstating that extreme view as a counter to the equally distorted view that we are all equal and only discrimination accounts for different outcomes, is not a way forward. Both extremes are just that – overstated and not commensurable. Reality is somewhere between and that is where the difficulties lie. Designing a test to truly measure what you think a course should do is difficult. Implementing it in a constructive way is difficult. And having an evaluation system that both recognizes talent, achievement, and provides positive incentives is also difficult. (by the way, I don’t see that cramming serves any of these purposes).

        • Dale’s response hits many of the points I’d make.

          I will add that in my experience looking under the hood from many angles (admittedly anecdotal), grading is far more arbitrary than most people think. There are usually many judgment calls in grading.

          And validity (let alone reliability), IOW whether it’s measuring what it’s supposed to be measuring, is lower than it is often perceived to be.

          But on top of all of that, the question is whether it advances your goals. I think it’s interesting that many people who are highly critical of our educational outcomes are also staunch defenders of a long-ago designed system where grading is the backbone (where the designers explicitly sought to train students to be good at following instructions in a hierarchical work environment). Of course, there could be an accompanying argument that it’s the corruption of grading that’s the underlying problem – but then you’d have to provide measures for outlining that causality, something that’s incredibly difficult to do.

          Again, if your goal is to sort students by a (imo basically arbitrary) set of criteria then I think grading can be fit for purpose. But that’s not my goal as an educator and I don’t think it should be the goal of (at least most) educational institutions – particularly as we move away from hierarchical work and intellectual environments.

          I get that these are tough issues – but part of the problem for me is that everyone thinks that they’re an expert in education, and if you’re going to think you’re an expert then you should be informed as to the relevant evidence and not just make assessments ONLY on your anecdotal experiences and observations.

        • > I will add that in my experience looking under the hood from many angles (admittedly anecdotal), grading is far more arbitrary than most people think. There are usually many judgment calls in grading.

          There may be some arbitrary elements to grading but a couple of (anecdotal!) observations have made me feel reasonably secure that we’re getting it about right. In the UK students’ work is marked anonymously and so you don’t know what the grades look like with respect to individual students until the results are finally de-anonymized. For a number of years I organized a first year Professional (Medics; Vets; Dentists) Course. These have first years comprising a number of different pre-clinical subjects. The correspondence between a student’s marks across the different subjects was encouragingly strong. Likewise when our final year graduating students have their marks de-anonymized they “make sense” (students we expect to do well do so even though we don’t know whose paper is whose when we’re marking them). We can pretty much predict beforehand who the top few students will be. Whatever the grades may mean they do seem to be capturing some aspect of students’ knowledge/abilities fairly reliably.

          > Again, if your goal is to sort students by a (imo basically arbitrary) set of criteria then I think grading can be fit for purpose. But that’s not my goal as an educator and I don’t think it should be the goal of (at least most) educational institutions – particularly as we move away from hierarchical work and intellectual environments.

          Yes, I agree. My goal as an educator is not to sort students but to enthuse them and help them learn. However grading/sorting is part of the overall process. Possibly, and maybe a little oddly, while I increasingly find it quite awful that we put students through the various assessment/exam stresses, students seem to be quite comfortable at least with the idea that they have to be examined and graded.

        • Chipmunk wrote “Cramming for tests is an important part of building knowledge.” Perhaps I am mistaking their intention, but I strongly disagree. Information learned from cramming seems to be forgotten quickly.
          Cramming also seems to only help for low levels of learning (e.g., in Bloom’s Taxonomy). I can ask a bunch of Chemistry seniors “About how many molecules are in one cubic centimeter of air in this room right now?” and relatively few can figure it out. But it’s just first-year chemistry (or U.S. high school chemistry) connecting 2 or 3 widely used concepts.
          1) The ideal Gas Law: PV=nRT
          2) moles (n) = molecules/Avogadro’s number = N/N_A
          3) converting volume (V) in Liters to cubic centimeters
          The question is asking for molecules per volume: N/V
          So PV =(N/N_A)RT and N/V = PN_A/(RT). If we take P=1 atmosphere, T=298 K, we can use R=0.08206 L atm/(mol K) and get N/V in molecules/Liter. Then a units conversion (1 Liter = 1000 cubic centimeters) gives us 2.5E19 molecules per cubic centimeter.
          Most students could do any one part of this fairly easily, but phrasing things as I do makes it tough if they have only crammed knowledge rather than sought understanding.
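
For anyone who wants to check the arithmetic, here is a short sketch in Python of the three steps above. The function name and default arguments are my own choices for illustration; the constants are the standard values R ≈ 0.082057 L atm/(mol K) and N_A = 6.02214076E23 per mole, and air is treated as an ideal gas at 1 atmosphere and 298 K, as in the comment.

```python
# Ideal-gas estimate of molecules per cubic centimeter of room air.
# Assumptions: ideal gas behavior, P = 1 atm, T = 298 K.

R = 0.082057          # gas constant in L atm / (mol K)
N_A = 6.02214076e23   # Avogadro's number, molecules per mole


def molecules_per_cm3(pressure_atm=1.0, temperature_k=298.0):
    """Return N/V in molecules per cubic centimeter from PV = (N/N_A) R T."""
    # Rearranged ideal gas law: N/V = P * N_A / (R * T), in molecules per liter.
    per_liter = pressure_atm * N_A / (R * temperature_k)
    # Units conversion: 1 liter = 1000 cubic centimeters.
    return per_liter / 1000.0


print(f"{molecules_per_cm3():.2e} molecules per cubic centimeter")
```

The result is about 2.5E19, matching the hand calculation. Doubling the pressure doubles the count, which is a quick sanity check on the rearrangement.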

  15. This thread has turned into a general discussion of grading, so my experiences may be of interest (if anyone is still reading this). I taught for 21 years at Evergreen State College, where grades had been banished in favor of written evaluations. At the end of each quarter I would write an approximately full page account of what each student did and what the outcomes were. The best part of it is that I could provide differentiated feedback, for instance praising a student for hard work, teamwork, etc. and also pointing out what skills were learned and what gaps remained. When teaching a subject with expected learning outcomes that would be important to future teachers or employers, I addressed these needs as explicitly as possible.

    Another positive aspect of this system was that I really paid close attention to each student, especially to those in the middle I might have glossed over elsewhere, so that end of quarter evaluation would even be feasible.

    The bad part was how much more work it entailed.

    What is especially germane to this discussion is that there were immense pressures for “evaluation inflation”. Many faculty succumbed, and I was often appalled to find grossly ill-prepared students in my classes whose previous evaluations praised them to the skies and said nothing about their shortcomings. My impression (not backed by anything resembling hard data) is that the teachers most prone to handing out such evaluations were those who put less of themselves into teaching and basically didn’t care so much about the whole enterprise, as long as students would parrot their preferred opinions in the classroom. Since there were no departments at Evergreen or effective accountability substitutes, a substantial portion of the faculty fit that description.

    • Tragically enough, in the hi tech age Rain University’s enrollment is falling through the floor – during the pandemic the decline flattened at about 40% of peak enrollment in 2007

      https://www.evergreen.edu/sites/default/files/2022-10/Headcount_FTE%20%281971-2022%29.pdf

      I don’t agree that it’s about evaluations.

      Grade inflation is as much prof laziness as anything. Teaching a rigorous class is a lot more work, because if you’re going to be a hard-ass grader you have to hold up your end so the students get the necessary support to achieve high expectations. It’s a lot easier just to be sloppy, have low expectations and give generous partial credit so you don’t have to argue with any students about their grade. Just make it a little higher and they won’t bother you.

      • Believe me, I know about Evergreen’s enrollment troubles, or at least I did prior to leaving. I was on the faculty committee that monitored the college’s budget, and we tried to sound the alarm for years, with little response. It wasn’t until the post 2017 meltdown that anyone paid attention. What caused the headcount drop? Lots of things, but that’s a different topic. FWIW, the college experienced a positive bounce last fall, but it’s too soon to say whether it’s a one-off.

        As for the issue of evaluation inflation, I think we’re not far apart. Just giving everyone a hug and sending them through is the path of least resistance, at least in some disciplinary domains. OTOH, it’s possible to be a curmudgeon *and* not give students what they need to succeed. There’s some of that too. Just to give one aspect of the problem, quite a few students choose to attend Evergreen because it has no distribution requirements, which means (in theory) they never have to look at math again. But courses are interdisciplinary, and often a student looking for one thing would end up in a class I’m involved in which entails learning stats, some formal modeling, etc. I wanted to hold them to a high standard, but they may have been drastically unprepared for what lay ahead and had probably been miseducated in quantitative courses in the past, against which they had recoiled. To uphold the standards I felt were appropriate, I had to work my butt off. It didn’t pay off in evaluations, either.

        • While we’re on the subject of grades – the only system that makes sense to me is to have someone independent of the instructor/student create the exams and grade them (I believe this system was/is used in the UK). My role as instructor is to help develop the students – not to create evaluations to be used by employers and graduate schools. It just isn’t that difficult to evaluate a potential employee or graduate student using standardized exams, personal interviews, or creating your own examinations. I think writing evaluations of each student, while time consuming, is a good system (an additional side benefit is that it undermines the huge economies of scale in higher education, which I think are destructive in many ways). If the grading is removed from the instructor, then I think the idea of grading/evaluation inflation disappears.

        • “OTOH, it’s possible to be a curmudgeon *and* not give students what they need to succeed. ”

          Yes. For sure. There are also profs who are capricious. That’s partly responsible for the inflation. Students justifiably complain when their grade appears to be capricious, so the admin busts on some profs for bad grading; then other profs, fearing trouble, lighten up far more than necessary, kind of like everyone slowing down to 50mph in a 60mph zone when there’s a cop on the side of the road.

        • Dale mentioned the UK so I’ll give my thoughts on this from a UK perspective.

          We have experienced significant grade inflation over the last 15 years or so, especially in two tranches – first was the introduction of a 21-point marking scale for subjectively-assessed work (especially essays which form a large part of exams for final year students). Degrees are awarded as classes (first class >70; 2.1 60-69; 2.2 50-59 etc). In the past markers had internalised the idea that >70% for a piece of work should be a tough ask and so there was a sort of notional ceiling around 75% for a really excellent piece of work. The 21 point scale has gradually induced markers to use the full range of marks and so a really excellent piece of work might be given 85 or even 90% now. Clearly this means many more students achieve an average of 70% over all assessments and so numbers of first class degrees have gone from maybe 10-15% to 25-30% of students.
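
A rough sketch in Python of why wider use of the scale pushes averages over the 70 line. The class bands are the ones quoted above; treating exactly 70 as a first is my assumption, and the two sets of essay marks are invented purely for illustration.

```python
# Hypothetical illustration: the essay marks below are invented.

def degree_class(average):
    """Map an average mark to a UK degree class using the bands quoted above.

    Assumption: exactly 70 counts as a first. Bands below 50 are not
    spelled out in the comment, so they are lumped together here.
    """
    if average >= 70:
        return "first"
    if average >= 60:
        return "2.1"
    if average >= 50:
        return "2.2"
    return "below 2.2"


def mean(marks):
    """Plain arithmetic mean of a list of marks."""
    return sum(marks) / len(marks)


# The same four pieces of work marked under two habits (invented numbers).
old_marks = [75, 68, 62, 74]  # excellent work held near a notional ceiling of 75
new_marks = [88, 68, 62, 85]  # excellent work marked across the full range

print(mean(old_marks), degree_class(mean(old_marks)))
print(mean(new_marks), degree_class(mean(new_marks)))
```

With these invented marks the average moves from 69.75 (a 2.1) to 75.75 (a first) even though only the two strongest pieces were re-marked.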

          The second is a covid-related inflation due to a “no-jeopardy” rule that ensured that any covid-related issues allowed, for example, particular affected assessments to be discounted from the overall mark. It’s expected that this additional grade inflation is being reversed.

          Awarding straight A’s or preferential treatment for preferred students doesn’t really happen in our system. All marking is anonymous (apart from research project grading). There is a strong incentive to maintain grade discrimination – one reason is that students need a 2.1 (at least) to do a PhD though with the grade inflation a 2.1 has lost some of its status in that respect and good graduate schools may now prefer a masters level qualification for a PhD. Motivated students that want to do a PhD but haven’t achieved a 2.1 can still get onto a PhD programme by doing a Masters (2.2 required for this tho more competitive Masters courses may require a 2.1) before applying.

          An additional Covid-related aspect of this is that due to Covid school students didn’t sit final exams in their school leaving year in 2020 and instead were given grades based on teachers’ assessment. This resulted in a very large school grade inflation spike. Unis that had made offers based on school grade requirements were forced to honour their placement offers. This resulted in horribly large 1st year student intakes, quite a bit of first year student drop out and a likely short term grade-deflation due to a little less than averagely good cohort moving through the system.

  16. I mostly teach statistics these days, and my rule is that every homework assignment gets 100% if all questions are attempted. I will give a percentage-correct, and other feedback when needed, but just attempting the questions is all I ask for. This removes the stress of grades for those who are interested in the material.

    I did an MSc in Statistics largely online from Sheffield (this was after I became a tenured prof), and the grading there was real strict. E.g., one professor gave me zero points for writing that the mean for the Cauchy did not exist (he said it had to be is not defined—or maybe it was the other way round). They gave points very, very reluctantly. But I did learn a lot from the exercises they gave (I did every single one of them), and in the end the grades only served to give me feedback on my understanding.

    As a teacher, I feel that grades are largely a distraction if not treated as feedback for the student per se; they are used like the h-index in evaluating how “good” a scientist is. I guess they do measure something, but not sure whether it matters (it=whatever it is that they tell us) in the long run.

  17. Thinking idly…

    The issue about grade inflation is really about inter-cohort comparisons. Ideally, I think, you would want some kind of grading on a curve but *only between cohorts*. A class could get all As showing a satisfactory understanding of the material, but be poor relative to last year’s students, or students from another school. This would address more what people care about without the issue of undermining cooperation.

    But could you really compare cohorts objectively? What about a cohort scoring scheme where graders get a certain number of points to hand out, so they can’t gift a class with too many pts or else they would run out?

  18. Grade inflation is something I’ve always thought about. When I was at Berkeley a professor told us that they had to increase the number of students receiving A’s because at Stanford they handed out A’s like candy and as a result it made our students look worse when applying to grad schools. The logical conclusion then seems to be to give everyone A’s to make your students as competitive as possible.

    But you would think if people know a school’s grades are inflated then it would make all the grades meaningless, but I doubt Stanford grads struggle to get into grad school. I guess their justification is that all their students are smart so they should all get A’s.

    Personally I think your transcript should just have your rank in each class.

    • Jordan:

      My experience of Berkeley and Stanford is that people at Berkeley talked about Stanford all the time—they saw Stanford as their big rival—but people at Stanford didn’t really think much about Berkeley at all.
