How many patents by African Americans were there in the “golden age of innovation,” 1870-1940?

Michael Wiebe pointed me to a recent thread and a post from last year regarding a paper by Lisa Cook, an economist who was recently nominated to the Federal Reserve Board.

The paper in question is Violence and economic activity: evidence from African American patents, 1870–1940, published in 2014 in the Journal of Economic Growth, and it was mentioned in this news article:

If confirmed, Cook will likely be an important voice in a debate over the appropriate pace for the Fed to raise interest rates in an environment in which Black workers have yet to reach a full recovery. Her research on how violence against Black Americans from 1870 to 1940 coincided with a decline in patent filings from that community—and a loss to U.S. growth—has been read by Nobel laureates Milton Friedman, Kenneth Arrow and Paul Romer and changed assumptions about the building blocks needed for economic expansion. It’s also prompted criticisms from Republicans who say her work focuses too much on “racial policies,” an assertion that many of her peers dismiss as ridiculous.

In his linked thread, Wiebe finds a bunch of problems with data processing involving the regressions predicting patents given trends on racial violence. For reasons discussed below, I’m not so interested in the regressions, but before getting to the statistical questions here that do interest me, let me share this from Wiebe’s final tweet:

Do I [Wiebe] think this disqualifies Cook as a Fed nominee? No. She has plenty of relevant experience.

I know nothing about the Fed and have no opinion on the qualifications of Cook or anyone else to these positions. I agree with Wiebe that (a) now is a good time to talk about this research, given that it appears to be related to Cook’s nomination (see quote from linked news article above), and (b) we should be able to have this discussion without getting into questions of qualifications for a government position.

To return to the research paper: from my perspective, the most interesting parts are the graphs of raw patent data:

Something went wrong with the labeling of the figures—as Wiebe notes, the numbers on the right axis in Figure 1 are wrong—but the elephant in the room is the big drop of black patents around 1900. There’s a convention in empirical social science research—especially in economics, but not just in economics—that privileges causal estimates. But in this case I think the quest for causal identification is hopeless, and a focus on issues of statistical analysis distracts from the more interesting descriptive questions.

Cook put in a big research effort to identify 726 patents assigned to black inventors between 1870 and 1940; see section 2 of the article and Appendix II of the supplementary materials, which also includes an interesting discussion of alternative approaches that didn’t work. Interesting stuff.

Here’s the relevant citation from the article: “Cook, L. D. (2004). African American inventors data set; updated in 2005, 2007, and 2009.” I was wondering if it had been updated since, so I did some googling and I found this Brookings Institution report from 2020, The Black innovators who elevated the United States: Reassessing the Golden Age of Invention, by Jonathan Rothwell, Andre M. Perry, and Mike Andrews, who write:

The history of Black people’s contributions to the catalog of inventions that marked the Industrial Revolution has been largely muted. This period is considered one of the most innovative eras in world history, seeing the birth of major advances in agriculture, transportation, communications, manufacturing, and electricity that fueled rapid economic growth. With the exception of a few notable inventors who are regularly elevated during Black History Month—e.g., George Washington Carver (peanut products) and Madam C. J. Walker (hair products)—the disregard of many of the era’s Black inventors not only whitewashes the historical record, but biases who we perceive to be innovators in the present. . . .

We build on the research of economist Lisa Cook, who is the only scholar we know of who has systematically analyzed how Jim Crow laws suppressed invention among Black people. We extend her work by using a more comprehensive measure of inventors, one that links patent records to newly released digital data from the U.S. Census Bureau for relevant years during the 1870 to 1940 period. . . .

Rothwell et al. write:

We use a novel database created by Sarada, Michael Andrews, and Nicolas Ziebarth that matches inventors listed on patent records in decennial years from 1870 to 1940 to complete census records, which include demographic information for the named inventors.

And here’s what they found:

Hey, this is a lot different! The drop after 1900 seems less severe, but it’s hard to say more because these graphs only show 10-year averages. My guess is that the big dropoff in Cook’s data in 1900 is a combination of an actual decline and some data issue involving the census. Recall that the dataset was created using the U.S. Census, and there’s a new census every 10 years, so it seems plausible that a sharp drop happening right at a 10-year boundary could be some sort of data coding issue. I don’t really know, though.

The big change, though, is in the absolute numbers:

These are much larger than the 2 per million from the earlier paper. By comparison, here are the data for whites:

Rothwell et al. summarize:

To understand the rate of patenting by northern Black Americans during this period, we compare it to the average rate of patenting throughout U.S. history. . . . First, the Golden Age truly was remarkable for its rate of innovation. No other time in the 19th or 20th century saw rates of patenting matched by the period from 1870 to 1940. . . . Second, Black patenting by northern residents during this period should be considered extremely high relative to the national rate at any time in U.S. history. . . . Obtaining a patent was more difficult for Black people, because it often involved working with a white lawyer who may be tempted to engage in unfair dealings. These obstacles, no doubt, suppressed the wealth, fame, and influence of Black inventors—and yet, many succeeded in making important contributions to American technological and economic development. What is striking is that even while lacking complete liberty, Black people in the North acquired and practiced cutting-edge creativity, science, and technical skills at very high rates for a substantial period of U.S. history.

They continue:

Several important institutions changed in the North that help explain why opportunities for Black advancement seem to have stalled and even reversed after the Golden Age of Invention. The 1920s saw the birth of zoning laws and other government-backed institutions that closed off real estate markets to Black people, leading to rapid increases in racial segregation which did not reach their peak until the 1970s. . . . governments around the country fostered segregation and corralled Black people into areas that were targeted for disinvestment in important public resources, including education. Meanwhile, powerful professional associations—including the American Bar Association and American Medical Association—gained prominence in the early 20th century and used their emerging power, in part, to officially discriminate against Black people for decades. . . . Throughout northern states, the Golden Age of Invention in America provided a tantalizing glimpse into what Black people could accomplish if given robust opportunities to learn and practice in highly skilled fields. . . .

Assuming this report and the underlying Sarada et al. (2019) data can be trusted, I’d say that any discussion of this topic should start with this new data set, and we should value Cook’s dataset as an important step along the way. If there is interest in the particular research questions regarding effects of racial violence on innovation, it would make sense to work with these new counts, or else to address whatever problems are in these new data; that would make more sense than trying to replicate the 2014 analysis.

Getting back to the statistics, this is an interesting example of the challenges of measurement, and of the importance of details of measurement when making substantive conclusions. In statistics and social science we so often start with “the data” without recognizing the work of scholars such as Cook and Sarada et al. to put together these tallies.

P.S. There was some question in comments about how the larger dataset was constructed. Some details are in section 2 of Sarada et al. (2019) and in the online appendix to that article. There’s also a link to a dataset at the ICPSR. From there, you can find a bunch of data files in Stata format along with some instructions on what to do. I’m not completely sure, but it seems that they have the first name, last name, patent ID, and estimated demographics for something like 319,254 patents. This seems to be only about 1/6 of the total number of patents given to Americans during this period, but maybe I made some miscalculation or I’m missing something; I’m not sure because Sarada et al. seem to be saying that their dataset includes about 75% of all eligible patents. Anyway, that’s a start. I’m not really so interested in patents, so this is likely to be the last I look at these data.
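To see how large that mismatch is, here is a quick check in Python using only the rough numbers in the P.S. The "about 1/6" share is the post's own guess, and the implied total is back-calculated from it rather than taken from any patent count, so treat this as a sketch of the discrepancy and not a measurement.

```python
matched = 319_254        # patents counted in the ICPSR files, per the P.S.
share_seen = 1 / 6       # assumption: the post's rough "about 1/6"
share_claimed = 0.75     # coverage the post attributes to Sarada et al.

# Total number of patents implied by the 1/6 guess (about 1.9 million).
implied_total = matched / share_seen

# How many matched patents 75% coverage of that total would mean.
expected_matched = share_claimed * implied_total

print(round(implied_total), round(expected_matched))
print(share_claimed / share_seen)  # ratio between the two coverage figures
```

The two coverage figures differ by a factor of about 4.5, which is too large to be a rounding issue. Either the 1/6 guess or the reading of the 75% claim (for example, what counts as an "eligible" patent) would have to be off.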

27 thoughts on “How many patents by African Americans were there in the “golden age of innovation,” 1870-1940?”

    • Michael:

      Interesting. So in this case it’s all about the denominator. If so much of the inference comes from extrapolation, my guess is that this would be a fertile area for further statistical work.

    • Can anybody make sense of the methodology behind this paragraph in the Brookings Institution report:

      “With 50,000 total patents, Black people accounted for more inventions during this period than immigrants from every country except England and Germany. In our database, 87% of inventions were traced to people born in the United States, and 2.7% of the U.S. total were invented by Black Americans, which is a larger share than nearly every immigrant group. After accounting for patents during nondecennial years, we estimate that Black people accounted for just under 50,000 total patents during this period.”

      https://www.brookings.edu/research/the-black-innovators-who-elevated-the-united-states-reassessing-the-golden-age-of-invention/

      In 1913, US Patent Office official Henry E. Baker stated that after massive Patent Office surveys of 9000+ patent attorneys and the like in 1900 and 1913, he had assembled a database of nearly 800 verified patents earned by black inventors, with unverified leads on possibly another 400. He also estimated that his database included only half of all patents by black inventors because many wouldn’t want to have attention drawn to their race due to commercial discrimination (e.g., after Garrett Morgan became celebrated as a black for inventing a fire-fighting gas mask, sales in the South dropped).

      So, if all the unverified leads checked out and that was half of all the black patents, as Baker estimated, that would be 2400 through 1913. Add in 1914-1940 and my guesstimate would be a nice round 5,000 patents for 1870-1940. That’s an impressive number.

      How does Brookings come up with an order of magnitude more than my Henry Baker-based guesstimate?
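The back-of-the-envelope arithmetic in this comment can be written out explicitly. This is a minimal Python sketch using the rounded figures quoted above; the variable names are illustrative, and the 5,000 figure is the commenter's round guess, not something derived here.

```python
# Baker's 1913 figures as described in the comment (rounded).
verified = 800      # "nearly 800" verified patents
unverified = 400    # "possibly another 400" unverified leads
coverage = 0.5      # Baker's guess that his list held about half of all patents

# Upper-end estimate through 1913 if every lead checked out.
through_1913 = (verified + unverified) / coverage

guesstimate_1940 = 5_000   # the commenter's round number for 1870-1940
brookings = 50_000         # "just under 50,000" in the Brookings report

print(through_1913)                  # 2400.0
print(brookings / guesstimate_1940)  # 10.0
```

The factor of 10 is the "order of magnitude" gap the comment asks about.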

  1. There’s a convention in empirical social science research—especially in economics, but not just in economics—that privileges causal estimates. But in this case I think the quest for causal identification is hopeless, and a focus on issues of statistical analysis distracts from the more interesting descriptive questions.

    Yes, I think that while the drop is clearly not year to year noise, attribution to a particular cause is impossible. A number of important political and cultural developments happen in any particular decade. The existence of the drop is interesting enough by itself — it, along with other stuff, at the very least, challenges the liberal-centrist intuition of automatic, monotonic progress over time. Sometimes, things move backwards.

    I haven’t read Cook’s original paper, so it makes sense that I don’t understand, but nonetheless I don’t understand the argument based on that graph. It looks like lynchings drop in the period. Doesn’t the graph imply, with naive causal assumptions, that lynchings may actually increase patents while riots decrease them?

    • Somebody:

      The paper includes a few different models, each including multiple predictors, so you can’t really try to pick out the estimation from this graph that only shows one predictor. In his thread, Wiebe gets into those issues. But I think if someone wants to look further into this, it would make sense to first figure out what was going on before thinking of how to explain it.

  2. From what I’ve heard (not being an expert), it’s a little simpler than all this. There was a survey of African-American inventors leading up to the 1900 Worlds Fair. So they made special effort at the time to identify black inventors. Funding for the study ended after that.

    • Right, the huge drop in patents from 1899 to 1900 is due to the US Patent Office sending out a mass mailing to 9,000 patent attorneys and other experts on January 26, 1900 asking if they knew of any blacks who had earned patents in order to celebrate black inventors in the US pavilion at the 1900 World’s Fair in Paris.

      The Patent Office official who conducted the survey, Henry E. Baker, in 1902 published a detailed list of over 400 patents earned by African-Americans.

      Baker did another mass mailing in 1913 but, as far as I can tell, only published highlights from it rather than a detailed list. He said he was now up to about 800 patents, which would be almost 400 over the first survey. If most of those were new patents post-1900, rather than old pre-1900 patents missed by the first survey, that would suggest the early 1900s were similar in black inventiveness to the late 1800s.

      Dr. Cook says 65% of her 726 patents, or about 472 are from Baker’s research. That might imply she’s using Baker’s complete 1900 list along with his published highlights from 1913, but that she’s missing a few hundred more obscure patents Baker came up with in 1913 that are not readily available and may be lost.

      https://www.unz.com/isteve/did-bidens-fed-nominee-lisa-cook-mess-up-her-most-famous-paper/

    • That’s embarrassing if true. Naomi Wolf-tier “not understanding your data”. https://twitter.com/AnechoicMedia_/status/1489847148862742531 points out that the quotes are right there in the interviews. She knows she was using a big special survey which ended in 1900… and somehow neither she nor her interviewers ever stopped to ask why she finds a big drop after the survey ended?

      If this study previously could be taken as bolstering the case for her nomination, surely such an error seriously undermines it. We know from long experience here that when it comes to garbage studies, where there is a little smoke, there is often a dumpster fire.

      • In Lisa Cook’s defense, Henry E. Baker conducted not just a 1900 mail survey but also a 1913 mail survey. He apparently found more than 350 additional patents in 1913 not in his 1900 database (many, I presume, since 1900, and some older but overlooked in the first survey).

        Unfortunately, while he published in 1902 a detailed table specifying by patent number each of the 400+ patents he found in 1900, after his 1913 survey he only published verbal descriptions in a pamphlet and academic article of the most important patents he’d found since 1900. He promised in 1913 to publish his new list of nearly 800 verified patents awarded to black inventors in a full-scale book, but it looks like his planned book wasn’t ever published.

        Lisa Cook’s database of 726 patents draws 65% from Baker’s work, or about 472 patents. That sounds like Baker’s complete list from the 1900 survey and his published highlights from his 1913 survey, but it mostly leaves out a few hundred of Baker’s unpublished 1913 findings (except the ones Cook managed to rediscover via her other techniques).

        So, if we can trust Baker’s summary of his 1913 work (and he strikes me as a trustworthy man), it would appear that 1900-1912 saw a fairly comparable number of patents awarded to black inventors as in 1887-1899, but the details are not accessible. (Baker’s 100+ year old files might be in storage somewhere, so there is hope.)

        By the way, I would be surprised if the publication of Baker’s complete 1900 list but not his complete 1913 list reflects anti-black political tides. The 1900 survey was part of a $15,000 appropriation by Congress to put on an impressive exhibit at the Paris World’s Fair lauding black progress. 1901 was perhaps the high point of post-Reconstruction Republican pro-black sentiment. Booker T. Washington’s “Up from Slavery” was a bestseller and Teddy Roosevelt famously had Washington over for dinner in the White House. Washington emerged as a GOP power broker in turning out the Northern black vote in return for federal jobs.

        In 1913, the Democrats, under the Virginia-born Woodrow Wilson, took over Washington DC and imposed much more segregation. So perhaps Baker’s failure to get his promised book published had to do with the harsher political climate. To guess, maybe the new Democratic administration wouldn’t give him time off from his regular duties at the Patent Office to work on it?

        That’s all speculation on my part.

        Anyway, Cook should have made clear in her paper that the large bulk of Baker’s database that she primarily drew upon came from Baker’s fully published 1900 survey, while Baker only published the most noteworthy patents discovered in his 1913 survey, and that that’s the single most likely reason her database shows a massive and permanent drop from 1899 to 1900 in patents earned by blacks.

  3. One of the graphs has a label

    “U.S. Northern Black parent rate”

    Is that a misprint and “parent” should be “patent”?

    Another figure has labels “Black lynchings” and “Black patents” and is weird in the sense it looks like Blacks do the lynchings and Blacks do create patents when, of course, the meaning is lynching of Blacks and patents created by Blacks.

  4. >I think the quest for causal identification is hopeless

    I wonder if this is too dismissive. Cook is certainly aiming to answer a causal question (and even spends two pages addressing reverse causality in Section 4.3), so I think we should evaluate her paper as intended.

    • Michael:

      I know that this is how things go in social science, that causal identification is highly valued. I assume there’s no way this paper would’ve been published in this journal had it just included descriptive data plots and some discussion. Nonetheless, it’s my view that just about all the value in this sort of paper comes from description and discussion. Which is fine—I like description and discussion. That’s what Rothwell et al. did, and I think that worked well (setting aside concerns with the extrapolations involved in the dataset they were using).

  5. Thanks.

    Lisa Cook did a cautious, conservative job of only coming up with 726 specific patents that were definitely issued to blacks from 1870-1940. That has to be considered the absolute minimum number, while the real number of patents earned by African-Americans was likely considerably higher.

    I believe Lisa Cook considered and rejected in her paper the expansive approach of the Brookings researchers that came up with “nearly 50,000” patents issued to blacks. She stuck to much more cautious methodologies, like focusing on patent holders with the surname “Washington” or the Christian names “George Washington” and then investigating their biographies as best she could. Another method was to take all the names in Henry Baker’s 1902 and 1913 lists and see if these inventors earned any more patents after Baker retired. Her various conservative methods each tended to add a rather small number of patents. It was slow work.

    But then she confused her scholarly cautiousness with what had really happened and announced that the sudden drop in verified patents in her database from 1899 to 1900 was real rather than an artifact of the big US Patent Office survey of January 1900.

    I’m guessing the big problem for her was that she didn’t really discover anything all that noteworthy compared to what Henry Baker had done over a century before. By immense labors, she added 54% more patents to the approximately 472 patents Baker published. But what new, more general knowledge had her labors, which were considerable, produced?

    In contrast, I probably would have put forward a guesstimate of something like several thousand patents earned by blacks based on:

    – we have Baker’s detailed list of about 400 (I haven’t gotten around to counting the exact number) published in 1902 and

    – And we have his verbal report in 1913 that he was now up to “nearly 800” after another big mail survey in 1913. So he was likely adding 25-30 per year, although some added in 1913 might have been overlooked ones from before 1900.

    – And at that rate by 1940, the list would likely be up to close to 1,500

    – And Baker guesstimated in 1913 that he had found only half.

    So, 1500 times two is 3,000 patents earned by black Americans over 71 years. That may well be an overstatement, but it’s a defensible SWAG. That African-Americans in the 75 difficult years after Emancipation earned up to 3,000 patents is something to be proud of.
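The extrapolation in the list above is simple enough to check directly. This is a rough Python sketch under the comment's own assumptions: a constant rate of new patents after 1913 and Baker's guess that his list covered half of all patents. The function name `projected` is made up for illustration.

```python
list_1900 = 400   # Baker's detailed list published in 1902 ("about 400")
list_1913 = 800   # "nearly 800" after the 1913 survey

# Average additions per year between the two surveys.
rate = (list_1913 - list_1900) / (1913 - 1900)

def projected(rate_per_year, end_year=1940):
    """Projected size of Baker's list if additions continued at a constant rate."""
    return list_1913 + rate_per_year * (end_year - 1913)

# Try the comment's 25-30 per year range plus the raw survey-to-survey rate.
for r in (25, 30, rate):
    total = projected(r)
    # Doubling applies Baker's guess that he had found only half.
    print(round(r, 1), round(total), round(2 * total))
```

At 25 per year the list reaches 1,475 by 1940, which matches the comment's "close to 1,500"; faster rates push the doubled total somewhat above 3,000.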

    What I can’t wrap my head around justifying is the Brookings estimate of “nearly 50,000.”

    From reading their methodology, I _think_ the Brookings guys matched names on patents with Census records, which include addresses and race. The patents include the inventor’s name, the town in which he’s filing for the patent, but not his race or his street address.

    So, if a patent was awarded in 1900 to Zebediah Q. Grimsby of Smallville, Maine and the 1900 Census says there was one Zebediah Q. Grimsby living in Smallville, well, you’ve found your man and you can rely on the Census’s demographic data for him.

    But what if the patent was awarded in 1900 to J. Smith in Philadelphia? I think what they are doing then is looking in the 1900 Census and counting all the possible J. Smiths in Philadelphia and then allocating fractions of the patent by race. If there were 850 white J. Smiths and 150 black J. Smiths in Philadelphia in 1900, then whites get credited with 0.85 patents and blacks with 0.15 patents.

    But you can see the problem: you are assuming that black J. Smiths are as likely to earn patents as white J. Smiths and then using your assumption to “discover” that blacks earn a surprising fraction of patents per capita relative to whites.

    But the Brookings guys don’t provide many illuminating examples, so it’s hard to tell what they are doing.
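To make the J. Smith example concrete, here is a minimal Python sketch of the fractional-allocation rule as this comment guesses it. It is not confirmed to be what Rothwell et al. or Sarada et al. actually do; the function name and the candidate list are hypothetical.

```python
from collections import Counter

def allocate_patent(candidates):
    """Split one patent across races in proportion to the number of
    census candidates who share the inventor's name and town."""
    counts = Counter(candidates)
    total = sum(counts.values())
    return {race: n / total for race, n in counts.items()}

# Hypothetical census matches for "J. Smith, Philadelphia, 1900".
j_smiths = ["white"] * 850 + ["black"] * 150
shares = allocate_patent(j_smiths)
print(shares)  # {'white': 0.85, 'black': 0.15}
```

Under this rule the credited shares depend only on the racial mix of people sharing the name, not on who actually filed the patent. That is the circularity the comment is pointing at.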

    • Steve:

      The Brookings report is by Rothwell et al. Based on a quick read of that report, it was my impression that they’re using the data from Sarada et al. (2019). So that’s where you want to go to learn more.

  6. Wasn’t the claim that white acts of violence against blacks caused a huge drop in the rate at which blacks were awarded patents extremely implausible from the get-go? What exactly was supposed to be the mechanism for this?

    I’ll bet that, even in those days and even in the South, violent attacks on blacks came mostly from other blacks. But supposedly it was the violence that came from whites and not the violence that came from other blacks that had this huge effect?

    • Calvin:

      I guess neither of us is an expert on this era in American history, but, just to speak generally, the mechanism does not seem so implausible to me, that, during periods in which violence against black people was essentially legal, it was harder for black people to set up businesses etc. Lynching, in addition to its direct effects, was an enforcement mechanism for white supremacy.

    • Yep, no significant acts of political violence by White Americans against Black Americans that would have impacted the ability to produce inventions occurred in that time frame. Of course, if “black on black crime” was a major problem during reconstruction, as you claim is something you’d willingly bet in favor of, it should be easy to find some supporting figures.

      (Of course, regardless of absolute numbers, there are significant reasons to believe White lynch mobs do have a disproportionate impact, as they often targeted successful Black communities. A “general” act of violence between two individuals is unlikely to significantly impact the decisions made by third parties. At the same time, a targeted act of violence aimed specifically at a group of people because of group-specific characteristics might certainly impact members of that group’s decisions.)

    • The character of inter-ethnic and intra-ethnic crime is and was different, especially during Reconstruction/shortly post-Reconstruction America. The salient difference here is that intra-ethnic crime is typically either property crime or gang conflict. It’s typically within a poor community or between mutually hostile parties. Lynchings and race riots were specifically targeted to inhibit or reverse black success, and were often committed against successful or middle class black people. While, through economic and political success, one could largely escape the former sort of crime (to the extent that free movement is allowed), the same success made one a target for the latter. Hence a hypothesized depressive effect. Some common examples are:

      1. The Tulsa Race Riots were a thinly veiled, successful attempt to destroy a successful black commercial district. There were a few other similar uprisings around the country roughly contemporaneous with the period under discussion.
      2. The Wilmington insurrection of 1898 was a coup by white supremacists, deposing an elected mixed government. Interestingly, this insurrection was essentially successful in every way: the replacement government is essentially continuous with the current government.
      3. Probably the clearest quantitative evidence is the rapid rise, then rapid fall in the number of black elected officials after Reconstruction.

      On the point of black elected officials, the causal mechanism and the existence of a drop in that arena is essentially inarguable. The research in question, I guess, just applies the same logic to a different measure of success. I, too, am predisposed to be suspicious–it makes sense to me that white racists would notice black officials being elected and target them and their supporters, but the causal story with patents is much less conspicuous.

      • “reasons to believe White lynch mobs do have a disproportionate impact, as they often targeted successful Black communities. A “general” act of violence between two individuals is unlikely to significantly impact the decisions made by third parties.”

        In what sense did lynch mobs target “successful black communities”? There is no research on characteristics of black communities where lynchings took place. Mobs hardly targeted black communities deliberately or directly in any meaningful sense. They targeted black individuals accused of crimes or other transgressions against whites. And many lynchings were violence between individuals rather than collective violence, in the sense that mobs were composed of a handful of people and the lynchings took place at night and/or in non-public places (this book deals with such matters in great detail: Mattias Smangs. Doing Violence, Making Race: Lynching and White Racial Group Formation in the U.S. South, 1882-1930). According to your reasoning, that would suggest that such lynchings did not much impact decisions of third parties. Cook’s article does not take such differences in lynchings into account.

        “Lynchings and race riots were specifically targeted to inhibit or reverse black success, and were often committed against successful or middle class black people.”
        In what sense were lynchings specifically targeted to inhibit or reverse black success? There is no systematic research to that effect. And the systematic research on individual characteristics of lynch victims shows that they were not “successful or middle class black people” but in contrast poor with weak attachments to the localities where they lived. This book deals with that matter in detail: Amy Bailey and Stewart Tolnay. Lynched: The Victims of Southern Mob Violence.

  7. Yeah, but were blacks so much more oppressed, in ways that would affect the awarding of patents, in 1900 than in 1880 or 1890? I would think that there would have to be some cataclysmic changes to explain a sudden drop by a factor of about 4 in the number of patents awarded to blacks. Also, it looks like many of the blacks who got patents were living in the North, so whatever was going on in the South should have not affected them so much.

    I’m not aware of any such cataclysmic changes (relevant to the rate at which blacks got patents) during those years, and the other graphs (Ruggles, etc.) suggest that there weren’t any.

  8. After 1875, blacks tended to suffer political/legal setbacks, especially in the South, up at least through the election of the anti-black Woodrow Wilson in 1912, who brought segregation to Washington DC, where a large fraction of elite blacks such as Henry E. Baker lived. But it’s hard to pick out from this general historical trend any sudden event that would trigger the huge decline in patents between 1899 and 1900 reported by Cook. That is obviously an artifact of her having to rely so heavily on Baker’s 1902 chapter listing the majority of the patents in her list of 726.

    On the other hand, despite the political headwinds, blacks were slowly building their hard-earned social capital. For example, black literacy increased from 44% to 57% from the 1890 to 1900 Census. The two leading men of the race, Booker T. Washington and W.E.B. Du Bois, each published his magnum opus book in 1901-1903. BT Washington dined at the White House with TR in 1901. (BT Washington became the Republican political machine boss for blacks looking for federal jobs.)

    This Congressionally sponsored exhibition at the 1900 World’s Fair on African-American progress that led to the Patent Office’s survey of black inventors reflects the vaguely pro-black sentiments of Republican leadership at the time. The Republicans tended to endorse BT Washington’s “Up from Slavery” ideology that blacks would progress most by learning basic skills, which kept GOP politicos from having to do much, rather than Du Bois’s emphasis on top down civil rights lawsuits liberating the talented tenth elite blacks, which took about a half-century to pay off but then had a huge impact after WWII.

    But the Republican-dominated federal government was less influential than often Democrat-dominated state governments. So, while the GOP would occasionally do something symbolically nice for blacks in DC, the Democrats were hardening Jim Crow in the states.

    So, considering both the inauspicious political trend and the auspicious human capital trend, my prior would be that the rate of black patenting would be about the same in the early 20th Century as in the late 19th Century: racist politics would be slowing black achievement, but blacks would still be adding to their capabilities over time.

    My interpretation of Baker’s writings is that, if we can trust his 1913 pamphlet, that is roughly what happened.

  9. 1. You don’t need to be interested in patents to be interested in methodology. There are several interesting issues here (not just in her paper).

    2. And patents really are sort of interesting, just as, say, citations or academic papers are interesting (to count, trend, analyze).

    3. “My guess is that the big dropoff in Cook’s data in 1900 are a combination of an actual decline and some data issue involving the census. ”

    Not really. If you read Cook’s Appendix 2, she is explicit about the census approach failing and her abandoning it. Instead she did a bottom-up approach (“extending Baker”, her words). She took his survey and then added what she could find from books of famous black inventors, newspapers, etc. This is what Sarada et al call the “biography” approach. The problem with that approach, as anyone who has ever participated in black history will know, is that the most notable individuals are repeatedly described, not random obscure individuals with single patents of no economic importance and the like. Given this, it is really not surprising that we see the massive drop after the Baker survey. I mean, at least Baker did a survey of most active patent agents/attorneys. Cook probably did a LexisNexis search and a library search and the like. But that’s not the same thing as mailing a letter to every patent agent you can find, with a return-addressed, stamped envelope and a simple form to fill out, and with the power of the govt behind it.

    4. If you look at Sarada (2019), the key figure is Figure 2 and the discussion related to it. Figure 1 says they could get about 70-90% OCR scanning (just of the key fields in the records). Then when they tried matching names using the census, they got 5-10% matches (less if you are more restrictive using what they call “perfect” or “unique” matches). I guess the harshest filter would be to look for matches that were both “perfect” and “unique”, which would have a very low false positive rate but would cut you down to less than 5% matching. Still, they took this matching approach (all patents, census) and used it to estimate patenting by race. This is a top-down approach, but is essentially a ~5% sampling of the overall data (both white and black). There are possible issues with this approach also. Very strict unique filters could end up biased against blacks (say if their names are less diverse, or the reverse I guess). Looser matching criteria (relaxing the unique) are apt to mix in some whites with blacks in cities with mixed race demographics (John Smith in Cleveland or the like).
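
    To make the strict-versus-loose tradeoff concrete, here is a minimal sketch in Python. It is not Sarada et al.'s actual procedure: the function `match_patentees`, the rule that a match means same name and same city, and all the names in the toy data are made up for illustration.

```python
from collections import defaultdict

def match_patentees(patentees, census, require_unique=True):
    """Assign a census race to each patentee matched on (name, city).

    patentees: list of (name, city) pairs taken from patent records.
    census:    list of (name, city, race) rows.
    With require_unique=True, a patentee with more than one census
    candidate is dropped rather than guessed at.
    """
    index = defaultdict(list)
    for name, city, race in census:
        index[(name.lower(), city.lower())].append(race)

    matched = {}
    for name, city in patentees:
        candidates = index.get((name.lower(), city.lower()), [])
        if not candidates:
            continue  # no census record at all
        if require_unique and len(candidates) > 1:
            continue  # ambiguous: strict filter drops it
        # Loose mode just takes the first candidate, which may be the wrong person.
        matched[(name, city)] = candidates[0]
    return matched

# Toy data with invented names; NOT real records.
census = [
    ("John Smith", "Cleveland", "white"),
    ("John Smith", "Cleveland", "black"),
    ("Elias Turner", "Dayton", "black"),
]
patentees = [
    ("John Smith", "Cleveland"),
    ("Elias Turner", "Dayton"),
    ("Amos Reed", "Toledo"),
]

strict = match_patentees(patentees, census, require_unique=True)
loose = match_patentees(patentees, census, require_unique=False)
print(len(strict), "strict matches;", len(loose), "loose matches")
```

    In this toy case the strict filter keeps only the unambiguous name, while the loose filter also keeps the Cleveland John Smith and has to guess his race, which is the mixing problem described above.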

    —————

    But anyhow, either approach can have its issues. I’m wary of the possibility that Sarada et al mixed some whites into their black numbers; they might be overcounting by mixing. And Cook might be undercounting, by missing things from the bottom-up approach. But the big issue with Cook is not the overall number, it’s the 1900 plunge. I mean, who cares about the nuances of how to create a racial violence index and regress on it if the time series of the patents is dramatically wrong to start with, that is, if the post-1900 numbers aren’t reliable relative to the pre-1900 numbers?

    I’m extremely skeptical of Cook’s 1900 patent plunge. She took a strong primary survey (Baker had temporal proximity, and the power of the US govt and names of most patent agents/attorneys to ask) and combined it with cobbled-together anecdotes, after Baker:

    The final strategy to extend the Baker data set was to construct a broad-based data set of African
    American inventors, i.e., potential patentees, and to match the resulting data to patent data. Among the
    historical and contemporary sources used to create a pool of potential patentees were searches of 44
    historical newspapers, including obituaries, e.g., from the Ohio Historical Society Newspaper online
    database and newspaperarchive.com; correspondence from Carter G. Woodson, Henry E. Baker, and
    patent survey participants (Library of Congress); the Garrett Morgan Papers; historical and
    contemporary directories of African American medical doctors, scientists, and engineers, e.g., ;
    academic journals, including the Journal of Economic History and the Journal of Negro History; historical and
    contemporary biographies of African American inventors and general biographies, e.g., Great Negroes
    Past and Present; and programs of exhibitors in the African American sections or exhibitions of historical
    fairs, including the “Exhibit of American Negroes” at the 1900 Paris World’s Fair, the 1904 “Great
    Negro Fair” in Raleigh, North Carolina, and the 1933 Chicago World’s Fair “Negro Day”. Newspaper
    and obituary searches and programs of exhibitions allowed the identification of lesser known inventors.
    A complete list of sources appears in a companion paper. Not all inventors and others in the pool of
    potential patentees were matched to patent records and were dropped from the data set. Others were
    dropped if there was not a unique first- and last-name match, e.g., James Young in the patent data.
    Ultimately, while second best, this process provides a more systematic and less ad hoc means of
    recovering black patentees to extend the data set.

    Given this combination of a strong primary survey with names from books and the like (often repeating the same famous ones), it’s not surprising that we see a big change in frequency.

    One thing to look at might be the number of patents per black patentee in the Baker set (I guess including patents by the same named individual after 1900), versus the “not Baker sourced” individuals. If you get a lot of “long tail” (lower average) within the Baker set, then it would imply Baker did a better job of finding non-notable individuals with his survey than Cook did with her biography method.
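
    A rough sketch of that comparison in Python, assuming you had one row per patent tagged with the inventor and with how that inventor was first identified. The function name `patents_per_inventor`, the source labels, and the toy rows are all hypothetical; none of this is Cook's or Baker's actual data.

```python
from collections import Counter, defaultdict
from statistics import mean

def patents_per_inventor(records):
    """Summarize patents per inventor, grouped by identification source.

    records: iterable of (inventor, source) pairs, one pair per patent.
    The source is assumed to be fixed per inventor (e.g. "baker" if the
    person appears in the Baker survey, "other" if found elsewhere).
    """
    counts = defaultdict(Counter)
    for inventor, source in records:
        counts[source][inventor] += 1

    summary = {}
    for source, per_inventor in counts.items():
        n = list(per_inventor.values())
        summary[source] = {
            "inventors": len(n),
            "patents": sum(n),
            "mean_patents": mean(n),
            # Share of inventors with exactly one patent: the "long tail".
            "share_single_patent": sum(1 for k in n if k == 1) / len(n),
        }
    return summary

# Toy rows with placeholder inventor labels; NOT real data.
toy = [
    ("A", "baker"), ("A", "baker"), ("B", "baker"),
    ("C", "baker"), ("D", "baker"),
    ("E", "other"), ("E", "other"), ("E", "other"), ("E", "other"),
    ("F", "other"), ("F", "other"),
]
summary = patents_per_inventor(toy)
for source, stats in summary.items():
    print(source, stats)
```

    A lower mean and a higher single-patent share in the Baker group than in the other group would be the pattern suggested above. It would be a hint, not proof, since the two groups also cover different periods.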

    • P.s. I did look for the companion paper (not cited in the free pdf of her 2014 JEG paper). There is a 2011 paper that might be the one, but I couldn’t find it free and have no academic access.

      “Inventing Social Capital: Evidence from African American Inventors, 1843-1930,” Explorations in Economic History,
      Volume 48, Issue 4 pp. 507-518, December 2011.

      The 2011 paper is on the same general topic, 1870-1940 patents, but it wasn’t clear to me from the abstract whether it is a “details of how I crunched the data” paper; it didn’t seem to be. Still, if someone with access can look at it, great.
