A few days ago we discussed a post by historian Sean Manning who, in the context of a review of a book by economist Brad DeLong, wrote:
I [Manning] don’t see much value in estimating the population of the world in 6000 BCE when we can’t agree on the population of the Americas in 1492 within a factor of 20, and took decades to agree on the population of the Roman empire under Augustus within a factor of two.
I pointed this out to someone by email, who disagreed with the claim of a factor-of-20 uncertainty:
The lowest estimate I am aware of was A.L. Kroeber’s 8 million estimate in 1938, based on backward extrapolation from 1800s populations without any recognition of the effects of plagues on a greenfield population. Dobyns, reacting against Kroeber in 1966, pulled a 20:1 “depopulation ratio” from the first century of contact out of thin air. People who think hard about this guess at around 40 million, with some anchoring out of an unwillingness to be a weirdo, but not much.
If the best estimate is 40 million, then a factor-of-20 uncertainty might correspond to a range of 40*c(1/sqrt(20),sqrt(20)), that is, 9 million to 180 million.
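Here is a minimal sketch of that arithmetic, assuming "within a factor of X" means a range whose upper bound is X times its lower bound and whose geometric mean is the best estimate. The function name `factor_range` is made up for illustration.

```python
import math


def factor_range(center, factor):
    """Return (low, high) with high / low == factor and sqrt(low * high) == center."""
    half = math.sqrt(factor)  # split the factor evenly on the log scale
    return center / half, center * half


# Best guess of 40 million with a factor-of-20 uncertainty (units: millions)
low, high = factor_range(40.0, 20.0)
print(f"{low:.1f} to {high:.1f} million")
```

Centering on the geometric mean rather than the arithmetic mean is a choice, not a requirement; it treats overestimating by a factor of 2 and underestimating by a factor of 2 as equally bad.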
This motivated Manning to look back at the literature:
It has been a while since I read Henige’s Numbers from Nowhere and if the 20:1 figure is out of date I will correct it. Some recent research that I have to hand which shows the magnitude of the problem:
David Henige, “Recent Work and Prospects in American Indian Contact Population,” History Compass 6/1 (2008) pp. 183–206, doi: 10.1111/j.1478-0542.2007.00490.x
In a recent canvass of hemispheric depopulation the Italian historical demographer Massimo Livi-Bacci is cautious in his numbers. For example, he thinks that Hispaniola’s contact population was ‘several hundred thousands’ rather than the 2 million to 8 million estimates proffered by various High Counters.
John C. Caldwell and Thomas Schindlmayr, “Historical Population Estimates: Unraveling the Consensus,” Population and Development Review, Vol. 28, No. 2 (June 2002), pp. 183-204 especially page 201 https://www.jstor.org/stable/3092809
It might be possible to work out homeostatic maximum carrying capacity but this has proved impossible even in pre-contact Australia, an island with a stable hunting and gathering system. Had it been possible to estimate Aboriginal population at first contact, something could have been done, but the modern estimates for 1788 vary by an order of five, from 300,000 to 1,500,000 (see Caldwell, Missingham, and Marck 2001). Pre-1492 estimates of Amerindian populations vary by at least the same multiple.
It is debated whether the population of Easter Island on European contact was around 15,000 or 3,000 which is also 5:1 https://dx.doi.org/10.1126/sciadv.ado1459
A recent takedown of the historical population estimates published by people like McEvedy and Jones is:
Timothy W. Guinnane, “We Do Not Know the Population of Every Country in the World For the Past Two Thousand Years,” The Journal of Economic History, Vol. 83, No. 3 (2023) pp. 912 – 938 https://doi.org/10.1017/S0022050723000293 Preprint at https://www.repository.cam.ac.uk/items/6c7f7b04-c2c4-4e34-a3b1-bbe740ddb4df/full
Just looking at Manning’s reply, not looking up the references, I see the claim by Caldwell and Schindlmayr that “pre-1492 estimates of Amerindian populations vary by at least the same multiple,” with that “same multiple” referring to the “order of five” mentioned in the previous sentence.
So I edited my post and changed Manning’s “within a factor of 20” to “within a factor of 5.”
In comments, John Mashey wrote:
There’s been much analysis since Caldwell & Schindlmayr (2002) and Henige (2008). Koch, Brierley, Maslin, Lewis wrote “Earth system impacts of the European arrival and Great Dying in the Americas after 1492” (2019), https://www.sciencedirect.com/science/article/pii/S0277379118307261, 24p, including ~17 pages of analysis of the various population estimates, concluding:
“Our estimate of the number of people living in the Americas in 1492 CE is 60.5 million, with an interquartile range (IQR) of 44.8-78.2 million…”
Assuming for simplicity or approximation a normal distribution on the log scale, the interquartile range is +/- 0.67 sd. The 95% range is +/- 2 sd, about three times as wide. If we consider the upper and lower 95% points as a reasonable range of uncertainty, this would give a factor of (78.2/44.8)^3 = 5.3. So “a factor of 5” still seemed about right to me, if you trust that source. We can also look at the endpoints of this interval, which would be sqrt(44.8*78.2)*c(1/sqrt(5.3),sqrt(5.3)), which is about 26 to 136 million.
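To check the back-of-the-envelope step from interquartile range to 95% range, here is a short sketch. It assumes the uncertainty is normal on the log scale (my assumption, not something the paper states) and compares the rounded "cube the ratio" shortcut with exact normal quantiles. The variable names are mine.

```python
import math
from statistics import NormalDist

q1, q3 = 44.8, 78.2  # interquartile range quoted above, in millions

# Shortcut: the 95% range is about 3x as wide as the IQR on the log scale
factor_shortcut = (q3 / q1) ** 3

# Exact version: solve for the log-scale sd from the quartiles
z75 = NormalDist().inv_cdf(0.75)    # about 0.674
z975 = NormalDist().inv_cdf(0.975)  # about 1.960
sd_log = (math.log(q3) - math.log(q1)) / (2 * z75)
mid_log = 0.5 * (math.log(q1) + math.log(q3))  # log of the geometric mean

low95 = math.exp(mid_log - z975 * sd_log)
high95 = math.exp(mid_log + z975 * sd_log)
factor_exact = high95 / low95

print(f"shortcut factor: {factor_shortcut:.2f}")
print(f"exact factor:    {factor_exact:.2f}")
print(f"95% range:       {low95:.0f} to {high95:.0f} million")
```

With exact quantiles the exponent is about 2.9 rather than 3, so the factor comes out near 5.0 instead of 5.3. Either way it rounds to a factor of 5.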
Manning then released a new post returning to the question:
The Population of the Americas in 1492 is Disputed . . .
Colin McEvedy and Richard Jones . . . in their 1978 Atlas of World Population History . . . acknowledge disputes about the pre-Columbian population of the USA and Canada within a factor of 20, and disputes about the population of Mexico within a factor of 6. Their arguments for one end of the range are no more sophisticated than “it seems to be generally accepted” and that if the population of Mexico had been as high as 30 million, then the rate of decline which this implies would be an “improbability.” . . .
But 1978 is a long time ago, so if you prefer you can check a more recent survey. . . .
The introduction of European diseases caused much of North America to revert from fields and parkland to secondary forest, because the people who had been burning the brush to encourage deer or clearing forest to grow maize died or fled. [In their study published in 2019], Alexander Koch and colleagues wanted to guess how this impacted the world system, so they tried to estimate the population of the Americas in 1492. They began by surveying the literature and found a wide range of estimates:
• 2.2 to 52 million for Central Mexico (factor of 24)
• 2.3 million to 13 million for Yucatan (factor of 6)
• 1 to 20 million in Amazonia (factor of 20)
• 0.9 to 18 million in the USA and Canada (factor of 20)
They could not see any way to decide who was correct with the resources available to them and decided to take their omnium gatherum, assume the guesses reflected a statistical distribution, and find the middle of that distribution (“We included all the prior studies and did not make any judgement on their relative quality.”) . . . So both quick-and-dirty researchers in the 1970s, and a literature review in 2019, find estimates that differ by a factor of 6 to a factor of 20. . . .
As David Henige pointed out in 1998, most of these estimates start with numbers in the writings of early travellers or the records of colonial and post-colonial states. There are many reasons to question these numbers, such as that sometimes a number in one story is based on a number in another story not a count, or that by the time any settler made a count many of the natives had been killed by disease, forced labour, expulsion from their homelands into wasteland, and plain old murder. Researchers then add, multiply, divide, and subtract the numbers in their sources until they feel right. . . . There has been no great improvement in these methods during my or my parents’ lifetime, although certainly sometimes a new source is discovered or archaeological work affects the arithmetic.
Given enough money and labour and equipment, archaeologists can survey large regions for evidence of settlements before European contact. This kind of research only covers a few areas, it works best for people with durable houses and pottery, and converting counts of potsherds or postholes to people is very difficult. It can be hard to tell the difference between a very large village, and a village where it was customary to rebuild your house in a new location every time the old one decayed. . . .
It is well known that when you ask people to give you a number for something, and there is a plausible minimum but no maximum, the numbers will be skewed upwards by people with big imaginations. Counts of First Nations and American Indian populations in the early 20th century generally provide a minimum for the pre-contact population of North America, although even then there are difficulties because the reintroduction of the horse and the spread of native and Old World crops let people live in places where few of them had lived before. . . . But it is easier to guess arbitrarily high numbers than arbitrarily low numbers, and so any attempt to take an average or a median will skew high. . . .
In ancient Afro-Eurasia, the only societies where we have good data are the Greco-Roman world, Egypt, and Han China. We can estimate the population of those societies within say a factor of 3 using contemporary censuses and very detailed archaeology. Evidence for those societies is no help in estimating the population of Hispaniola (Haiti and the Dominican Republic) or Newfoundland in 1492, because the ways they lived and were organized were totally different. . . .
The abstract of Alexander Koch’s study emphasizes the middle range of their population estimate not the wide range in their sources (and Koch and colleagues explain why it’s hard to know historical populations, but you have to read the whole article to see their explanation). . . .
OK, so what to do about all this? I’m not sure! I see four big issues here:
1. It’s not clear what we should mean when we say that we know a positive number within a factor of X—even in the ideal situation when the uncertainty about X can be explained by a known probability distribution. I’m taking X to be the ratio of the upper and lower bounds of the 95% central uncertainty interval, but that’s just one way to define it.
2. The population of the Americas in 1492 is the sum of the population in 1492 of the land that is currently Canada, plus the population in 1492 of what is now the United States, plus the population in 1492 of what is now Mexico, plus the population in 1492 of what is now Hispaniola, plus the population in 1492 of what is now Cuba, etc etc etc. Our uncertainties in these numbers are, presumably, positively correlated, but not 100% correlated.
Koch et al. (2019) obtain their uncertainty about the total population by dividing the hemisphere into seven regions: Caribbean (“Most estimates are between 300,000 and 500,000 people”), Mexico (“estimates for central Mexico and Yucatan combined, which is considered representative for all of Mexico, range from less than 3 million to over 52 million with many at around 20 million”), Central America (“Estimates range from 0.8 million . . . to 10.8–13.5 million . . . Most estimates range between 4.75 million and 6 million”), Inca Territory (“estimates range from 4.1 million to 43.8 million with a likely population of around 20 million, based on the sum of the most widely accepted figures for each of the regions”), Amazonia (“Estimates include 1.5-2 million based on an average of present-day densities . . . 3.2 million based on tribe-by-tribe counts . . . 5.5 million extrapolated from eastern Ecuador . . . and from 5.1 to 20 million . . . Recent findings . . . indicate larger populations, with most recent estimates ranging between 8 and 20 million people”), North America (“The lower range . . . lies between 900,000 and 2.4 million . . . The highest estimate of 18 million . . . has been criticized . . . More recent estimates derived from geospatial interpolation of archaeological sites range between 2.8 million and 5.7 million”), and the Rest of the Americas (“Venezuela with 600,000–1.5 million, Uruguay and Paraguay, estimated together as 285,000–1.1 million, and Argentina with 300,000–500,000 people . . . The total estimate for the remainder of the Americas is between 1.2 and 3.1 million”).
Then they put these numbers together . . . I’m not 100% sure what they do here. Here’s what they say:
National estimates within a region are cross-combined and their sums form a regional estimate for each of the seven regions. . . . Next, cross-combining and taking the sums of these regional estimates (combinations) gives a hemisphere-wide population frequency distribution, with the higher occurrence rate of similar results reflecting higher frequencies in the distribution.
I think this means that they’re doing a kind of bootstrap, where they’re taking the different estimates they’ve collected and using these to represent an uncertainty distribution for each region, and then they’re assuming independence of the uncertainties.
As noted above, assuming independence doesn’t seem right, and this makes me think they’re understating their total uncertainty.
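To see how much the independence assumption can matter, here is a rough simulation sketch. It is not the procedure from the paper. I'm assuming each region's uncertainty is lognormal with the same log-scale sd, with medians loosely eyeballed from the figures quoted above, and a single made-up correlation `rho` between every pair of regions. All of these numbers are illustrative assumptions, not estimates.

```python
import math
import random

# Illustrative regional medians in millions (hypothetical, loosely based on the quotes above)
MEDIANS = {
    "Caribbean": 0.4,
    "Mexico": 20.0,
    "Central America": 5.0,
    "Inca Territory": 20.0,
    "Amazonia": 8.0,
    "North America": 4.0,
    "Rest of the Americas": 2.0,
}
SIGMA_LOG = 0.5  # hypothetical log-scale sd, same for every region


def total_factor(rho, n_draws=20000, seed=1):
    """Ratio of the 97.5% to the 2.5% point of the simulated hemisphere total."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_draws):
        shared = rng.gauss(0.0, 1.0)  # error common to all regions
        total = 0.0
        for median in MEDIANS.values():
            own = rng.gauss(0.0, 1.0)  # error specific to this region
            z = math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * own
            total += median * math.exp(SIGMA_LOG * z)
        totals.append(total)
    totals.sort()
    low = totals[int(0.025 * n_draws)]
    high = totals[int(0.975 * n_draws) - 1]
    return high / low


for rho in (0.0, 0.5, 1.0):
    print(f"rho = {rho:.1f}: total known within a factor of {total_factor(rho):.1f}")
```

Under independence the regional errors partly cancel in the sum, so the total looks better known than any single large region. As `rho` approaches 1 the cancellation disappears and the factor for the total approaches the factor for each region.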
3. The estimates are constructed by adding together numbers that come from rough extrapolations from crude models. I don’t know of any good alternatives here–given that this all happened in the distant past, “rough extrapolations from crude models” is pretty much the only game in town–but it’s still an issue. To stick with that Koch et al. paper, their distributions include some estimates that they themselves (Koch et al.) don’t seem to think are reasonable. On the other hand, all of these estimates are rough extrapolations so maybe the uncertainty ranges are too narrow. Just for example, can they really estimate the 1492 population of Argentina as “300,000–500,000”? Just intuitively this sounds way too precise. Their citations for that area come from papers from 1954 and 1976. OK, this is not a big deal–Argentina isn’t where all the people were living. The point is that it’s hard to know what to make of this mix of implausibly precise and implausibly extreme estimates for different regions.
4. There will always be a demand for precise numbers. This does not mean we should laugh at the estimates we have, just that we sometimes need to push against our desire for a quick number. And there are pressures in the other direction: as Manning discussed in his posts, there are various political reasons for people to want to give low or high numbers.
A factor of 20 (for example, 9 million to 180 million) seems soooo wide to me. I’m more comfortable with a factor of 5 (for example, 25 million to 125 million). But that might just come from my comfort with modern governmental statistics. Maybe 9 million and 180 million both are legitimate possibilities. I don’t have a good sense of what’s known here, let alone what we don’t know we don’t know.
To say the population of the hemisphere in 1492 could be known to within a factor of 2, that indeed doesn’t seem reasonable given the difficulty of estimating the population at that time in any given region.
What’s it all for?
At this point, the question arises: What do we plan to do with this number? Or, to put it more baldly, who cares what was the population of the Americas in 1492?
For sure, it makes sense for historians, epidemiologists, political scientists, economists, etc., to want to know what was the population in particular cities and regions of the Americas, to get a sense of what happened after contact with the Europeans.
But what are you supposed to do with the total population of the Americas? One thing you can do is add it to your estimates of the total population of every other region of the world to get an estimate of global population. A couple of my books have these data; here’s the relevant bit from Active Statistics:

I think I typed these numbers in from . . . the aforementioned Atlas of World Population History, which sits on my bookshelf—I saw it in a bookstore and bought it many years ago, just because I was curious about the topic. I’ve always been interested in statistics and was very excited upon discovering the Statistical Abstract of the United States as a teenager. So that’s one answer to who cares: stats nerds!
Notice, though, that the numbers in my table above, which were taken from that atlas, show no uncertainty. Kind of embarrassing for a statistics book, no?
Another way to say this is, if a number is particularly difficult to estimate, maybe you should question your need to estimate it. The world population in 1492, or even the population of the Americas at that time . . . what does it mean? There was so little interaction between different parts of the world, or even between different regions in the Americas, that these total population numbers have no particular meaning in themselves.
I mean, sure, there was a number, which we’ll never know or even be able to accurately estimate, of the total number of humans living in the Americas at midnight GMT on 1 Jan 1492 or whatever, but there’s nothing you can really do with it. It would be like, oh, I dunno, what if you do a census of someone’s house and find that it contains 240 books, 35 DVDs, 5 magazines, 2 newspapers, 3 old vinyl records, and an 8-track tape? You could add these up and say that there are 286 news and entertainment items in the house, but this “286” isn’t really the answer to any question.
Getting back to the population of the hemisphere, this all came up because Manning used uncertainty in population to cast doubt on DeLong’s claims about historical rates of growth. This led us down the rabbit hole of trying to summarize uncertainty in this one particular population summary as a sort of proxy battle about the level of precision that can be expected from historical estimates of population, health, and economic production and consumption. Before the modern globalized era, the relevant numbers here are local and regional, not continental and global. I recommend going back and reading Manning’s post and also probably DeLong’s book, about which Manning has much to praise.
I don’t have any immediate thoughts on the general issue, but this post did make me think of an experience when I was maybe eight or ten years old and we took a family trip to visit my grandmother and other relatives in Arkansas. Among other things we drove out to someplace near a tiny community called Lost Cane where my grandmother had grown up and where some relative still owned farmland. We walked out into one of the recently plowed fields and were able to find several fragments of Indian pottery. Whoever was showing us around said there were several areas around where you could do that, especially after a light rain when the pottery or arrowheads would sit proud of the surrounding soil because they would protect the soil beneath them from being washed away.
Much later I read that the first European travelers through the area reported that by the time you could no longer see smoke from the fires of the Indian village you had just left, you could already see the smoke from the village up ahead….and that the next Europeans through the area reported that there were no people.
Anyway it made a big impression on me as a kid, that you could just walk out into a farm field and find pottery fragments from long ago.
Hah! We routinely turn up old turn-of-the-century Herty Cups after tilling or heavy rains in the garden. Relics from the era of frontier turpentine collection down here in the piney woods of the Coastal Plain. Also different colored antique glass, and an old bridle once. Truly human and natural history intertwine.
The biggest problem with these estimates is that we are now deep within the second golden age of archaeology, all because of lidar. In particular, the Amazon was long thought to be unable to support large populations due to a lack of good soil. Recent lidar scanning now reveals vast areas of ancient cultivation based on rather sophisticated soil-building practices.
Lidar has also validated one of the earliest European written reports from the Amazon river, where a Spanish missionary (forgot his name) described near continuous settlement along the river. The ruins visible on lidar support this claim which was once thought to be exaggerated.
In our current state of understanding based upon these new findings, the lower population estimates just no longer make any sense.
Hi Matt, I talk about the competing theories about Amazonia and how it’s unlikely that “the truth is somewhere in the middle” in my second blog post. All the big claims to estimate the population of the Americas like Koch et al. or the Angus Maddison dataset are based on studies from the 1930s through 1970s when both archaeology and historiography were in very different places (David Henige’s book from 1998 does not say as much as I would expect about written or oral indigenous sources). Aztec archaeologist Michael E. Smith does not seem confident that (documents + travellers’ tales + archaeology) lets us be confident about the population of the Aztec world yet.
My understanding is that population estimates are important to understanding historical economic conditions, because population is a good proxy for technological and economic progress in a Malthusian setup.
This was a key dependent variable in one of the more famous Acemoglu, Johnson, and Robinson institutional economics papers – “Reversal of Fortune”. Although for some questions, the level of accuracy which is needed will vary. In some statistical analyses, you would only need a good estimate of percentage changes over time or relative differences between places in the same region to measure economic output and could tolerate some errors in measuring the level.
Joseph:
I can see that local and regional population estimates can be important. I can’t see a good reason for wanting continental or global population estimates from that era.
Joseph, generally there is some kind of count of indigenous people by a colonial or post-colonial state, and evidence that large numbers of people died at specific times in the 200 years before that count, but it’s very hard to put numbers on the deaths (and the count was usually not in the interests of the people being counted, so those numbers may not be reliable; likewise boasts by missionaries that they had baptized 14,000 people last Sunday and please send more money). Nobody since 1520 has disagreed that millions of people died in Mexico from European diseases, forced labour, and direct violence by the Spanish and their native allies, but how many millions is very tough to answer.
Sometimes talking about the numbers (which everyone disagrees about) is easier to stomach than talking about what specific acts happened as a result of European settlement (which is usually pretty well documented, and poses tough moral questions about what reconciling after those events would involve).
“At this point, the question arises: What do we plan to do with this number? Or, to put it more baldly, who cares what was the population of the Americas in 1492?”
I think there is an ideological component to this. If you want to argue that settler colonialism is bad and indigenous populations are good, then you want a big number to emphasize the human cost of conquest. If you want to argue that the Europeans were bringing civilization to the blighted natives, then you want a small number, to minimize it.
I’ve been aware of the controversy about the pre-conquest population of the Americas since taking a class in the history of Mexico from WW Borah at Berkeley in 1964 or 65; he did a lot of work showing that the Spanish conquest of Mexico caused a population crash there.
John:
Yes, Manning discusses these motivations. Recall that notorious line from Laura Ingalls Wilder: “There were no people; only Indians lived there.” Even here, though, I think the relevant numbers would be by region, not hemisphere-wide.
I don’t know if it was intentional, but this is a fun post sitting next to yesterday’s. Looks like psychohistory is on the way!
Alex, if you want an attempt at psychohistory, check out Peter Turchin and his SESHAT project. I have some posts on the accuracy of their historical data in my archives.
Although I am neither an economist nor a geneticist, it still seems to me that a more reliable quantitative estimate of demographic shifts of that type can be obtained by tracking genetic diversity and reconstructing a certain level of the genetic tree. The attached article, based on mitochondrial DNA (which is maternally inherited), found a decrease of about 50% in the female population in the Americas around the year 1500, followed by a recovery to pre-1500 levels.
https://pubmed.ncbi.nlm.nih.gov/22143784/#:~:text=high,widespread%20mortality%20among%20indigenous%20Americans
Hi nadav, DNA studies seem like they might help understand this history in the future. It would be important to look at the O’Fallon and Fehren-Schmitz data set since some indigenous populations vanished (the entire populations of some islands were either killed, kidnapped, or paddled away and married into other nations) and some don’t want to let strangers collect their DNA.
Their model shows a five-fold population increase about 9,000 years before present which would correspond to the domestication of potatoes and maize (figure 1). The model of populations as static thereafter seems implausible given that eg. maize spread east of the Rio Grande in the last thousand years, or the Classic Maya seemed to encourage a population boom followed by a decline. Likewise, in many areas there is no trace of a population decline before the 17th-19th century, so the model of “sudden decline 500 years ago followed by recovery” does not match the evidence for specific regions. I wonder if their model is really showing that deaths in densely-populated areas like Mexico and Peru were much more numerous than deaths in other areas which pretty much everyone agrees (and indigenous people in North America are very used to hearing “at least we were not as bad as the Spanish!” and to people who want to talk about disease but not oppression). I feel like I am dancing on the edge of my competence both as a scientist and as a human.
I’d love to see a restatement of the different estimates as more-or-less (probably “less”) refactored models (that is, retracing the process in the relevant literature, not just taking the point estimates or ranges). That would probably narrow down the high-likelihood area of the posterior (might not be a continuous range, though – sometimes in these things there are discrete parameters [“do we believe X at all?”] that can lead to very strongly multimodal posteriors), but, beyond the number itself, there’d be a lot of educational and intellectual value in such a shared model (realistically, large and growing number of competing, partial models).
Just like “plans are unimportant, planning is crucial,” from the point of view of thinking about very longue durée demographics –and of thinking about what would be most helpful to look into next– no single model might be critical, and no plausible estimate very useful, but the process of modeling in itself is often clarifying (a version of one of the best ways of learning about something is to program a computer to do it).
Hi Marcello, the article by Caldwell and Schindlmayr and the 1998 book by David Henige are excellent overviews of where most of these numbers come from. They don’t cover the last 20 years of guesses and methods. Bret Devereaux’s ACOUP blog has overviews of how ancient historians estimate the population of Roman Italy, Egypt, and the ancient Greek urban world.
Devereaux’s big demography post is https://acoup.blog/2023/12/22/collections-how-many-people-ancient-demography/
He makes an important point that’s the sort of thing statisticians would appreciate: even with Roman Italy, where the number of people counted is regarded as pretty accurate, there’s the question of the target population. Were women counted? Children? Free non-citizens? Enslaved people?
The arrival of Europeans in the Americas brought a lot of disease. Knowing the effect of disease on the population, both regionally and overall throughout the Americas, would teach us something about pandemics and immune systems and sanitation, etc.
Start with a real upper bound then start chipping away.
I get the combined area is ~42 million sq km. You can pack standing people at about 1 square meter each. So not more than 42,000,000,000,000 (42 trillion) people.
Such a planet existed in star trek, so it is plausible: https://memory-alpha.fandom.com/wiki/The_Mark_of_Gideon_(episode)
If the region had density similar to Europe today there would have to be many many settlements that would have been visible ruins everywhere in the 1600’s only ~ 100 years later. EU has 106/km^2 so that means 4.4 Billion is well above what could have been. Take it down a notch (power of 10) and say 440M people is probably achievable but still likely above the population.
So, a decent starting point for a number is something like Lognormal with a mean-log of log(100e6) and log standard deviation of log(2.0), that’d give a ~ 95% range between about 25M and 400M.
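A quick sketch of what that lognormal guess implies, using only the two parameters given in the comment. The "about 25M to 400M" figure corresponds to +/- 2 sd on the log scale; the sketch also computes the interval with the exact 97.5% normal quantile, which I'm adding for comparison.

```python
import math
from statistics import NormalDist

mean_log = math.log(100e6)  # log of the median, 100 million
sd_log = math.log(2.0)      # one sd on the log scale is a factor of 2

# Rule-of-thumb interval: +/- 2 sd on the log scale
low_2sd = math.exp(mean_log - 2 * sd_log)
high_2sd = math.exp(mean_log + 2 * sd_log)

# Interval using the exact 97.5% quantile of the standard normal
z = NormalDist().inv_cdf(0.975)
low_95 = math.exp(mean_log - z * sd_log)
high_95 = math.exp(mean_log + z * sd_log)

print(f"+/- 2 sd:  {low_2sd / 1e6:.0f}M to {high_2sd / 1e6:.0f}M")
print(f"exact 95%: {low_95 / 1e6:.0f}M to {high_95 / 1e6:.0f}M")
print(f"ratio of bounds (+/- 2 sd): {high_2sd / low_2sd:.0f}")
```

Note that this prior is "within a factor of 16" by the ratio-of-bounds definition, which puts it between the factor of 5 and the factor of 20 discussed in the post.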
There is something dissatisfying about those guesstimates though.
The 42 trillion, while most likely (sic) way overestimating the value, seems more concrete. I guess that number still ignores multi-story buildings though, so fine multiply by another 100x.
Maybe more realistic is use the most densely populated city today, which is 45k / sq km: https://en.m.wikipedia.org/wiki/List_of_cities_proper_by_population_density
There’s no way it was more than that right? That gives 1.9 trillion people.
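For anyone who wants to replay the upper-bound arithmetic in this thread, here is a small sketch. The area and the three densities are the ones quoted in the comments above; I have not checked them against any source, and the dictionary labels are mine.

```python
AREA_KM2 = 42e6  # combined area of the Americas as quoted above, in sq km

# People per sq km under each scenario mentioned in the thread
DENSITIES = {
    "standing room, 1 person per square meter": 1e6,
    "densest city cited, 45k per sq km": 45e3,
    "EU today, 106 per sq km": 106.0,
}

# Upper bound for each scenario: area times density
bounds = {label: AREA_KM2 * density for label, density in DENSITIES.items()}

for label, people in bounds.items():
    print(f"{label}: {people:.3g} people")
```

Each bound is just area times density, so the whole exercise comes down to which density you are willing to call an upper limit.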