Debunking Huebner's 'A Possible Declining Trend for Worldwide Innovation'
The paper is biased. There is no decline in innovation.
*View the original on the website uncorrelated.xyz, follow me on X/Twitter*
Introduction
During the late 2010s and early 2020s, IQ dysgenics was at its height. Over the preceding years Michael Woodley had steadily published his repertoire of papers estimating IQ dysgenics using Galton's methods, and Edward Dutton was on the rise with his new YouTube channel, popularizing "At Our Wits' End: Why We're Becoming Less Intelligent and What It Means for the Future", the book he had just published in collaboration with Woodley. It was something everyone felt was true - stupid people had kids in greater numbers and sooner in life. Given the heritability of IQ, it was assured - humanity was in peril - and Idiocracy was a documentary.
In brief: in light of new evidence published incrementally over the years, the disappearance of Woodley from the internet, and the plateauing of Dutton's channel, the theory has lost its sensational edge. It has touched grass, if you will. HBD scientists put themselves to work, and in summary they found:
1. IQ dysgenics isn't nearly as large as was once thought. When IQ is used directly in dysgenics calculations - instead of a proxy like educational attainment - the effects are statistically significant but small in magnitude.
2. Selection against educational attainment > selection against IQ. Since educational attainment was often used as a proxy for IQ in genetic/biobank studies, genetic estimates of IQ decline were greatly exaggerated.
At the time, this supposed IQ mega decline spawned a few 'just so' stories.
One of these was innovation: it 'just so' happened that innovation was declining because of IQ dysgenics. The favorite paper to tell this story was Huebner's 'A Possible Declining Trend for Worldwide Innovation'. This paper produced the much cited (152 times!) plot:
This fit the theory of IQ dysgenics snugly, because the theory premised itself on dysgenics beginning once selection pressures against IQ relaxed - and that was of course assumed to be post industrial revolution. 1850 was considered the perfect turning point.
This seemed so convincing that Michael Woodley - co-author of "At Our Wits' End" - published a study on this (cited 62 times!). He alleged a correlation of r = .876 across N = 56 decades.
In retrospect this is embarrassing to look at, because given what we know now it looks like circular reasoning.
However - and this is the point of this blog post - it can still be argued that, despite what we know about IQ dysgenics today, Huebner's 2005 paper is genuinely measuring an innovation decline. This post tackles that argument, showing that the source the paper bases itself on most likely suffers from selection bias.
Replicating Huebner's Method
Where did Huebner get his data? He extracted it from a book called 'The History of Science and Technology' by Bryan Bunch and Alexander Hellemans. The book is chronologically ordered, starting from events in 3000 BC and ending near its publication date in 2003 AD.
The book has a simple, systematic format. The year in discussion is given in the header or footer of the page. If there are multiple years of innovation events on one page, the authors distinguish the year in which they occurred under clearly marked headers. Here's a sample page.
So how do we go about replicating Huebner's method? Instead of counting the innovation events as Huebner did, we're going to count people. How do we do this? Scripting and Natural Language Processing (NLP).
Method
NLP can discriminate words from names. Our script basically works like this:
1. Extract the text from every page.
2. For each page use regex to extract the list of years.
3. For each page use NLP to extract the names.
4. Create a table with this data, containing the following columns: page number, page text, list of years, list of names.
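The four steps above can be sketched roughly as follows. This is a minimal illustration, not the script used for this post: it assumes the page text has already been extracted into a list of strings, the year pattern is only a guess at how year headers might look, and a crude capitalised-word regex stands in for a real NLP named-entity recogniser.

```python
import re

# Assumption: a year is a standalone 3-4 digit number, optionally followed by BC/BCE.
YEAR_RE = re.compile(r"\b(\d{3,4})(?:\s*(BCE|BC))?\b")
# Stand-in for NLP name extraction: two or more consecutive capitalised words.
# A real run would use a named-entity recogniser instead of this heuristic.
NAME_RE = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b")


def extract_years(text):
    """Return the years found in the text; BC years are returned as negatives."""
    years = []
    for match in YEAR_RE.finditer(text):
        year = int(match.group(1))
        years.append(-year if match.group(2) else year)
    return years


def extract_names(text):
    """Return the sorted, de-duplicated candidate names found in the text."""
    return sorted(set(NAME_RE.findall(text)))


def build_table(pages):
    """One row per page: page number, page text, list of years, list of names."""
    return [
        {
            "page": number,
            "text": text,
            "years": extract_years(text),
            "names": extract_names(text),
        }
        for number, text in enumerate(pages, start=1)
    ]
```

The regex heuristic will miss single-word names and pick up capitalised phrases that are not people, which is one reason the later matching step against an external database is needed.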
There's still some work to do to get from this to a list of historical figures. Next, we need to refine this data and remove duplicate names.
1. Associate each distinct name with the average of the years it was found with across the entire book.
2. Download the database 'A cross-verified database of notable people, 3500BC-2018AD'. This database contains all figures on Wikipedia and Wikidata, making it a far more extensive and comprehensive record of historical figures. Filter it to exclude celebrities, sports stars, etc.
3. Using Levenshtein distance calculations, associate each name found in 'The History of Science and Technology' with a name found in the filtered cross-verified database. Optimize the algorithm to only match figures that were living around the year the figure was mentioned in the book.
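As a rough sketch of steps 1 and 3 above (not the actual matching code): average the years each name appears with, then pick the database entry with the smallest edit distance among people alive around that year. The record layout (`name`, `birth`, `death`) and the `slack` parameter are hypothetical choices made for the example.

```python
from collections import defaultdict


def mean_year_by_name(rows):
    """rows: iterable of (name, year) pairs -> {name: mean of its years}."""
    years = defaultdict(list)
    for name, year in rows:
        years[name].append(year)
    return {name: sum(vals) / len(vals) for name, vals in years.items()}


def levenshtein(a, b):
    """Edit distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # substitution (free if equal)
            ))
        prev = curr
    return prev[-1]


def best_match(name, year, people, slack=10):
    """Closest database name among people alive within `slack` years of `year`.

    `people` is a list of dicts with 'name', 'birth' and 'death' keys
    (a hypothetical layout). Returns None if nobody was alive around that year.
    """
    candidates = [
        p for p in people
        if p["birth"] - slack <= year <= p["death"] + slack
    ]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda p: levenshtein(name.lower(), p["name"].lower()),
    )
```

Note that `best_match` always returns the nearest candidate, however distant. A stricter variant would reject matches above some distance threshold, trading coverage for precision.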
After this whole process we find ~6000 historical figures mentioned in the book between 1450 and 2003 (book publication year). The matching algorithm wasn't perfect: looking at the matches, ~50% of them are absolutely correct. The rest are partial matches that roughly correspond to what they're meant to be in the book. I'm not going to say it's perfect, but in my opinion the correspondence is sufficiently high for our purposes.
Okay, so why did we match to the cross-verified database? Two reasons: we get the occupations of the figures mentioned in Huebner's data source, and we get the birth and death dates of all figures. This gives us sufficient data to ask the following questions and answer them:
Hypotheses
1. Like with innovation, is there a decline in the number of historical figures mentioned in 'The History of Science and Technology' per capita?
Obviously, it would be interesting to see if there are fewer living figures per capita mentioned in the book. If there is a decline, it would reinforce Huebner's position of a decline in innovation, as innovation requires innovators.
2. Restricting to the occupations held by figures in 'The History of Science and Technology', is there a decline in figures with those occupations in the larger, more comprehensive cross-verified dataset?
This measures whether there's any discernible occupation bias in 'The History of Science and Technology'. It's possible the authors are sampling historical figures from occupations in decline (inventors, for example) and not from newer ones such as computer science.
3. Does the cross-verified database show a decline in the top 0.1% most eminent figures, and if so, does this track declines in living figures in "The History of Science and Technology"?
It could be possible that declines in "The History of Science and Technology" are happening because they exclusively sample the best of the best. This is a sort of measurement invariance test - the cross-verified database might be oversampling popular, more recent figures.
4. Does the over-representation in occupations in "The History of Science and Technology" correlate with their decline in the cross-verified database?
If it does, and the direction is positive, it would be further evidence that "The History of Science and Technology" is sampling from a biased set of occupations.
Let's answer these questions with data!
Results
Q1
The answer to the first question is: No. Among Western World figures found in "The History of Science and Technology", there is no decline in living figures similar in magnitude to the decline in innovations that Huebner apparently found.
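For readers who want to reproduce this kind of measure, here is a minimal sketch of counting living figures per capita. It is not the code behind the plot: the lifespans and population figures in any example run are placeholders, and "living" is assumed to mean that the year falls between birth and death inclusive.

```python
def living_per_capita(figures, population, years):
    """Living matched figures per head of population, for each requested year.

    figures:    list of (birth_year, death_year) tuples
    population: dict mapping year -> population (any consistent unit)
    years:      iterable of years to evaluate
    """
    result = {}
    for year in years:
        # A figure counts as living if the year lies within their lifespan.
        alive = sum(1 for birth, death in figures if birth <= year <= death)
        result[year] = alive / population[year]
    return result


# Placeholder inputs purely for illustration, not real data.
toy_figures = [(1700, 1780), (1750, 1820), (1790, 1850)]
toy_population = {1760: 100.0, 1800: 200.0}
toy_rates = living_per_capita(toy_figures, toy_population, [1760, 1800])
```

In the toy inputs the number of living figures is the same in both years while the population doubles, so the per capita rate halves. That is the pattern a genuine decline would have to show in the real data.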
Q2
The answer to the second question is: No. The opposite was found: there isn't a decline, there's an increase! This indicates that the plateau seen in 'The History of Science and Technology' is a product of incomplete sampling - the authors just weren't comprehensive enough.
Q3
The answer to the third question is: No. This almost perfectly correlates with the second plot. So we can rule out the possibility that the cross-verified database sees an increase because it oversamples irrelevant figures.
If you don't think these top 0.1% are good enough, see for yourself. I recognize every name here.
Q4
The answer to the last question is: No. Using the top 100 occupations found in 'The History of Science and Technology', there is no correlation between the over-representation of each occupation and its decline/increase. This rules out the possibility that the book just happens to be sampling occupations that are in decline.
Conclusions and Discussion
We've demonstrated that there is no decline in living eminent historical figures mentioned in our cross-verified database or in the book Huebner cites. The fact that figures from Huebner's book merely show no decline, while the cross-verified database shows an increase, cannot be explained by biased occupation sampling or by the selection of overly eminent figures.
Hence, it is likely that Huebner's book, set against a cross-verified database with a sample size >100x larger, is simply an incomplete resource, and that this incompleteness is one of the reasons it presents an innovation decline.
Just eyeballing the numbers, the book is already 785 pages long. If it were actually representative of all innovations proportional to the number of actual historical figures, that page number could well be into the thousands, or even more.
Clearly, that's just an unrealistic undertaking for the two authors of the book.
In a forthcoming post I hope to try to answer whether innovation rates are actually increasing or declining, but my sense of it is that the nature of innovation has simply changed. For example, one occupation that did decline, and that the book loved sampling, was inventor. We don't have Thomas Edisons just spinning up the first light bulbs and batteries anymore. Instead, we have exponential declines in the cost of batteries following a law-like curve, something like Moore's Law. In my opinion (which I hope to prove in another post) innovation is defined today not by big macro innovations but by small incremental improvements that accumulate into radical changes - which might be another reason why Huebner finds a decline.
Furthermore, in retrospect, given the new numbers we now have about dysgenics (that it isn't as big of a deal), it doesn't make sense that there should be a per capita decline anyway.
So overall, I think this is at least decent evidence that Huebner's paper should be discarded. Besides, it's been 20 years since it was published, and we now have far more resources at our disposal to measure innovation from multiple angles anyway.
Status: DEBUNKED
References
Bunch, B., & Hellemans, A. (2004). The history of science and technology: A browser's guide to the great discoveries, inventions, and the people who made them from the dawn of time to today. Houghton Mifflin Harcourt.
Dutton, E., & Woodley of Menie, M. A. (2018). At our wits' end: Why we're becoming less intelligent and what it means for the future. Imprint Academic.
Huebner, J. (2005). A possible declining trend for worldwide innovation. Technological Forecasting and Social Change, 72. https://doi.org/10.1016/j.techfore.2005.01.003
Jensen, S. (2024, January 30). Are we getting dumber? Data on IQ and fertility across the world. Center for the Study of Partisanship and Ideology. https://www.cspicenter.com/p/are-we-getting-dumber
Laouenan, M., Bhargava, P., Eyméoud, J. B., Gergaud, O., Plique, G., & Wasmer, E. (2022). A cross-verified database of notable people, 3500BC-2018AD. Scientific Data, 9(1), 290. https://doi.org/10.1038/s41597-022-01369-4
Woodley of Menie, M. (2012). The social and scientific temporal correlates of genotypic intelligence and the Flynn effect. Intelligence, 40, 189-204. https://doi.org/10.1016/j.intell.2011.12.002
Good article, though Dutton at least differentiated between incremental innovation versus larger innovation (I forget the term). The former being of course quite strong in modern times, but the latter appears to have decreased
Great article, looking forward to the next ones!