I asked Gemini: how many orders of magnitude greater is the sum total of human knowledge than what AIs have access to from the digital world? The answer below is truly frightening. What if the assimilation of human knowledge is only the first step in "plundering" the knowledge and skills of all living things?
The sum total of humanity's knowledge is estimated to be roughly 2 to 3 orders of magnitude greater (100 to 1,000 times larger) than what leading AIs can currently access from the digital world.
While AI has ingested a staggering portion of the public internet, it remains bottlenecked by the fact that the vast majority of human knowledge has never been digitized.
Why Human Knowledge Dwarfs the Digital AI World
- The Dark Data Gap: The internet only captures a fraction of what humans know. Millions of physical books, proprietary corporate databases, classified government archives, and highly specialized academic research sit behind strict firewalls or physical shelves where AI crawlers cannot reach them.
- Undocumented Human Intuition: A massive share of human knowledge is procedural and tacit. The hyper-specific, real-time intuition of an elite surgeon, a master mechanic, or a seasoned structural engineer cannot be fully scraped from a webpage, representing an enormous pool of unindexed information.
- Oral and Experiential Histories: Entire cultural traditions, localized languages, historical accounts, and geographical insights exist purely within individual human memory and oral storytelling, completely invisible to the digital web.
- Private and Dark Web Data: The vast majority of daily human communication occurs in private channels—encrypted messaging apps, internal corporate Slacks, family emails, and offline conversations. AI is legally and technically barred from accessing this massive, growing reservoir of human data.
The Data Math
- AI's Digital Sandbox: The absolute upper limit of high-quality text and multimodal data used to train the world's largest frontier models is measured in the tens to hundreds of Terabytes (extending to a few Petabytes if counting heavily filtered video and audio).
- The Human Mind Reservoir: Some neuroscientific estimates put a single human brain's functional memory capacity anywhere from 10 to 100 Terabytes. Multiplied across 8 billion living minds, that comes to roughly 80 to 800 Zettabytes of raw data, unique perspectives, and experiential knowledge locked inside humanity, making our collective intelligence vastly larger in volume than the narrow slice of the web AI feeds on.
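The arithmetic behind these two bullets is easy to check. Here is a rough sketch in Python that takes the quoted figures at face value. The constant and variable names are my own, and the 3 Petabytes standing in for "a few Petabytes" is an assumption, not a number from the answer.

```python
import math

# Decimal (SI) byte units
TB = 10**12
PB = 10**15
ZB = 10**21

WORLD_POPULATION = 8 * 10**9  # "8 billion living minds"

# AI training data: "tens of Terabytes" up to "a few Petabytes".
# 3 PB is an assumed stand-in for "a few".
ai_low, ai_high = 10 * TB, 3 * PB

# Per-brain capacity quoted in the text: 10 to 100 Terabytes
brain_low, brain_high = 10 * TB, 100 * TB

# Collective human total: population times per-brain capacity
human_low = WORLD_POPULATION * brain_low
human_high = WORLD_POPULATION * brain_high


def orders_of_magnitude(larger, smaller):
    """Base-10 logarithm of the ratio between two quantities."""
    return math.log10(larger / smaller)


# Narrowest gap: smallest human total against largest AI corpus
smallest_gap = orders_of_magnitude(human_low, ai_high)
# Widest gap: largest human total against smallest AI corpus
largest_gap = orders_of_magnitude(human_high, ai_low)

print(f"Human total: {human_low // ZB} to {human_high // ZB} ZB")
print(f"Gap: {smallest_gap:.1f} to {largest_gap:.1f} orders of magnitude")
```

Under these assumptions the collective total comes to 80 to 800 Zettabytes, and the gap works out to somewhere between 7 and 11 orders of magnitude. That is far more than the 2 to 3 orders quoted at the top of the answer, so the two figures are probably not measuring the same thing: raw memory capacity summed over every brain counts shared, overlapping knowledge billions of times.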