The fabled ancient library of Alexandria was one of the great achievements in human history. Its mission was to compile all knowledge in one place. Its greatest fame came when it burned 2,000 years ago. The San Francisco-based Internet Archive is a conscious attempt by founder Brewster Kahle to recreate the library of Alexandria in digital format.
Everyone has heard of Wikipedia’s astonishing work to crowdsource the world’s best online encyclopedia, but the Internet Archive is not quite as well known. It is an equally surprising noncommercial project to capture and make available to everyone the world’s historical digital information. Its purposes include offering permanent access for researchers, historians, scholars, people with disabilities, and the general public to historical collections that exist in digital format.
Brewster Kahle founded the nonprofit Internet Archive in 1996, the same year he started the online analytics company Alexa, which was sold to Amazon in 1999 for $250 million, making Kahle a classic tech millionaire. Since then he has devoted his time to building the great digital library whose aim is to compile all public domain digital information. It is now among the most popular websites in the world, ranking number 175 in the vast digital ocean of over 600 million websites now on the Internet.
The Archive has very consciously partnered with the current Bibliotheca Alexandrina in Alexandria, Egypt. It has servers in San Francisco and mirror servers in Alexandria. In an event reminiscent of the ancient library of Alexandria, the Internet Archive had a fire in its San Francisco scanning center in November 2013. Some equipment was lost, but no data was lost and no one was hurt.
The Vision of The Internet Archive
"I come from the Internet generation, and the things we've seen work have not been these closed, walled gardens," Kahle said of his vision for the archive. "And what we're really about is having no centralized points of control. We want lots of winners. We want lots of publishers to win. We want lots of libraries to win."
"It's really meant to be a resource where you can come up with your own ideas," he said. "We want people to think deeper and then create new things that are worthy of putting in the library."
The Wayback Machine
The Archive has been receiving data donations from Alexa Internet and also has partnerships with the Library of Congress and the Smithsonian. The Internet Archive now includes texts, audio, movies and videos, and software, as well as archived web pages. The organization has been crawling the Web with its bot since it started, and the newest incarnation of its Wayback Machine has archived 386 billion web pages to date. For instance, you can look at TechSoup’s original CompuMentor website from 1996 using the Wayback Machine.
Of course the archive has ebooks and texts in vast numbers. To be specific, it has over five million books. The American Libraries collection is the largest. It has public domain volumes like Uncle Tom's Cabin and Little Women. One of the most downloaded items recently has been Newton's Principia (Mathematical Principles of Natural Philosophy). The books can be read online or downloaded in all the primary ebook formats. This section of the archive also has entire digitized collections like the John Adams Library at the Boston Public Library.
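For readers who want to look up an archived page programmatically, here is a minimal sketch in Python. It assumes the Wayback Machine's public availability endpoint at archive.org/wayback/available and the JSON shape it is generally documented to return. The domain compumentor.org, the helper names, and the sample response are illustrative assumptions, not details from the Archive itself.

```python
import json
from urllib.parse import urlencode

# Assumed public endpoint for asking which snapshot is closest to a date.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(url, timestamp):
    """Build a query for the snapshot of `url` closest to `timestamp` (YYYYMMDD)."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode({"url": url, "timestamp": timestamp})


def closest_snapshot(response_text):
    """Return the closest snapshot's URL from a JSON response, or None if absent."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


# Hypothetical response, written by hand to show the assumed shape.
SAMPLE_RESPONSE = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/19961219000000/http://compumentor.org/",
            "timestamp": "19961219000000",
            "status": "200",
        }
    }
})

if __name__ == "__main__":
    print(availability_query("compumentor.org", "19961201"))
    print(closest_snapshot(SAMPLE_RESPONSE))
```

The sketch stays offline on purpose: it only builds the query and parses a canned response. To try it against the live service, the query string could be fetched with `urllib.request.urlopen` and the response body passed to `closest_snapshot`.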
The Archive also features more contemporary digital print materials from OpenLibrary.org. OpenLibrary was originally founded to support the Print Disabled community, but is available to everyone. Anyone can get a free OpenLibrary card and borrow ebooks. This part of the project also provides specialized services for adaptive reading and information access for the blind and other persons with disabilities.
The Archive takes donations of physical books or print items. Libraries can also contact the Archive for help in getting collections digitized. After scanning, materials are permanently hosted on archive.org and integrated with Open Library. Partners in this service are the Library of Congress, Harvard University, The New York Public Library, the Smithsonian, and a thousand more Open Content Alliance partners. Primary donors to the books collection include Microsoft, Yahoo!, and The Sloan Foundation.
The audio archive is the place to download or listen to free music and audio. The audio books and poetry collection has digital recordings and MP3s from the Globe Radio Repertory and the classics of world literature; the Naropa Poetics Audio Archive, where you can listen to all of Ginsberg’s classes; LibriVox; Project Gutenberg; and many others.
The biggest collection is Community Audio. This is a great place for free MP3 music to listen to or download in all genres. The Live Music Archive at etree.org has live concerts in a lossless, downloadable format. There are literally thousands of Grateful Dead concerts there. The audio archive also has Podcasts, Radio Programs, a Spirituality & Religion section with sermons and talks, and Non-English Audio programs.
The Movies section has full-length feature films, short films, and world culture documentaries, while the Animation & Cartoons section has exactly what you might imagine. The Community Video section houses hundreds of thousands of videos and video blogs in several different languages from people who upload them under Creative Commons licenses.
Thanks to the support of the Knight Foundation, the video archive’s latest addition is a research library of nearly half a million hours of American television news broadcasts aired over the last four years. Kahle calls TV news just as ephemeral as websites, yet it is just as "pervasive and persuasive" in its influence on modern life, from culture to politics. The archive's latest project, TV News Search & Borrow, attempts to preserve those shows for future generations.
The project began in September 2012 with a collection of 350,000 news programs digitally recorded during the last three years from domestic TV networks and stations in San Francisco and Washington. It also includes a section devoted to news broadcasts about 9/11 from around the world. Thankfully the free service is searchable online by keywords.
Championing City-Wide Free Wireless Access
One of the Archive’s newer projects is to advocate for partnerships between municipal governments and nonprofit organizations to build city-wide wireless backbones, like the one it is helping to develop in the City of Richmond, California. Richmond is also where the Internet Archive sends every physical book that has been digitized, storing them in its many climate-controlled shipping containers.
The Virtual Reading Room
The Internet Archive’s new virtual reading room project is a free service for research scholars. It will enable people to get access directly to the archive’s servers using virtual machine software. Researchers will be able to run data mining algorithms on the huge amounts of information that the archive is compiling. To quote the Knight Foundation blog, "We believe this presents a powerful new model for digital libraries to support large-scale public interest data mining of the collections they hold in trust while being respectful stewards of that intellectual property. It will enable a new generation of research exploring the macro-level patterns of society itself."
What can you say about all this ambitious work? TechSoup for Libraries founder Sarah Washburn sums it up: “The Internet Archive is certainly the right organization to tackle such a vast and lofty goal. Brewster is a huge fan of libraries and the need for information to be freely available (the Archive is an official California library). A reading room for the masses would be pretty fantastic.”
Brewster Kahle received the 2013 LITA/Library High Tech Award for Outstanding Communication in Library and Information Technology this past year.
Images: courtesy of the Internet Archive