A time capsule of computer code is buried deep in the Arctic
- Buried underground near the Svalbard Global Seed Vault is the Arctic World Archive, which safeguards humanity's books, documents, and data.
- The Archive includes the massive GitHub library of software code behind the world's open-source applications.
- Information in the vault is stored on special media said to be durable for 1,000 years.
For a place that's so cold, Norway's Svalbard archipelago is downright hot when it comes to safeguarding some of humanity's most precious stuff. We've written before about the Svalbard Global Seed Vault that holds the world's backup supply of seeds capable of replanting our planet's flora should some horrible catastrophe occur. Since 2017, there's been another critical repository embedded about 91 meters down in that Svalbardian mountain: It's called the Arctic World Archive (AWA) and it holds books, documents, and data from across the globe.
The Arctic World Archive
The AWA describes itself as "home to manuscripts from the Vatican Library, political histories, masterpieces from different eras (including Rembrandt and Munch), scientific breakthroughs and contemporary cultural treasures." Government and research facilities can store their data at AWA, as can private companies and individuals, for a price.
"Our ambition is to be a secure world archive to help preserve the world's digital memory and ensure that the world's most irreplaceable digital memories of art, culture and literature are secured and made available to future generations." — Arctic World Archive
AWA's first deposits were made by the National Archives of Mexico and Brazil, and have been joined by a growing number of entities from over 15 countries. These include the National Museum of Norway, the European Space Agency, the Museum of the Person, and major global corporations.
GitHub’s vault within a vault
Within the AWA is the GitHub Arctic Code Vault, located roughly 76 meters below the Svalbard surface. GitHub is the preeminent library of programming code for those who develop open-source software applications. Each directory (think: folder) of code is a GitHub repository. Together, these repositories form a massive resource used continually by countless programmers storing and sharing their source code. GitHub says it has 37 million users and holds over 100 million repositories.
21 terabytes of GitHub data have already been moved to the code vault — or copied, presumably, since GitHub remains an active day-to-day resource — beginning with the 2019 deposit of 6,000 of the most important repositories GitHub held at the time. The latest transfer contains a snapshot of all of GitHub's active libraries as of February 2, 2020.
Says GitHub's director of strategic programs, Julia Metcalf, "Our mission is to preserve open-source software for future generations by storing your code in an archive built to last a thousand years." It's hoped that the source code in the vault will offer insight into today's programming and leave a trail of breadcrumbs that reveals the workings of apps from our era, apps that may become foundational for future applications.
How to store data for the future
The lifespan of any given storage medium is brief. Gone the way of the dinosaurs are floppy disks, cassettes, and so on — a 10-year-old may even wonder what a CD was. "It is easy to envision a future in which today's software is seen as a quaint and long-forgotten irrelevancy, until an unexpected need for it arises," says the GitHub Archive Program website. So, AWA data is stored on a specially developed, digital archival film called piqlFilm — GitHub alone has filled up 186 reels of it so far. This may at first seem sort of a retro approach, but it's not.
piql, one of the two partners behind the AWA, developed the film. The company claims it can "keep data alive" for over 1,000 years, so long as one has an app that can read it, such as the open-source app GitHub has created. piql asserts that its film has undergone "extensive longevity testing," and can withstand electromagnetic exposure.
piqlFilm is made up of layers of silver halide on a polyester backing. The data, when written, looks similar to a QR code, although it can hold far more information: Each frame in piqlFilm can pack about 8.8 million microscopic pixels. A reel of piqlFilm loaded with these frames is almost a kilometer long and can thus store a truly massive amount of data.
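For a rough sense of scale, the figures quoted in this article (21 terabytes of GitHub data on 186 reels) can be turned into an average per reel. The short sketch below is only a back-of-the-envelope estimate, not a piql specification: it assumes decimal units (1 TB = 1,000 GB) and that the data is spread evenly across the reels, and the function name is made up for illustration.

```python
# Back-of-the-envelope average using figures quoted in this article.
# Assumptions (not from piql or GitHub): decimal units (1 TB = 1,000 GB)
# and data spread evenly across all reels.

TOTAL_TERABYTES = 21   # GitHub data deposited, per the article
REELS_USED = 186       # piqlFilm reels filled by GitHub, per the article
GB_PER_TB = 1_000      # assumed decimal conversion


def average_gb_per_reel(total_tb: float, reels: int) -> float:
    """Return the mean number of gigabytes stored per reel."""
    if reels <= 0:
        raise ValueError("reels must be positive")
    return total_tb * GB_PER_TB / reels


if __name__ == "__main__":
    avg = average_gb_per_reel(TOTAL_TERABYTES, REELS_USED)
    print(f"{avg:.1f} GB per reel on average")
```

Under those assumptions, each reel works out to a little over 110 gigabytes on average; the true capacity of a reel may differ.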
Of course, it remains impossible to guess the capabilities of future humans (presumably) trying to decode all this data, so GitHub has a backup plan, a human-readable document called the "Tech Tree," which they describe as "a roadmap and Rosetta Stone for future curious minds inheriting the archive's data."
Warming up to Svalbard
Svalbard has a number of attributes that have made it attractive as a permanent storage site. It's a demilitarized zone by agreement between 42 nations. It's also quite remote. Plus, it's very cold and dry, for now.
When the seed vault was first contemplated, Svalbard seemed a place that could be counted on to remain frigid, with the underground vaults dug deep into the area's permafrost safe from moisture damage. However, conditions are changing more rapidly than anticipated thanks to climate change. The Arctic, says NOAA, is warming at "twice the rate relative to the rest of the globe."
Between 1971 and 2017, the temperature in the Svalbard area rose by 3-5° Celsius. Svalbard's current average temperature is -8.7° C, but models suggest that it will go up by 7° C with moderate global emission levels going forward, and by 10° C with heavy emissions.
Already, there has been at least one incident of ice melting and then freezing in the entrance to a seed vault tunnel. Also, less snow and ice means more rain, which can cause landslides in the previously stable local environment, and glaciers nearby are breaking up more frequently.
The seed vault's managers say, for now, that it looks like their vaults will be okay, and the people running the AWA and the GitHub Arctic Code Vault are also optimistic.