Rosetta disk

Source: TW

Concept

The Rosetta Disk is the physical companion of the Rosetta Digital Language Archive, and a prototype of one facet of The Long Now Foundation’s 10,000-Year Library. The Rosetta Disk is intended to be a durable archive of human languages, as well as an aesthetic object that suggests a journey of the imagination across culture and history. We have attempted to create a unique physical artifact which evokes the great diversity of human experience as well as the incredible variety of symbolic systems we have constructed to understand and communicate that experience.

The Disk surface shown here, meant to be a guide to the contents, is etched with a central image of the earth and a message written in eight major world languages:

“Languages of the World: This is an archive of over 1,500 human languages assembled in the year 02008 C.E. Magnify 1,000 times to find over 13,000 pages of language documentation.”

The text begins at eye-readable scale and spirals down to nano-scale. This tapered ring of languages is intended to maximize the number of people who will be able to read something immediately upon picking up the Disk, and to imply the directions for using it: ‘get a magnifier and there is more.’

On the reverse side of the Disk from the globe graphic are over 13,000 microetched pages of language documentation. Since each page is a physical rather than digital image, there is no platform or format dependency. Reading the Disk requires only optical magnification. Each page is 0.019 inches, or about half a millimeter, across. That is roughly the width of five human hairs. The pages can be read with a 650X microscope, and individual pages are clearly visible at 100X magnification.
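The page dimensions quoted above can be checked with a few lines of arithmetic. The sketch below converts the stated page width to millimeters and compares it to a hair width. The 0.1 mm hair width is an assumption (real hair varies widely), and the "apparent width" figure is only the page width multiplied by the magnification, not a model of any particular microscope.

```python
MM_PER_INCH = 25.4

# From the text: each page is 0.019 inches across.
PAGE_WIDTH_IN = 0.019

# Assumption, not from the text: a human hair is roughly 0.1 mm wide.
ASSUMED_HAIR_WIDTH_MM = 0.1


def page_width_mm(width_in: float = PAGE_WIDTH_IN) -> float:
    """Convert the page width from inches to millimeters."""
    return width_in * MM_PER_INCH


def hairs_across(width_in: float = PAGE_WIDTH_IN) -> float:
    """Page width expressed in assumed hair widths."""
    return page_width_mm(width_in) / ASSUMED_HAIR_WIDTH_MM


def apparent_width_mm(magnification: float, width_in: float = PAGE_WIDTH_IN) -> float:
    """Page width multiplied by the magnification (a simple linear scaling)."""
    return page_width_mm(width_in) * magnification


if __name__ == "__main__":
    print(f"page width: {page_width_mm():.4f} mm")
    print(f"about {hairs_across():.1f} assumed hair widths")
    for mag in (100, 650):
        print(f"at {mag}X a page spans about {apparent_width_mm(mag):.0f} mm")
```

Under these assumptions the page comes out just under half a millimeter wide, in line with the figures in the text.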

The 13,000 pages in the collection contain documentation on over 1,500 languages gathered from archives around the world. For each language we have several categories of data: descriptions of the speech community, maps of its location(s), and information on writing systems and literacy. We also collect grammatical information, including descriptions of the sounds of the language, how words and larger linguistic structures like sentences are formed, a basic vocabulary list (known as a “Swadesh List”), and, whenever possible, texts. Many of our texts are transcribed oral narratives. Others are translations, such as the beginning chapters of the Book of Genesis or the UN Declaration of Human Rights.

The Rosetta Disk is held in a four-inch spherical container that both protects the disk and provides additional functionality. The container is split into two hemispheres, with the three-inch Rosetta Disk sitting in an indent on the flat surface where the two hemispheres meet. The upper hemisphere is made of optical glass and doubles as a 6X viewer, giving visual access deeper into the tapered text rings. The bottom hemisphere is high-grade stainless steel. We have machined a hollow cylinder into the bottom hemisphere that holds a stainless steel ribbon on which disk caretakers can etch their names, locations, and dates, hopefully creating a unique pedigree for each Rosetta object as it travels through time and human hands. A small stylus tool is included so that future caretakers can add further information.

At the very least, the Rosetta Disk provides an informative overview of human linguistic diversity in the 21st century. However, it may do much more. The translations on the disk, for example, are a close analog to the Rosetta Stone, whose parallel texts (in that case unintentionally) enabled the decipherment of Egyptian hieroglyphics. It isn’t a great stretch to imagine that the language information on this Disk could provide the key to the (re)discovery of valuable society-sustaining knowledge far into the future.

The Rosetta Disk is being designed and developed through the collaboration of artists, designers, linguists and archivists including Kurt Bollacker, Stewart Brand, Paul Donald, Jim Mason, Kevin Kelly, Alexander Rose, and Laura Welcher. Primary funding for the first Rosetta Disk and the project that grew out of it came from the generous support of Charles Butcher and the Lazy Eight Foundation.

Technology

For the extreme longevity version of the Rosetta database, we have selected a new high density analog storage device as an alternative to the quick obsolescence and fast material decay rate of typical digital storage systems. This technology, developed by Los Alamos Laboratories and Norsam Technologies, can be thought of as a kind of next generation microfiche. However, as an analog storage system, it is far superior. A 2.8 inch diameter nickel disk can be etched at densities of 200,000 page images per disk, and the result is immune to water damage, able to withstand high temperatures, and unaffected by electromagnetic radiation.(5) This makes it an ideal backup for a long-term text image archive. Also, since the encoding is a physical image (no 1’s or 0’s), there is no platform or format dependency, guaranteeing readability despite changes in digital operating systems, applications, and compression algorithms.

Reading the disk requires a microscope, either optical or electron, depending on the density of encoding. The microscope could be combined with an Optical Character Recognition system to read the text back into whatever digital formats are relevant at the time of reading. We are keeping our encoding at a scale readable by a 1000X optical microscope, giving us a total disk storage capacity of around 30,000 pages of text.
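To get a feel for the two capacities mentioned above (200,000 page images at maximum density versus around 30,000 at optically readable scale), the sketch below estimates how large each page could be on a 2.8 inch disk. It rests on a strong simplifying assumption that is not from the text: pages are treated as squares that tile the entire disk face with no margins or gaps, so the results are upper bounds rather than actual page sizes.

```python
import math

MM_PER_INCH = 25.4


def page_side_mm(disk_diameter_in: float, pages: int) -> float:
    """Side length of a square page if `pages` squares tiled the whole disk face.

    Assumes no margins, no gaps, and full use of the circular area.
    """
    radius_mm = disk_diameter_in * MM_PER_INCH / 2
    disk_area_mm2 = math.pi * radius_mm ** 2
    return math.sqrt(disk_area_mm2 / pages)


if __name__ == "__main__":
    # Figures from the text: a 2.8 inch disk, 200,000 pages at maximum
    # density, about 30,000 pages at optically readable scale.
    for pages in (200_000, 30_000):
        side = page_side_mm(2.8, pages)
        print(f"{pages:>7,} pages -> at most {side:.3f} mm per page side")
```

Under this idealized tiling, dropping from 200,000 to 30,000 pages makes each page side about 2.6 times longer, which is the trade of capacity for optical readability that the paragraph describes.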

Embracing the archive principle, “Lots of Copies Keep Stuff Safe”, we intend to create a version of the disk that can be mass produced and broadly distributed. Disk production costs are currently very high, but we are exploring ways around this, possibly by using a different kind of disk material. For now, we are working on developing a limited edition disk prototype.

Wearable

After three and a half years of research and development, we are delighted to announce the release of our first Rosetta Wearable Disk. This version of the Rosetta Disk uses a manufacturing process similar to that of the first edition, with the resulting archive microscopically formed in nickel and readable with optical magnification. The main difference is that the final archive is about 2 centimeters in diameter, a size that can be comfortably worn on the human body.

To develop the Rosetta Wearable Disk, we have been working with the company NanoRosetta, which uses a process that is faster than the one used to make the original Rosetta Disk, and that does not make use of silicon or a focused ion beam. This new process is similar to microchip lithography, and uses a highly focused laser to write directly onto a photosensitive material coated on a glass plate. The recorded features are then developed like film to form the microscopic information. Next the plate is electroformed, resulting in a thin disk made of solid nickel. The information on the disk is raised very slightly from the surface (the text looks embossed) and can be read just like you would read pages in a book, but with optical magnification.

[Image: Rosetta Wearable and Rosetta V1 Disk]

Given that the new process is reliable, fast, and less expensive than the one used for the original Rosetta Disk, the Rosetta Wearable Disk is the first version of the Rosetta Disk that could potentially meet the long-desired goal of broad dissemination, in keeping with the long-term archiving strategy of LOCKSS (“Lots of Copies Keep Stuff Safe”).

This wearable version, like the original Rosetta Disk, has two sides. One side has instructions in eight different languages and scripts (Bahasa Indonesia, English, Hindi, Mandarin, Modern Standard Arabic, Spanish, Swahili, and Russian). The instructions translate into English as “Languages of the world: This is an archive of over 1,000 human languages assembled in the year 02016 C.E. Magnify 100 times to find over 1,000 pages of language documentation.” Each instruction starts at a human-eye readable size, and then spirals inward around a globe graphic, ending at the microscopic scale. This indicates to the reader “find something to magnify this with, and there is more.”

The other side of the pendant contains the archive, with over 1,000 microscopic pages. While the smaller size of the disk is an advantage for portability, it also leaves less surface area for the archive contents. To keep the “pages” in the archive at a size where they can be read with optical magnification, we needed to fit roughly 1,000 or fewer of them on the disk. The original Rosetta Disk has over 1,500 languages and 13,000 pages of information, so this meant we needed to include fewer languages, fewer pages for each language, or some combination of the two. Yet constraints breed creativity, and we chose to meet this new challenge by altering the contents somewhat, which now include:

The Preamble to the Universal Declaration of Human Rights (327 languages)
Swadesh vocabulary lists assembled by the PanLex Project (719 languages)

“The Clock of the Long Now” by Stewart Brand
Updated diagrams for the 10,000 Year Clock
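The page budget of roughly 1,000 that led to this contents list can be sanity-checked by scaling the original page count with the disk area. This is only a rough sketch. It assumes the same page size on both disks, that the usable area scales with the square of the diameter, and that the original archive filled a three-inch disk. None of these assumptions is confirmed by the text beyond the quoted diameters and page counts.

```python
CM_PER_INCH = 2.54

# Figures quoted in the text.
ORIGINAL_PAGES = 13_000
ORIGINAL_DIAMETER_CM = 3 * CM_PER_INCH  # the three-inch original disk
WEARABLE_DIAMETER_CM = 2.0              # "about 2 centimeters"


def scaled_page_budget(original_pages: int,
                       original_diameter_cm: float,
                       new_diameter_cm: float) -> float:
    """Scale a page count by the ratio of disk areas.

    Assumes equal page size on both disks, so page count is proportional
    to area, and area is proportional to diameter squared.
    """
    area_ratio = (new_diameter_cm / original_diameter_cm) ** 2
    return original_pages * area_ratio


if __name__ == "__main__":
    budget = scaled_page_budget(ORIGINAL_PAGES,
                                ORIGINAL_DIAMETER_CM,
                                WEARABLE_DIAMETER_CM)
    print(f"estimated page budget for the wearable disk: about {budget:.0f}")
```

Under these assumptions the estimate lands a little under 1,000 pages, the same order as the "roughly 1,000 or fewer" constraint described above.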

The contents are in keeping with the original Rosetta Disk in that they represent many of the world’s human languages. The language contents are also parallel; that is, the same information is given for each language. The two main kinds of linguistic content are still a parallel text and a parallel vocabulary list. The text we have chosen is the Preamble to the Universal Declaration of Human Rights (“UDHR”), which is available in over 300 languages, and the parallel vocabulary consists of Swadesh vocabulary lists compiled by Long Now’s PanLex Project.

[Image: Rosetta Wearable Archive Side]

In a major departure from how the contents of the original Rosetta collection were assembled, the Universal Declaration and PanLex data are all “born digital”. This meant we had a lot of control over font and font size, but also entailed making choices. The goal was to maximize the amount of language content on the disk while preserving maximum legibility.

In order to make these decisions, we needed access to better microscope equipment to examine the texts in detail at the character level. So in November 02016, The Rosetta Project submitted a proposal and was awarded access to staff and facilities at Lawrence Berkeley National Lab. The successful proposal, titled “The Rosetta Disk – An Exploration into Very Long-term Archiving”, focused on the need for access to high-powered microscopes and imaging technology available at the Lab to prepare and evaluate components of a new Rosetta Disk prototype. The user program provided Rosetta Project staff access to the Molecular Foundry, Advanced Light Source, and National Center for Electron Microscopy.

Another goal of this proposal was to develop long-term relationships with the staff and scientists at the Lab who have an interest in exploring new materials and methods for long-term archiving, which may be very helpful in developing new versions of the Rosetta Wearable Disk. Some intriguing new possibilities have already emerged from early discussions, including ways to “stamp” new disks, ways to protect the disk through atomic layer deposition, and possible methods of creating polychromatic disks, which would allow for archiving of color graphics, photography and art. These techniques may allow us to radically change not only how we archive, but also what we are able to archive for the long term.

Another advantage of having “born digital” material is that we can make the contents of the wearable Rosetta Disk available as open digital data as well as a physical artifact. We hope this will allow for all kinds of interesting experimentation in the archival longevity of both forms. The Universal Declaration of Human Rights texts we are using are all available in Unicode, which is a much preferred long-term format, and the PanLex Swadesh lists are now part of the Natural Language Toolkit collection and available as a corpus for computational tinkering. We also benefitted greatly from the impressively multilingual suite of Google Noto fonts, which we used throughout the archive.


[Update: December 11, 02023] A limited supply of new Rosetta Archive Pendants is available to donors of $1,000 or more. You can learn about this new edition at the Artifacts Page or start your donation now.