The Internet Archive's Map of Book Subjects

This map offers an alternative way to browse the 2,619,833 images contained in the Internet Archive's book collection. It shows 5,500 different subjects, arranged algorithmically by their thematic relationships. The size of each link reflects the number of images available for that topic. Clicking on a link opens the Flickr page containing all the pictures for that subject. Rolling over a link highlights all the topics that are directly connected to that subject.

The map can be dragged with your mouse and you can zoom in and out using the mouse wheel or multi-touch gestures. You can also use your browser's search box to find and highlight topics you are interested in (press CTRL-F on your keyboard to open it).

Note: on touch screens that do not support hovering, the first click on a link will highlight the related topics and the second click will open the Flickr page.

The relationship data for this map was generated by first retrieving all the tags of the Internet Archive's images on Flickr and then connecting those subjects that appear together on the same image. The resulting similarity matrix was processed using the t-Distributed Stochastic Neighbor Embedding (t-SNE) technique, which places topics closer together the stronger their relationship is. In the last step, the layout is cleaned up automatically so that no text blocks overlap.
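As a rough illustration of the first step, the sketch below counts how often two subjects appear together on the same image and arranges the counts in a symmetric matrix. It is not the code used to build this map: the sample tags, the function names and the choice of raw co-occurrence counts as the similarity measure are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample data: each entry is the set of subject tags on one image.
image_tags = [
    {"Railroads", "Locomotives"},
    {"Railroads", "Bridges"},
    {"Railroads", "Locomotives", "Bridges"},
    {"Zoology", "Birds"},
    {"Bees"},
]


def cooccurrence_counts(tag_sets):
    """Count, for every pair of subjects, the images that carry both tags."""
    counts = Counter()
    for tags in tag_sets:
        # Sorting gives each unordered pair a single canonical key.
        for a, b in combinations(sorted(tags), 2):
            counts[(a, b)] += 1
    return counts


def similarity_matrix(tag_sets):
    """Return the sorted subject list and a symmetric matrix of pair counts."""
    counts = cooccurrence_counts(tag_sets)
    subjects = sorted(set().union(*tag_sets))
    index = {subject: i for i, subject in enumerate(subjects)}
    matrix = [[0] * len(subjects) for _ in subjects]
    for (a, b), n in counts.items():
        matrix[index[a]][index[b]] = n
        matrix[index[b]][index[a]] = n
    return subjects, matrix


subjects, matrix = similarity_matrix(image_tags)
for subject, row in zip(subjects, matrix):
    print(f"{subject:12s} {row}")
```

In this toy data "Bees" never shares an image with any other subject, so its row contains only zeros. A layout step that works from such a matrix has no information tying that subject to any cluster.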

The automatic nature of the process also explains some oddities in the resulting layout: sometimes a topic that clearly belongs to a certain cluster is placed far away from it (e.g. "Locomotives" is not part of "Railroads"). Also, some clusters that are thematically very close do not appear as neighbors on the map (e.g. "Bees/Bees" is not part of the "Zoology" cluster). The reason is that no editor has created a single connection between those subjects, probably because different people were involved in classifying those books and used different sets of labels for the same topic.

Created by Mario Klingemann @Quasimondo

Version 2.0, October 18th, 2014