We are now testing a new subject browsing mechanism on The Online Books Page. This browser allows you to browse books by detailed subject terms. Like many library catalogs, it lets you browse terms alphabetically. But it also lets you look at clusters of books and topics around a particular subject area. The underlying mechanism is designed to be scalable and adaptable to a variety of collections.
When you're in the alphabetic view, you can browse through a subject list that runs from A to Z, like a dictionary or encyclopedia. The list will show you how many books are directly under a subject term, and whether it has subtopics (narrower terms). The list will also show you some alternative topic names that are commonly used, or have been used in the past, with links to the terms now preferred in most library catalogs.
You can page back and forth in the listings, or jump to a particular starting letter, or type in the start of a term that interests you, to see what shows up. Clicking on a term in the alphabetic view brings you to a cluster view for that term. The terms you see here are also terms you're likely to find used in your local library catalog.
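As a rough illustration of the "jump to the start of a term" behavior described above, here is a minimal sketch that finds the first heading at or after a typed prefix in a sorted list. The function name `browse_from`, the page size, and the sample headings are assumptions made for this example; this is not necessarily how the site itself is implemented.

```python
from bisect import bisect_left

def browse_from(headings, prefix, page_size=3):
    """Return one page of headings, starting at the first entry >= prefix.

    Assumes `headings` is already sorted case-insensitively.
    """
    keys = [h.casefold() for h in headings]
    start = bisect_left(keys, prefix.casefold())
    return headings[start:start + page_size]

# Sample headings, chosen only for illustration.
headings = ["Science", "Women", "Engineering", "Women authors", "Women in science"]
headings.sort(key=str.casefold)

page = browse_from(headings, "wom")
print(page)
```

Paging forward or backward would then amount to moving the start position by one page size in either direction.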
When you're in the cluster view (see example for Women), the page has two halves. On the left half, you might see descriptions of your subject term, notes on what gets filed under that term, and related terms. On the right half, you'll see books that were classified under the term, and, if there's room, you'll also see books filed under some of the related terms. All the terms are clickable, so you can easily navigate to broader, narrower, or related subjects. Because the page shows books filed under a range of subjects, you don't have to make lots of clicks to zero in on them. You can also go back to the alphabetical view by clicking on "Browse alphabetical list" near the top of the page. This will put you in that list, positioned at your current subject. (You can also type in another subject to start browsing from in the alphabetic list.)
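The "if there's room" behavior of the right half could be sketched along these lines: list the books filed directly under the term first, then fill any remaining slots from related terms. The function `cluster_books`, the `limit` parameter, and the placeholder titles are all hypothetical; the actual page may choose and order books differently.

```python
def cluster_books(term, books_by_term, related, limit=5):
    """Return (heading, title) pairs: the term's own books, then related ones."""
    shown = [(term, title) for title in books_by_term.get(term, [])][:limit]
    for other in related.get(term, []):
        room = limit - len(shown)
        if room <= 0:
            break  # the page is full; skip the remaining related terms
        shown.extend((other, title) for title in books_by_term.get(other, [])[:room])
    return shown

# Placeholder data for illustration only.
books_by_term = {
    "Women": ["Title A", "Title B"],
    "Women authors": ["Title C", "Title D"],
    "Feminism": ["Title E"],
}
related = {"Women": ["Women authors", "Feminism"]}

print(cluster_books("Women", books_by_term, related, limit=3))
```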
Until recently, we didn't have subject headings for most of the books on the Online Books Page. Most of the books have recently been assigned a single subject term through a largely automated process, based on their call number. The automatically assigned headings might not be particularly precise, complete, or in some cases, even accurate. But most of them should match the basic subject area of the book.
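To give a feel for why a call-number-based assignment tends to be broad rather than precise, here is a minimal sketch that maps the leading class letters of a call number to a single heading, falling back to shorter prefixes when no exact match exists. The table `CLASS_TO_HEADING` and the function name are invented for illustration; the real process and its mapping are not described here and may work quite differently.

```python
import re

# Hypothetical lookup table: leading class letters -> one broad heading.
CLASS_TO_HEADING = {
    "Q": "Science",
    "QA": "Mathematics",
    "T": "Technology",
}

def heading_for_call_number(call_number):
    """Return a broad heading for a call number, or None if no class matches."""
    match = re.match(r"[A-Z]+", call_number.strip().upper())
    if not match:
        return None
    letters = match.group(0)
    # Try the longest class prefix first, then progressively shorter ones.
    while letters:
        if letters in CLASS_TO_HEADING:
            return CLASS_TO_HEADING[letters]
        letters = letters[:-1]
    return None

print(heading_for_call_number("QA76.73.P98"))
print(heading_for_call_number("QC21"))
```

With a table this coarse, a computing book shelved under QA76 would come out as "Mathematics", which is the kind of imprecision noted above.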
We have done some more precise cataloging, focusing largely on selected areas. For instance, we have fairly detailed cataloging on many books related to women, and related to science and engineering. Other areas may have substantially sparser cataloging at present.
As of April 9, 2008, the following call number ranges have full subject cataloging:
Please let us know if there are any books that are clearly not filed in the right places. We also hope to have more areas with detailed cataloging in the future (hopefully through automated ingestion of pre-existing catalog records).
You'll probably want to move back and forth between the alphabetic view and the cluster view if you want to find all the topics related to your area of interest. While many topical relationships are included here, many are not, particularly when the related topics are alphabetically close. (This has to do with the way that subject authority structures were first constructed, back in the days of card catalogs. See below for more about the inner workings involved.)
The subject views here are built using the data in "authority files", which are library-maintained databases of preferred subject terms, related terms, aliases, etc., that are used for cataloging. This program uses Library of Congress Subject Headings records from the Penn Library, which are in turn coordinated with records at other agencies, in particular the Library of Congress Authorities.
We use these records to build a graph of subject terms and relationships. Then we use the bibliographic records of a particular collection, and a lexical analysis of their subject terms, to construct an overlay of the graph that's tuned to the particular terms used in that collection, adding and subtracting terms and relationships to ensure full coverage and no dead ends.
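The overlay idea can be sketched roughly as follows, under several simplifying assumptions: the authority data is reduced to a map from each term to its broader terms, the overlay keeps only terms used in the collection plus their broader ancestors, and terms the collection uses without an authority record are added as standalone nodes. The names `build_overlay`, `authority`, and `collection_terms`, and the sample relationships, are made up for this example; the lexical analysis step is not modeled, and the actual mechanism is more involved.

```python
def build_overlay(authority, collection_terms):
    """Build a collection-specific subgraph of a subject-term graph.

    authority: dict mapping a term to a list of its broader terms.
    collection_terms: set of terms actually assigned to books.
    """
    # Keep every collection term and walk upward through broader terms,
    # so each kept term can still be reached from above (no dead ends).
    keep = set()
    stack = list(collection_terms)
    while stack:
        term = stack.pop()
        if term in keep:
            continue
        keep.add(term)
        stack.extend(authority.get(term, []))

    # Terms without an authority record still get a node (full coverage).
    overlay = {term: {"broader": [], "narrower": []} for term in keep}
    for term in keep:
        for parent in authority.get(term, []):
            overlay[term]["broader"].append(parent)
            overlay[parent]["narrower"].append(term)
    for links in overlay.values():
        links["broader"].sort()
        links["narrower"].sort()
    return overlay

# Illustrative relationships only, not taken from a real authority file.
authority = {
    "Women scientists": ["Scientists", "Women"],
    "Women authors": ["Authors", "Women"],
    "Scientists": [],
    "Women": [],
    "Men": [],
}
collection_terms = {"Women scientists", "Local pamphlets"}

overlay = build_overlay(authority, collection_terms)
print(sorted(overlay))
```

Note that the work done here depends on the number of distinct headings, not on how many books carry each heading.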
In this case, the collection is The Online Books Page, which has about 25,000 records. In principle, we could also do this for any other bibliographic collection that is based on the same ontology as the underlying authority file. The graph scales with the number of headings used in the collection, not the number of items in the collection, and most of the graph analysis is localized, so we are hopeful that our mechanisms could adapt fairly well to other, larger collections. (In the Penn Library, for instance, we've also experimented with applying it to the full Franklin catalog.)
In our initial Online Books beta test, we have not adjusted the authority structure at all from what's in Penn's library system. This means that many potentially useful links are missing, and some odd links are present. In a future release, we hope to provide mechanisms to tune the underlying structure to a particular collection or usage pattern, without having to muck directly with the authority records in our library catalog.
We're very interested to hear what reactions you have to this beta test. Please write to the address below if you have questions, comments, or suggestions.
Copyright 1993-2007 by John Mark Ockerbloom (onlinebooks@pobox.upenn.edu)