A weblog following developments around the world in FRBR: Functional Requirements for Bibliographic Records.

Maintained by William Denton, Web Librarian at York University. Suggestions and comments welcome at wtd@pobox.com.


Confused? Try What Is FRBR? (2.8 MB PDF) by Barbara Tillett, or Jenn Riley's introduction. For more, see the basic reading list.

Books: FRBR: A Guide for the Perplexed by Robert Maxwell (ISBN 9780838909508) and Understanding FRBR: What It Is and How It Will Affect Our Retrieval Tools edited by Arlene Taylor (ISBN 9781591585091) (read my chapter FRBR and the History of Cataloging).

28 March 2007

Comparing xISBN and thingISBN (3)

Filed under: Implementations, LibraryThing, OCLC — William Denton @ 7:31 am

Today I’m comparing how some of my mathematics books fare in LibraryThing’s thingISBN and OCLC’s xISBN services. Given an ISBN, each returns a list of ISBNs of the other manifestations (that is, editions) of the same work that it knows about. Of course, if a service doesn’t know about a book, or doesn’t think it matches any others, or (in LibraryThing’s case) the users haven’t grouped it, it won’t have anything to say about it.

Here’s a table showing the results. Each book takes up two rows. Yes, the formatting is a bit ugly, but you can bear it. The top row has the title and author. On the second row are some numbers, followed by the ISBN of my copy. The first number (x+t) is the combined, de-duped count of the ISBNs that thingISBN and xISBN know about between them. Next is the thingISBN count (t), then the xISBN count (x), then the count taken from WorldCat’s Editions tab (WC). (WorldCat’s numbers will never be greater than 25, because 25 is the limit of results it will show.) xISBN and WorldCat’s Editions tab are both from OCLC, but their sources aren’t always in sync. Follow the links to see the raw results.

Some things to notice about the list:

  • These books tend to the academic side of things, but some are quite popular. (As math books go.)
  • Most of them are paperbacks. University libraries would more likely have them in hardcover; however, xISBN is bound to do a good job of grouping the two together.
  • No-one on LibraryThing has my old Linear Algebra textbook. It’s probably not in use in first-year algebra courses now. My edition of Flatland is completely unknown to thingISBN, which is very surprising. No-one there has Mathematics and the Imagination either. My edition is a Penguin paperback, and I see it in used bookstores occasionally. The latter two results are unexpected.
  • thingISBN has a 28 count for my edition of Gödel, Escher, Bach, but xISBN doesn’t know about any others. xISBN is failing, or missing something.
  • Forever Undecided by Raymond Smullyan (mine is a trade paperback) gets a 5 at thingISBN but just a 2 at xISBN. I imagine it’s in a lot of libraries, though.
  • My two volumes of Heath’s translation of Euclid give confusing results. Volume 1 isn’t matched up with other editions at either place. Volume 2 gets a 19 from xISBN, but has no companions at thingISBN. Strange. Is it something to do with being Dover reprints? All of the 0-486 books are from Dover, who do a great job of reprinting old math books. Perhaps it’s because Euclid’s Elements has a confusing printing history.
  • Gödel’s Proof by Nagel and Newman is a classic, and thingISBN gives an 8, but xISBN only a 1. I’m sure it’s widely held in libraries and personal collections, so xISBN is failing or missing something.
  • Most things that aren’t extreme cases or probable misses or mistakes do well at both places. For example, Bertrand Russell’s Introduction to Mathematical Philosophy and Boolos and Jeffrey’s Computability and Logic, both old textbooks and classics in their fields, do about equally well.
  • I’m a bit surprised by the number of cases where thingISBN knows more than xISBN.
x+t   t   x  WC
Alan Turing: The Enigma (Hodges)
  8   8   1   4 0099116413
On Numbers and Games (Conway)
  2   2   2   4 0121863506
Elementary Differential Equations with Applications (Penney and Edwards)
  9   6   5   6 0132541297
Linear Algebra (Insel, Spence, and Friedberg)
  6   0   6   8 0135370191
Gödel, Escher, Bach: An Eternal Golden Braid (Hofstadter)
 28  28   1   0 0140055797
Mathematics and the Imagination (Kasner and Newman)
  4   0   4  20 0140803882
Forever Undecided: A Puzzle Guide to Gödel (Smullyan)
  5   5   2   3 0192821962
Reflections on Kurt Gödel (Wang)
  2   2   1   0 0262730871
The Fifty-Nine Icosahedra (Coxeter et al)
  3   2   3   1 038790770X
Differential Equations and Their Applications (Braun)
  9   4   7   7 0387908064
Uses of Infinity (Zippin)
  3   0   3   4 0394015630
The Universal History of Numbers (Ifrah)
  9   9   2   2 0471375683
Introduction to Mathematical Philosophy (Russell)
  8   6   6  23 0486277240
The Thirteen Books of Euclid’s Elements (v 1) (Euclid and Heath)
  1   1   1   8 0486600882
The Thirteen Books of Euclid’s Elements (v 2) (Euclid and Heath)
 19   1  19  25 0486600890
On Formally Undecidable Propositions (Gödel)
  1   1   1   4 0486669807
Proofs and Refutations (Lakatos)
  4   2   4   8 0521290384
Philosophy of Mathematics (Putnam and Benacerraf)
  4   4   2   2 052129648X
Computability and Logic (Boolos and Jeffrey)
  8   7   8   6 0521389232
Flatland (Abbott)
 28   0  28  25 0631029605
The Man Who Knew Infinity (Kanigel)
  4   4   3   8 0684192594
Gödel’s Proof (Nagel and Newman)
  8   8   1   2 0710070780
Geometry Revisited (Coxeter and Greitzer)
  3   2   2   3 088385600X
Calculus (Spivak)
  5   5   4   6 0914098772
x+t   t   x  WC
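To put a number on how often each service knows more, here is a minimal Python sketch that tallies the table above. The figures are copied from the x+t, t and x columns; the variable names are my own, and the sanity check simply assumes that a combined, de-duped count can't be smaller than the larger of the two counts or bigger than their sum.

```python
# (ISBN, x+t, t, x) for each book, copied from the table above.
ROWS = [
    ("0099116413", 8, 8, 1), ("0121863506", 2, 2, 2),
    ("0132541297", 9, 6, 5), ("0135370191", 6, 0, 6),
    ("0140055797", 28, 28, 1), ("0140803882", 4, 0, 4),
    ("0192821962", 5, 5, 2), ("0262730871", 2, 2, 1),
    ("038790770X", 3, 2, 3), ("0387908064", 9, 4, 7),
    ("0394015630", 3, 0, 3), ("0471375683", 9, 9, 2),
    ("0486277240", 8, 6, 6), ("0486600882", 1, 1, 1),
    ("0486600890", 19, 1, 19), ("0486669807", 1, 1, 1),
    ("0521290384", 4, 2, 4), ("052129648X", 4, 4, 2),
    ("0521389232", 8, 7, 8), ("0631029605", 28, 0, 28),
    ("0684192594", 4, 4, 3), ("0710070780", 8, 8, 1),
    ("088385600X", 3, 2, 2), ("0914098772", 5, 5, 4),
]

# Sanity check: a de-duped union lies between max(t, x) and t + x.
inconsistent = [isbn for isbn, both, t, x in ROWS
                if not max(t, x) <= both <= t + x]

# Count the books where each service returns more ISBNs than the other.
thing_ahead = sum(1 for _, _, t, x in ROWS if t > x)
x_ahead = sum(1 for _, _, t, x in ROWS if x > t)
tied = sum(1 for _, _, t, x in ROWS if t == x)

print("thingISBN ahead:", thing_ahead)
print("xISBN ahead:", x_ahead)
print("tied:", tied)
print("inconsistent rows:", inconsistent)
```

On these 24 books thingISBN is ahead for 10, xISBN for 9, and they tie on 5, with no inconsistent rows. It is a small sample, so the split says more about my shelves than about either service.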

It would be interesting, though somewhat onerous, to do a more in-depth project comparing thingISBN and xISBN, perhaps by comparing results for random samples of different kinds of books from different kinds of libraries. This would tell us something about how well xISBN works and what sorts of books LibraryThing users have and how well they’ve made their clusters. On the other hand, if you’re actually implementing something and need the best results, the same holds true as yesterday: use both.

Upshot of this comparison based on a small sample of my math books: Sometimes xISBN misses manifestations that must be there; something about the data or its algorithm stops it from doing the clustering. Sometimes thingISBN doesn’t know anything about a given book. For best results, combine and de-dupe results from both services.
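Combining and de-duping is only a few lines once you have both lists in hand. A rough Python sketch follows, assuming you have already fetched the two lists of ISBNs yourself. The function names are hypothetical, the sample values are made up for illustration (they are not real service output), and nothing here calls either service.

```python
def normalize_isbn(raw):
    """Strip hyphens and spaces, and upper-case a trailing 'x' check digit."""
    return raw.replace("-", "").replace(" ", "").upper()


def merge_isbns(thing_isbns, x_isbns):
    """Return the de-duped union of both lists, in first-seen order."""
    seen = set()
    merged = []
    for raw in list(thing_isbns) + list(x_isbns):
        isbn = normalize_isbn(raw)
        if isbn and isbn not in seen:
            seen.add(isbn)
            merged.append(isbn)
    return merged


# Made-up sample lists, for illustration only.
thing_sample = ["0-00-000000-1", "000000002x"]
x_sample = ["0000000001", "000000002X", "0000000003"]

merged = merge_isbns(thing_sample, x_sample)

# These three numbers correspond to the x+t, t and x columns in the table.
print(len(merged), len(thing_sample), len(x_sample))
```

Note that this treats ISBNs as plain strings. If one service returns a 10-digit ISBN and the other the 13-digit form of the same edition, they would need converting to a common form before comparing, which this sketch does not attempt.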

Tomorrow: who knows more about all the books in my library? Summary results only! No big table.

(Slightly edited after first posting.)