A weblog following developments around the world in FRBR: Functional Requirements for Bibliographic Records.

Maintained by William Denton, Web Librarian at York University. Suggestions and comments welcome at wtd@pobox.com.


Confused? Try What Is FRBR? (2.8 MB PDF) by Barbara Tillett, or Jenn Riley's introduction. For more, see the basic reading list.

Books: FRBR: A Guide for the Perplexed by Robert Maxwell (ISBN 9780838909508) and Understanding FRBR: What It Is and How It Will Affect Our Retrieval Tools edited by Arlene Taylor (ISBN 9781591585091) (read my chapter FRBR and the History of Cataloging).


Work superclusters

Posted by: William Denton, 10 December 2008 7:46 am
Categories: Aggregates, OpenFRBR

I wanted lots of Harry Potter ISBNs, so I was doing some superduping. For example, I superduped 1551922460, the ISBN of the 1999 hardcover Raincoast Books manifestation of Harry Potter and the Prisoner of Azkaban.

If you ask xISBN and thingISBN about that ISBN, then combine and dedupe their answers, you get 169 ISBNs. If you superdupe them, you get 1083. To superdupe, you take the first number from thingISBN that xISBN didn’t tell you about, and ask xISBN about it. That gives you a new set of ISBNs. Do that for all the thingISBN-only numbers, then reverse the process and ask thingISBN about the numbers xISBN told you about. Repeat, back and forth, until you’ve exhausted both sides and pulled all of the ISBNs out of their different partitions and put them into one big bucket.
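
The back-and-forth described above amounts to a breadth-first search over the two services. What follows is a minimal Ruby sketch of that idea, not the actual superduping script. The two lookups are hypothetical stand-ins backed by made-up hashes (the "A1" style identifiers are placeholders, not real ISBNs); a real version would call xISBN and thingISBN over the web. For simplicity the sketch asks both services about every new identifier, which should reach the same final bucket as alternating between them, assuming each service answers consistently.

```ruby
require "set"

# Made-up sample data standing in for the two services (an assumption for
# illustration only). Each hash maps an identifier to the identifiers that
# the service groups with it.
SAMPLE_XISBN = {
  "A1" => %w[A1 A2],
  "A2" => %w[A1 A2],
  "A3" => %w[A3 A4],
  "A4" => %w[A3 A4]
}.freeze

SAMPLE_THINGISBN = {
  "A1" => %w[A1],
  "A2" => %w[A2 A3],
  "A3" => %w[A2 A3],
  "A4" => %w[A4 A5],
  "A5" => %w[A4 A5]
}.freeze

# Hypothetical lookups: each takes an identifier and returns an array.
# An identifier the service does not know about yields an empty array.
SAMPLE_LOOKUPS = [
  ->(isbn) { SAMPLE_XISBN.fetch(isbn, []) },
  ->(isbn) { SAMPLE_THINGISBN.fetch(isbn, []) }
].freeze

# One round only: ask every service about the seed and merge the answers.
def combine_and_dedupe(seed, lookups)
  lookups.each_with_object(Set[seed]) do |lookup, seen|
    seen.merge(lookup.call(seed))
  end
end

# Keep asking every service about every newly discovered identifier until
# nothing new turns up.
def superdupe(seed, lookups)
  seen = Set[seed]
  queue = [seed]
  until queue.empty?
    isbn = queue.shift
    lookups.each do |lookup|
      lookup.call(isbn).each do |related|
        # Set#add? returns nil when the element was already present.
        queue << related if seen.add?(related)
      end
    end
  end
  seen
end

puts combine_and_dedupe("A1", SAMPLE_LOOKUPS).size
puts superdupe("A1", SAMPLE_LOOKUPS).size
```

With the sample data, one round from "A1" finds two identifiers, while superduping keeps following links until it has all five.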

I did this for all of the Harry Potter books, and after careful examination my keen eyes noticed something:

   ISBNs Title
    1083 isbns-01-philosophers-stone.txt
    1083 isbns-02-chamber-of-secrets.txt
    1083 isbns-03-prisoner-of-azkaban.txt
    1083 isbns-04-goblet-of-fire.txt
    1083 isbns-05-order-of-the-phoenix.txt
    1083 isbns-06-half-blood-prince.txt
     121 isbns-07-deathly-hallows.txt
       3 isbns-0x-beedle.txt
      53 isbns-0x-scamander.txt

Superduping the ISBNs of the first six Harry Potter books had given 1083 ISBNs for each! And sure enough they’re the same 1083 ISBNs. What’s going on here is that because of boxed sets and other collections, and possibly incorrect work-groupings by hand and by algorithm, once you start looking at one Harry Potter book through xISBN and thingISBN, you end up looking at all of them. Or almost all. The seventh one stands alone, but I think that will change in a year or two, and it will fall in with the others.

This work supercluster includes all of the Harry Potter books, the movies, some soundtracks, some scores, some derivative works like pop-up books, and more. It also includes books by Carl Sagan, Philip Pullman, and C.S. Forester (!).

This supercluster phenomenon is interesting. In part it’s caused by collected editions and boxed sets and no easy standard way of handling two works in one manifestation. Human and machine error is also involved. xISBN and thingISBN aren’t perfect, and superduping their results compounds errors from one into the other and you can end up with a bit of a mess.
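
To make the boxed-set effect concrete, here is a small Ruby sketch that treats every work grouping from every service as a set of identifiers and merges any groups that share a member (a union-find over the groups). The groupings and identifier names are invented for illustration and are not real xISBN or thingISBN output.

```ruby
# Made-up work groupings (an assumption for illustration only). Each inner
# array is one group of identifiers that some service treats as one work.
SAMPLE_GROUPS = [
  %w[stone-hc stone-pbk],            # book one
  %w[chamber-hc chamber-pbk],        # book two
  %w[stone-pbk boxed-set],           # one service files the boxed set with book one
  %w[chamber-pbk boxed-set],         # another files it with book two
  %w[other-novel-hc other-novel-pbk] # an unrelated work
].freeze

# Follow parent links until reaching an item that is its own parent.
def find_root(parent, item)
  parent[item] ||= item
  item = parent[item] until parent[item] == item
  item
end

# Merge all groups that share at least one member and return the resulting
# clusters as sorted arrays.
def superclusters(groups)
  parent = {}
  groups.each do |group|
    next if group.empty?

    root = find_root(parent, group.first)
    group.each { |member| parent[find_root(parent, member)] = root }
  end
  parent.keys.group_by { |item| find_root(parent, item) }.values.map(&:sort).sort
end

superclusters(SAMPLE_GROUPS).each { |cluster| puts cluster.join(", ") }
```

In this toy data the single boxed-set identifier is enough to pull books one and two into one cluster, while the unrelated novel stays on its own. Dropping the two boxed-set groupings leaves three separate clusters.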

(I tried superduping Pride and Prejudice and stopped when I started getting into the complete works of Shakespeare. I’ll post more about that if I try it again, but perhaps all the great works of English literature are in one giant confused FRBRy supercluster.)

Full FRBRization, where relationships between works and aggregate works (such as boxed sets and omnibus editions) are clearly specified, will mean this isn’t a problem. That’ll be a lot of work, though.

Using isbn2marc I found MARC records for 978 of the 1234 total ISBNs.

978 Harry Potter-related MARC records (1 MB MARC)

I ran them through the LC FRBRization tool and put them into OpenFRBR.

~/src/openfrbr$ ./script/console
Loading development environment (Rails 2.1.0)
>> Work.find(:all).size
=> 171
>> Expression.find(:all).size
=> 471
>> Manifestation.find(:all).size
=> 973
>> Person.find(:all).size
=> 22
>> Creation.find(:all).size
=> 138

OpenFRBR 2.1

Posted by: William Denton, 5 December 2008 7:50 am
Categories: OpenFRBR

I’ve been hacking away the last few days and got a few good things working in OpenFRBR so I called it 2.1 and put it up.

  • Click to edit attributes of works and expressions (but reload the page to see the changes in the heading, because I don’t know how to update two elements at once yet). No Submit button necessary, because I use in_place_editor, some Ajax magic.
  • You can reassign an expression to a different work using the select menu at the top of the page. Look at the works list and find a work that has one or two expressions. The name (identifier) will end in “novel/1” or “novel?/2” or something like that. Look at the work, then look at its expression, then reassign it. Delete the work with zero expressions, if you want. This makes it easier to fix the mistakes made by an FRBRizing algorithm (like the Library of Congress’s, which I used on 160 or so Harry Potter bib records).
  • I updated isbn2marc so it uses a WorldCat API key if you have one. This really speeds things up and makes it easier to get lots of MARC records. I originally wrote isbn2marc because I had no access to WorldCat and needed to poll any open Z39.50 server I could find to get MARC records. Now I do have access to it, which is nice, but the script works as well as ever if you don’t use WorldCat. I think I got MARC records at WorldCat for 140 or 150 ISBNs out of the 180 I had, and I found 10 or 20 more at open Z39.50 servers.

There are character set problems but I’m just ignoring them until the next version of Ruby.

Working on this has given me a better idea of what to do next, but what with Christmas and all I may not get much hacking done the next little while. Some parts of the entity relationships aren’t all fleshed in, but I’ll probably take the Work-Expression relation and expose the relation attribute there and make that all nice and editable. After that I could use the same approach for other entities and relationships, that is, doing for the Embodiment relation (Expression-Manifestation) what I do with Reification (what I call the Work-Expression relation). I haven’t documented the data model yet but you can check the source code, or ask.


OpenFRBR and Azkaban test

Posted by: William Denton, 17 November 2008 7:55 am
Categories: OpenFRBR

OpenFRBR’s showing a bunch of Harry Potter and the Prisoner of Azkaban stuff now. I got ISBNs from thingISBN and xISBN, got MARC records (found them for about half of the ISBNs), processed them with the new version of the LC FRBRization tool (temporary location) that Jodi Schneider found out about, cleaned up some MARC records with bad 240/243/245 fields, and loaded the results into OpenFRBR. I’ll post some thoughts about all this later. I’m getting a better idea of what’s involved.


OpenFRBR and LC Fifth Business test

Posted by: William Denton, 14 November 2008 7:22 am
Categories: OpenFRBR, Uncategorized

I took an ISBN for a manifestation of Fifth Business, the great novel by Robertson Davies, and superduped it with my new superduping script. That left me with a list of ISBNs that xISBN and thingISBN think are other manifestations of the same work. I ran those ISBNs through isbn2marc and dumped out MARCXML. I ran that MARCXML through the LC FRBR Display Tool and ended up with an XML file that looks in part like this:

<work>
   <mods:name type="personal">
      <mods:namePart>Davies, Robertson</mods:namePart>
      <mods:role>
         <mods:text>creator</mods:text>
      </mods:role>
   </mods:name>
   <mods:titleInfo>
      <mods:title>World of wonders</mods:title>
   </mods:titleInfo>
   <expression>
      <mods:typeOfResource>text</mods:typeOfResource>
      <manifestation>
         <imprint>
            <mods:titleInfo>
               <mods:title>World of wonders</mods:title>
            </mods:titleInfo>
            <mods:note type="statement of responsiblity">Robertson Davies.</mods:note>
            <mods:originInfo>
               <mods:publisher>Penguin books</mods:publisher>
               <mods:dateIssued>1977</mods:dateIssued>
            </mods:originInfo>
            <mods:physicalDescription>
               <mods:extent>315 p ; 20 cm.</mods:extent>
            </mods:physicalDescription>
            <mods:identifier type="ISBN">0-14-016796-X (pbk)</mods:identifier>
         </imprint>
      </manifestation>
   </expression>
</work>

I wrote a script that reads that file, picks out the bits that OpenFRBR can use, and loads it up into the system. (script/runner lets you use the Rails environment from the command line.)
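
As a rough illustration of that kind of extraction (not the actual loader script), the Ruby sketch below uses REXML to walk the work, expression and manifestation nesting shown above and pull out a few fields. Several things here are assumptions: the `results` wrapper element and its `xmlns:mods` declaration (the excerpt does not show the root of the file), the helper names `text_at`, `normalize_isbn` and `extract_works`, and the shape of the returned hashes. Loading the result into OpenFRBR's models is left out because the data model is not described here.

```ruby
require "rexml/document"

# The MODS version 3 namespace. The excerpt does not show where the tool
# declares the mods prefix, so declaring it on a wrapper is an assumption.
MODS = { "mods" => "http://www.loc.gov/mods/v3" }

# A cut-down copy of the excerpt, wrapped in a hypothetical root element.
SAMPLE_XML = <<~XML
  <results xmlns:mods="http://www.loc.gov/mods/v3">
    <work>
      <mods:name type="personal">
        <mods:namePart>Davies, Robertson</mods:namePart>
      </mods:name>
      <mods:titleInfo>
        <mods:title>World of wonders</mods:title>
      </mods:titleInfo>
      <expression>
        <mods:typeOfResource>text</mods:typeOfResource>
        <manifestation>
          <imprint>
            <mods:originInfo>
              <mods:publisher>Penguin books</mods:publisher>
              <mods:dateIssued>1977</mods:dateIssued>
            </mods:originInfo>
            <mods:identifier type="ISBN">0-14-016796-X (pbk)</mods:identifier>
          </imprint>
        </manifestation>
      </expression>
    </work>
  </results>
XML

# Text of the first node matching the path, or nil if there is none.
def text_at(node, path)
  found = REXML::XPath.first(node, path, MODS)
  found && found.text
end

# "0-14-016796-X (pbk)" becomes "014016796X": keep the leading run of
# digits, X and hyphens, then drop the hyphens.
def normalize_isbn(raw)
  return nil if raw.nil?

  run = raw[/[0-9Xx-]+/]
  run && run.delete("-").upcase
end

# Build one plain hash per work, with nested expressions and manifestations.
def extract_works(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, "//work").map do |work|
    {
      creator: text_at(work, "mods:name/mods:namePart"),
      title: text_at(work, "mods:titleInfo/mods:title"),
      expressions: REXML::XPath.match(work, "expression").map do |expression|
        {
          type: text_at(expression, "mods:typeOfResource"),
          manifestations: REXML::XPath.match(expression, "manifestation/imprint").map do |imprint|
            {
              publisher: text_at(imprint, "mods:originInfo/mods:publisher"),
              date: text_at(imprint, "mods:originInfo/mods:dateIssued"),
              isbn: normalize_isbn(text_at(imprint, "mods:identifier[@type='ISBN']"))
            }
          end
        }
      end
    }
  end
end

extract_works(SAMPLE_XML).each do |work|
  puts "#{work[:creator]}: #{work[:title]}"
end
```

Returning plain hashes keeps the parsing step separate from whatever creates the Work, Expression and Manifestation records, so a bad record can be inspected before anything is saved.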

The results are on OpenFRBR now. I’m going to keep on this track for a bit to see how much I can get out of the LC tool and what bits are missing. I ran into some trouble along the way with character encodings but for now I just remove troublesome MARC records. Now I can go give Jodi Schneider’s paper a good read!


OpenFRBR 2.0

Posted by: William Denton, 29 September 2008 7:28 am
Categories: OpenFRBR

Almost two years ago I posted OpenFRBR Manifesto Number One. Then nothing happened. Now I’m going to take another stab at it.

Over July and August, mostly while on the train back and forth to the IFLA conference and then while I was on holiday, I hacked up OpenFRBR 2.0. It’s a simple partial implementation of FRBR done in Ruby on Rails. You can go in and fool around; in fact, you can add or delete anything. I’ll reset it back to the initial test data when it needs it. Or you can download it and run it at home (I think you’ll need Rails 2.1.0 exactly).

I hope to:

  • Polish up some of the big missing bits (like work-to-work relationships), then get into the smaller things that need to be done. Or maybe I’ll do it the other way around.
  • Learn more about Ruby and Rails as I go. There’s much to be improved in this code.
  • Make it all available through Git so it’s easy for other people to get involved.
  • Write here and elsewhere about how it goes.
  • Work FRAD into it.
  • Use Robert Maxwell’s insights in FRBR: A Guide for the Perplexed to improve the data modelling.
  • Use the Library of Congress and OCLC algorithms with xISBN and thingISBN and other sources of data so that someone can enter in an ISBN and lots of related manifestations will get FRBRized and slotted into place.
  • Use the Harry Potter bibliographic universe as the test domain.
  • Show a working example of what a FRBRy catalogue could look like.
  • Have library geek fun.

Leaving a comment is probably best if you have any questions. I’ll be at the Access 2008 conference the rest of this week and will be glad to have a chat about this.


My Library Geek talk with Dan Chudnov

Posted by: William Denton, 6 November 2006 7:31 am
Categories: Audio/Video, Blog Mentions, OpenFRBR

You already knew I’m a library geek, but now I’m one of the Library Geeks. Last month at the Access 2006 conference in Ottawa Dan Chudnov and I had a lengthy chat in the bar of the Chateau Laurier hotel and now it’s online.

Library Geeks 008 – FRBR and OpenFRBR has the show notes with some links and a correction to a mistake I made about Canadian history (the Last Spike wasn’t gold and it was placed in 1885). You can subscribe to the podcast feed (updated 9 November to point to the right place) to get all the shows, or just listen to our talk about FRBR and OpenFRBR (38.5 MB MP3, 85 minutes).

It was great to meet Dan and I enjoyed the talk very much. What could be more fun than two library geeks talking about FRBR while drinking wine (you’ll hear the waitress come by to ask if we want another glass) in the bar of one of Canada’s best hotels?


OpenFRBR Manifesto Number One

Posted by: William Denton, 1 November 2006 7:23 am
Categories: OpenFRBR

OpenFRBR says it will build a complete free implementation of FRBR (Functional Requirements for Bibliographic Records).

OpenFRBR says, “Everyone FRBRize everything.”

OpenFRBR says that the entities, the relationships, and the user tasks are all equally important.

OpenFRBR says that both people and machines need good interfaces.

OpenFRBR says it will borrow the algorithms it can and invent the ones it must.

OpenFRBR says it is not an integrated library system. OpenFRBR says, “That which is not FRBR belongs to that which is not OpenFRBR.”

OpenFRBR says it is under the MIT License.

OpenFRBR looks at FRAR (Functional Requirements for Authority Records) and says, “Everyone FRARize everything.” When FRSAR (Functional Requirements for Subject Authority Records) is ready, OpenFRBR will look at it and say, “Everyone FRSARize everything.” Everything that OpenFRBR says about FRBR it says about FRAR and FRSAR.

— William Denton <wtd@pobox.com> (1 November 2006)


OpenFRBR

Posted by: William Denton, 3 September 2006 7:29 am
Categories: OpenFRBR

Watch out for an announcement, perhaps at the start of October, about OpenFRBR.

What is OpenFRBR? Well, the motto is “Everyone FRBRize everything.” That should give you a hint.