A weblog following developments around the world in FRBR: Functional Requirements for Bibliographic Records.

Maintained by William Denton, Web Librarian at York University. Suggestions and comments welcome at wtd@pobox.com.


Confused? Try What Is FRBR? (2.8 MB PDF) by Barbara Tillett, or Jenn Riley's introduction. For more, see the basic reading list.

Books: FRBR: A Guide for the Perplexed by Robert Maxwell (ISBN 9780838909508) and Understanding FRBR: What It Is and How It Will Affect Our Retrieval Tools edited by Arlene Taylor (ISBN 9781591585091) (read my chapter FRBR and the History of Cataloging).

Work superclusters

Posted by: William Denton, 10 December 2008 7:46 am
Categories: Aggregates, OpenFRBR

I wanted lots of Harry Potter ISBNs, so I was doing some superduping. For example, I superduped 1551922460, the ISBN of the 1999 hardcover Raincoast Books manifestation of Harry Potter and the Prisoner of Azkaban.

If you combine and dedupe what xISBN and thingISBN each return for that ISBN, you get 169 ISBNs. If you superdupe them, you get 1083. That is, you take the first number from thingISBN that xISBN didn’t tell you about, and ask xISBN about it. That gives you a new set of ISBNs. Do that for all the thingISBN-only numbers, then reverse the process and ask thingISBN about the numbers xISBN told you about. Repeat, back and forth, until you’ve exhausted both sides and pulled all of the ISBNs out of their different partitions and put them into one big bucket.
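
The back-and-forth above amounts to following links until nothing new turns up. Here is a minimal Ruby sketch of that idea. Everything in it is an assumption made for illustration: the two hashes are toy stand-ins for xISBN and thingISBN (no real service calls, no real ISBNs), and `superdupe` is a hypothetical helper name. It also asks both services about every new number rather than strictly alternating, which should reach the same bucket as long as each service groups ISBNs into non-overlapping work-sets.

```ruby
require 'set'

# Toy stand-ins for the two services (hypothetical data, not real ISBNs).
# Each maps an identifier to the identifiers it groups into the same work.
XISBN = {
  'hp3-hardcover' => %w[hp3-hardcover hp3-paperback],
  'hp3-paperback' => %w[hp3-hardcover hp3-paperback],
  'boxed-set'     => %w[boxed-set hp1-paperback],
  'hp1-paperback' => %w[boxed-set hp1-paperback]
}.freeze

THINGISBN = {
  'hp3-hardcover' => %w[hp3-hardcover boxed-set],
  'boxed-set'     => %w[hp3-hardcover boxed-set]
}.freeze

# Each lookup returns the related identifiers, or nothing if unknown.
LOOKUPS = [
  ->(isbn) { XISBN.fetch(isbn, []) },
  ->(isbn) { THINGISBN.fetch(isbn, []) }
].freeze

# Keep asking every service about every newly seen identifier until
# no service reports anything that is not already in the bucket.
def superdupe(seed, lookups)
  bucket = Set.new([seed])
  queue = [seed]
  until queue.empty?
    isbn = queue.shift
    lookups.each do |lookup|
      lookup.call(isbn).each do |related|
        # Set#add? returns nil when the element was already present.
        queue << related if bucket.add?(related)
      end
    end
  end
  bucket
end

puts superdupe('hp3-hardcover', LOOKUPS).to_a.sort.join(', ')
```

In this toy data the second service files one edition alongside a boxed set, and that single link is enough to pull a different book's paperback into the same bucket as the seed.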

I did this for all of the Harry Potter books, and after careful examination my keen eyes noticed something:

   ISBNs Title
    1083 isbns-01-philosophers-stone.txt
    1083 isbns-02-chamber-of-secrets.txt
    1083 isbns-03-prisoner-of-azkaban.txt
    1083 isbns-04-goblet-of-fire.txt
    1083 isbns-05-order-of-the-phoenix.txt
    1083 isbns-06-half-blood-prince.txt
     121 isbns-07-deathly-hallows.txt
       3 isbns-0x-beedle.txt
      53 isbns-0x-scamander.txt

Superduping the ISBNs of the first six Harry Potter books had given 1083 ISBNs for each! And sure enough they’re the same 1083 ISBNs. What’s going on here is that because of boxed sets and other collections, and possibly incorrect work-groupings by hand and by algorithm, once you start looking at one Harry Potter book through xISBN and thingISBN, you end up looking at all of them. Or almost all. The seventh one stands alone, but I think that will change in a year or two, and it will fall in with the others.

This work supercluster includes all of the Harry Potter books, the movies, some soundtracks, some scores, some derivative works like pop-up books, and more. It also includes books by Carl Sagan, Philip Pullman, and C.S. Forester (!).

This supercluster phenomenon is interesting. In part it’s caused by collected editions and boxed sets and no easy standard way of handling two works in one manifestation. Human and machine error is also involved. xISBN and thingISBN aren’t perfect, and superduping their results compounds errors from one into the other and you can end up with a bit of a mess.

(I tried superduping Pride and Prejudice and stopped when I started getting into the complete works of Shakespeare. I’ll post more about that if I try it again, but perhaps all the great works of English literature are in one giant confused FRBRy supercluster.)

Full FRBRization, where relationships between works and aggregate works (such as boxed sets and omnibus editions) are clearly specified, will mean this isn’t a problem. That’ll be a lot of work, though.

Using isbn2marc I found MARC records for 978 of the 1234 total ISBNs.

978 Harry Potter-related MARC records (1 MB MARC)

I ran them through the LC FRBRization tool and put them into OpenFRBR.

~/src/openfrbr$ ./script/console
Loading development environment (Rails 2.1.0)
>> Work.find(:all).size
=> 171
>> Expression.find(:all).size
=> 471
>> Manifestation.find(:all).size
=> 973
>> Person.find(:all).size
=> 22
>> Creation.find(:all).size
=> 138

Adams and Santamauro, Successive Entry etc.

Posted by: William Denton, 20 October 2008 12:42 pm
Categories: Aggregates

“Successive Entry, Latest Entry, or None of the Above? How the MARC21 Format, FRBR and the Concept of a Work Could Revitalize Serials Management,” Kurt Blythe’s notes on a talk by Katherine Adams and Britta Santamauro, The Serials Librarian 54 (3/4), 2008.

ABSTRACT: Current cataloging practices are insufficient to the task of providing access to serial content. The presenters acknowledge that RDA (Resource Description and Access), FRBR (Functional Requirements for Bibliographic Records) and the CONSER (Cooperative Online Serials) standard record will improve libraries’ ability to respond to the exponential increase in information resources brought about by the Internet but feel that more may be done. Adams and Santamauro propose to apply the concepts of FRBR to serials cataloging and database design, in addition to the user interface, thereby saving time in cataloging and providing the user with cleaner records.

Also noted by the Serials Cataloger. I’ve got net connection problems and am running behind.


Working Group on Aggregates

Posted by: William Denton, 18 August 2008 10:48 am
Categories: Aggregates, Conferences, IFLA

Thursday, the last day of programming at IFLA 2008, was when the Working Group on Aggregates met. Ed O’Neill of OCLC is the chair, and there were six others at the table, including Barbara Tillett and Judy Kuhagen of the Library of Congress. The three of them were the most active in the discussion, though I think Maja Žumer would have spoken up too had she been there. About a dozen people watched, most of whom I recognized from earlier FR* meetings.

There were three handouts for group members, and, generously, we observers all got a copy. The discussion centred around two things on the agenda, so I’ll just summarize what was said. No decision was reached, so there’s no official answer on how FRBR will handle aggregates yet.

Here are the “data model principles” that had been set out in some earlier discussions to help test the three ways of modelling aggregates:

  • Inheritance: Properties (attributes and relationships) are inherited by subordinate entities (children) from superior entities. The properties of a work are inherited by its expressions; the properties of an expression are inherited by its manifestations; and the properties of a manifestation are inherited by its items.
  • Universality: If an entity is a work in any of its manifestations, it must be a work in all of its manifestations. [O'Neill explained: the criteria for deciding if something is a work shouldn't depend on its manifestation.]
  • Distinctness: A non-aggregate work is the smallest distinct and autonomous entity.
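
One way to picture the inheritance principle is as property lookup that falls back up the chain from item to manifestation to expression to work. The Ruby sketch below is only an illustration under that assumption: the `Entity` class, the `property` method, and the sample values (including the made-up barcode) are hypothetical and come from neither FRBR nor the working group's handouts.

```ruby
# Hypothetical sketch: each entity holds its own properties plus a link
# to its superior entity, and lookups fall back up that chain.
class Entity
  attr_reader :kind, :own, :parent

  def initialize(kind, own, parent = nil)
    @kind = kind
    @own = own
    @parent = parent
  end

  # Return this entity's own value, else the nearest inherited one,
  # else nil when nothing up the chain defines the property.
  def property(name)
    return own[name] if own.key?(name)

    parent&.property(name)
  end
end

# Illustrative values only.
WORK = Entity.new(:work, { title: 'The Expedition of Humphry Clinker' })
EXPRESSION = Entity.new(:expression, { language: 'English' }, WORK)
MANIFESTATION = Entity.new(:manifestation, { carrier: 'print' }, EXPRESSION)
ITEM = Entity.new(:item, { barcode: 'hypothetical-0001' }, MANIFESTATION)

puts ITEM.property(:title)    # inherited from the work
puts ITEM.property(:language) # inherited from the expression
```

Lookups only travel upward here, so the work knows nothing about the carrier of any one manifestation.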

After that they got on to the three models they had been considering:

  • Work-of-Parts Model: The aggregate as a whole is a work; individual components are simply parts of the work. [Tillett pointed out that "component" already means "component work" in FRBR and they didn't want to confuse the terminology.]
  • Manifestation-of-Works Model: The manifestation is an aggregate of works and may include an “aggregating” work.
  • Work-of-Works Model: Aggregates are works that are comprised of other works.

No-one had any other models to consider, and nobody there thought the first one was valid, so it came down to Manifestation-of-Works (with O’Neill as main advocate) vs. Work-of-Works (with Tillett as main advocate).
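
To make the difference between the two surviving models concrete, here is a rough Ruby sketch of a three-novel omnibus represented both ways. It is an interpretation, not anything the group produced: the struct names, the placeholder titles, and the `work_titles` helper are all hypothetical, and the optional “aggregating” work of the Manifestation-of-Works model is left out.

```ruby
# Hypothetical structures for illustration only.
AggWork = Struct.new(:title, :parts)      # parts: component works, if any
Publication = Struct.new(:title, :works)  # works embodied in the manifestation

NOVELS = ['Novel One', 'Novel Two', 'Novel Three'].map do |title|
  AggWork.new(title, [])
end.freeze

# Manifestation-of-Works: the publication embodies the three novels
# directly, and the aggregation exists only at the manifestation level.
MANIFESTATION_OF_WORKS = Publication.new('Trilogy omnibus', NOVELS)

# Work-of-Works: the aggregate is itself a work whose parts are the
# novels, and the publication embodies that one aggregate work.
WORK_OF_WORKS = Publication.new('Trilogy omnibus',
                                [AggWork.new('Trilogy', NOVELS)])

# List every work reachable from a set of works, parts included.
def work_titles(works)
  works.flat_map { |work| [work.title] + work_titles(work.parts) }
end

puts work_titles(MANIFESTATION_OF_WORKS.works).inspect
puts work_titles(WORK_OF_WORKS.works).inspect
```

The same three novels are reachable either way. What changes is whether an extra work sits between them and the publication.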

A few points from the discussion:

  • Tillett said inheritance holds down through Work, Expression, Manifestation, Item, but there’s no inheritance between Works in the Work-of-Works model. There is a Work-to-Work relation (whole-part, sequential, etc.) between them, with no inheritance happening.
  • There’s a difference between an aggregate that’s an augmentation, for example adding an introduction to a novel, and one that’s a collection, for example putting the three novels of a trilogy together into one book. Kuhagen made the distinction between pre-formed and post-formed collections, where the difference is when the decision to make the collection was made.
  • But none of that matters for this discussion, Tillett said, and she said something she repeated a few times through the meeting: FRBR is meant to work at a high conceptual level, and not get into applications. (Applications as in “the model applied to this situation,” not application as in “computer software.”) FRBR should only think about aggregates in general, any kind of aggregate, and not worry what type an aggregate is. That level of detail belongs in applications of the model, where people can make their own local rules and interpretations.
  • O’Neill said there would probably eventually be a subclassing of aggregates, because they are not all the same. This could be done in a FRBRoo way.
  • What if in different applications of FRBR there are different rules about what a work is? O’Neill asked. How can we share information if we don’t agree on what a work is? “We can link it,” Tillett said. She agreed that something could be a work in one application but not in another, though theoretically it really is a work.
  • O’Neill said that many things that are works don’t need to be recognized as such, for example a very short introduction to a book. But that’s a local implementation decision, and if the introduction were to be recognized, it would have to be recognized as a work of its own. Everyone agreed on that. The universality rule was reworded to use “can be” instead of “must be”: “If an entity is a work in any of its manifestations, it can be a work in all of its manifestations.”
  • There was some discussion about the distinctness principle and whether it begged the question of an aggregate being a work of works. They all went to a draft report and looked at something and clarified that a list was ORed, not ANDed.
  • In the end only the distinctness criterion was seen as useful: inheritance doesn’t apply to Work-of-Works, and universality as modified is a given, so it’s not useful as a test.
  • They got into a metaphor next to think about an “aggregating work.” Take Patrick Le Boeuf’s special FRBR issue of Cataloging & Classification Quarterly, which contained articles by many people. It was also released as a book. O’Neill said each chapter is an independent work, but Le Boeuf’s role as an editor is that of an aggregator. Each article is like a brick, and Le Boeuf’s intellectual contribution was to make the mortar that holds all of the bricks together into a new work. That mortar is an “aggregating work.” The aggregate work is the whole wall of bricks and mortar.
  • Tillett said there was no need for this. The articles and the books are all just works, and they are related to each other. That’s all.
  • Gordon Dunsire spoke up from the audience with an objection to this “aggregating work” idea, I think saying that if you take three bricks and put them together with mortar and call that new thing a brick, then you have bricks containing bricks, and it gets recursive and you get into problems. You can’t have a brick containing other bricks. (This reminded me of Russell’s paradox.)

Eventually they decided on a bake-off. They would take some examples and represent them using both the Manifestation-of-Works model (O’Neill and Žumer) and the Work-of-Works model (Tillett and Kuhagen). They would look at those and decide what to recommend as the amendment to FRBR. The examples will be: a collection of music by Sibelius (based on an example Eeva Murtomaa brought), an augmented version of Humphry Clinker (with an introduction, illustrations, that kind of thing), Le Boeuf’s FRBR journal issue/book, and a moving image to be decided (I’d guess a DVD with extras). In order to have this all done by next year’s conference in Milan they need to have the recommendation done for February, so they’re going to get the modelling done before October.

This working group meeting was really lively and fun to watch. I’m glad I went.


FRBR for Serials: Rounding the Square to Fit the Peg

Posted by: William Denton, 24 April 2008 7:26 am
Categories: Aggregates

The CONSER Operations Meeting is on at the Library of Congress in Washington, D.C., and on the agenda is Adolfo Tarango (of U California at San Diego) presenting a paper: FRBR for Serials: Rounding the Square to Fit the Peg (228 KB PDF). (CONSER is a cooperative online serials cataloguing program, among other things. Cataloguing serials (journals, magazines, newspapers, blogs) is non-trivial.)

Various presentations given and papers published over the past few years have addressed the issue of applying FRBR to serials. Each has started with the premise that FRBRizing serials cataloging is a good idea, but for the most part, all attempts have ended with the conclusion that serials don’t quite fit into the FRBR model. Creating separate usable work, expression, and manifestation level records is not possible. This proposal turns the cart around. Instead of attempting to make serials fit the FRBR model we make FRBR fit the serials publishing reality. As such, this proposal begins with a redefinition of the FRBR concept of work, and for purposes of cataloging, introduces the idea of a “work segment” record. The FRBR definitions of expression, manifestation, and item do not change. The end result is two practical applications: a potential serial authority structure and a possible serial bibliographic cataloging framework. Application of each resolves a variety of existing and emerging bibliographic control problems. These include creating a more holistic presentation of the historical run of a serial through its various title incarnations, limiting the proliferation of and need for uniform titles as distinguishing elements, reducing cataloging workloads, and improving bibliographic displays and navigation. The information that follows is in three parts. The first part gives the new definition of the serial work; the second presents the proposed serial authority structure, and the third covers the proposed serials cataloging concept of the work segment record.

… Taking inspiration from Martha Yee’s recent ALA midwinter presentation and a recently published paper by Everett Allgood, this proposed serials cataloging framework doesn’t attempt to create either an expression or manifestation level record, but rather blends both into a “work segment” record. A significant reason for doing so is a resulting labor savings, but also, it pushes the questions “Given the data recorded and user needs, do we need separate expression and manifestation level records, is having expression and manifestation level data in a single record such a bad thing, especially if there are labor savings and user service advantages to be gained by combining them in one record?”

Thanks to Tim Knight and The Serials Cataloger (who deems it “essential reading”) for the link.

(Updated 29 April so that the quote reads, “The FRBR definitions of expression, manifestation, and item do not change.”)


Report on WG on Aggregates meeting last year

Posted by: William Denton, 21 March 2008 7:49 am
Categories: Aggregates, Conferences, IFLA

David Bigwood noticed that the report of the 21 August 2007 meeting of the Working Group on Aggregates (23 KB PDF) had been posted.

Members and observers discussed the draft of a paper by Ed O’Neill and Maja Žumer. The draft, sent to committee members several weeks prior to IFLA, summarizes the difficulties and inconsistencies in applying the FRBR model to aggregates, and applies three different, previously identified, modeling approaches to two different works: The Deptford trilogy; and The Expedition of Humphry Clinker. This draft document represents a response to the previous year’s meeting in Seoul, South Korea, where committee members and observers felt the need to have a document describing different models for aggregates, and also describing the ambiguity of the FRBR model in terms of the treatment of aggregates.

The group briefly deliberated the often-discussed “Universality Principle”, which states that if an entity is a work in any of its manifestations, it is a work in all of its manifestations.

… A WG member and an observer noted that we need to define these models in a timely manner, or we could be forced to inherit models that have been defined and applied by innovative creators of library management and e-commerce systems. These software creators might even use hidden organizing principles that would not necessarily work in the best interests of users. The forces that produce these new systems already determine display and access, and we need to determine how best to model and work in these new environments.


Weiss and Shadle, FRBR In the Real World

Posted by: William Denton, 6 September 2007 7:25 am
Categories: Aggregates, Papers

Cast your minds back, back, back into the mists of time, all the way to May 2006, and you may recall that I mentioned a North American Serials Interest Group conference where Steve Shadle and Paul Weiss did a talk on “FRBR In the Real World.”

Now it’s in print in The Serials Librarian 52: 1/2, May 2007: FRBR In the Real World.

Abstract: Brief refresher of the main aspects of the FRBR model, a review of various uses of FRBR, and a discussion of how the group 1 entity types apply in a serials context. We focus on levels (work/expression/manifestation/item and whole/part), the number of entities in particular situations, and terminology. Examples of real-world serials are used to illustrate how the FRBR resource model applies to serials.


Allgood, Serials and Multiple Versions

Posted by: William Denton, 5 July 2007 7:29 am
Categories: Aggregates, Papers

Julian Everett Allgood, a cataloguer at New York University Libraries, has an article called “Serials and Multiple Versions, or the Inexorable Trend Toward Work-Level Displays,” in the new Library Resources & Technical Services (July 2007, 51:3). Here’s the abstract:

The proliferation of multiple versions for bibliographic works presents numerous challenges to the cataloger and, by extension, to the catalog user. Fifteen years after the Multiple Versions Forum held in Airlie, Virginia, online public access catalog (OPAC) users continue to grapple with confusing displays representing numerous serial manifestations (i.e., versions) resulting from the Anglo-American Cataloguing Rules’ (AACR2) cardinal principle (Rule 0.24). Two initiatives offer hope for more coherent OPAC displays in light of a renewed focus upon user needs: the ongoing revision of AACR2, and the International Federation of Library Associations and Institutions’ Functional Requirements for Bibliographic Records (FRBR) model. A third potential tool for improving OPAC displays exists within a series of standards that have developed to parallel library needs, and today offer a robust communications medium: the MARC 21 authority, bibliographic, and holdings formats. This paper summarizes the challenges posed by multiple versions and presents an analysis of current and emerging solutions.

Rule 0.24 has changed over various revisions of AACR, and Allgood gives the old version and the new one. As of 2002, Rule 0.24 reads, “It is important to bring out all aspects of the item being described, including its content, its carrier, its type of publication, its bibliographic relationships, and whether it is published or unpublished. In any given area of the description, all relevant aspects should be described. As a rule of thumb, the cataloger should follow the more specific rules applying to the item being cataloged, whenever they differ from general rules.”

“Carrier” refers to the medium used for the publication: Can I get the article I want online, or do I have to go to the shelf and find the right issue of the journal? Allgood says, “Users are more interested in obtaining the journal article content than in the manifestation-level details of the serial title in which the article is published.” Very true!

For an earlier and oft-cited paper about this issue, read Content vs. Carrier by Lynne Howarth.


Kemp, Catalog/Cataloging Changes and Web 2.0 Functionality

Posted by: William Denton, 19 March 2007 7:57 am
Categories: Aggregates, Papers

Rebecca Kemp has a paper coming out later this year, but, happily, we can download a copy now: Catalog/Cataloging Changes and Web 2.0 Functionality: New Directions for Serials (883 KB PDF) (The Serials Librarian 53:4).

ABSTRACT. This article presents an overview of some of the important recent developments in cataloging theory and practice and online catalog design. Changes in cataloging theory and practice include the incorporation of the Functional Requirements for Bibliographic Records principles into catalogs, the new Resource Description and Access cataloging manual, and the new CONSER Standard Record. Web 2.0 functionalities and advances in search technology and results displays are influencing online catalog design. The paper ends with hypothetical scenarios in which a catalog, enhanced by the developments described, fulfills the tasks of finding serials articles and titles.

… The paper will be organized into four sections, the first of which will review recent changes in cataloging theory that have yet to be fully developed into cataloging practice, namely, the Functional Requirements for Bibliographic Records (FRBR). Introducing identifiers into serial records in accordance with FRBR entities will allow better collocation of like titles and differentiation between unlike titles. This section will conclude with a view of the potential serial “superwork record.”

(Thanks to Jonathan Rochkind for putting me wise to this. He recommends it.)


Delsey, CONSER and RDA

Posted by: William Denton, 10 January 2007 7:26 am
Categories: Aggregates, RDA

Two days ago Tom Delsey gave this to the Joint Steering Committee for the Revision of AACR: Analysis of the Proposed CONSER Standard Record vis à vis RDA, CONSER being the Cooperative Online Serials Program. It says, “The following is an analysis of recommendations on cataloguing rules, rule interpretations, and practices set out in appendix M of the Access Level Record for Serials Working Group’s final report as they relate to the development of RDA.”

There’s one paragraph about FRBR in Delsey’s report, and I’ll quote it all here.

More importantly, however, the relationships that will be defined in RDA (based on the FRBR model and the relationship types defined by Tillett) treat translations and language editions as modifications of a work. RDA will therefore provide instructions on reflecting the primary relationship between a translation or language edition and the related work (i.e., the work realized by that translation or language edition) by means of an identifier, a name (i.e., a controlled access point), or a description representing the related work. Following the FRBR model, RDA will also provide instructions on reflecting, if necessary, the relationship between a translation or language edition and a related expression (i.e., the specific language version used as the basis for that translation or language edition) by means of an identifier, a name (i.e., a controlled access point), or a description representing the related expression. But RDA will not provide instructions on reflecting the relationship between a translation or language edition and another manifestation (i.e., a manifestation embodying the original language expression of the work). That is because both the FRBR model and the relationship types defined by Tillett categorize translations and language editions as expressions of a work, and therefore define a relationship involving a translation or language edition as either a primary relationship between an expression and the work realized by the expression or as an expression-to-expression relationship, but not as an expression-to-manifestation relationship. 
If the intent of the CONSER recommendation is to allow the construction of an added entry using the title proper of the original manifestation rather than a uniform title for the work embodied in that manifestation, introducing instructions in RDA to support the recommendation would effectively require defining a new relationship type(s) to cover the relationship between a resource embodying a translation or language edition and a related manifestation embodying another language version of the same work (i.e., a relationship that would function as an expression-to-manifestation relationship). Defining such a relationship would seriously compromise the alignment of RDA both with the FRBR model and with the relationship types defined by Tillett.

(Seen on the AUTOCAT mailing list, posted by Nathalie Schulz.)


Minutes from August IFLA meetings

Posted by: William Denton, 2 November 2006 7:05 am
Categories: Aggregates, Conferences, IFLA

Pat Riva, chair of the group, sent along a pointer to updates to the list of meeting and activity reports of the FRBR Review Group. Two sets of minutes from the August meetings in Seoul are now up.

First, there’s the FRBR Review Group Meeting Report, 20 August 2006 (162 KB PDF). I strongly agree with this quote; this same thinking is what led me to OpenFRBR:

Discussion focused on the recommendation that the RG should give more priority to advocacy, specifically developing evidence-based arguments that demonstrate the value of FRBR. This recommendation stems from the observation that vendors have a perception that FRBR implementation is costly, and that purchasers of systems are not sufficiently aware of the benefits of FRBR for end users to request it. Demonstration projects have great potential in demonstrating value concretely.

The other new report is Working Group on Aggregates Meeting Report, 20 August 2006 (174 KB PDF). Aggregates are tricky, and that’s why there’s a special group looking into them.

The discussion centered on a debate describing two distinct models for aggregates (independently created works published together).

Examples: Audio CD, Web sites, Conference proceedings, Anthologies of poetry and/or prose literature, Song/music books, Trilogies, Conference proceedings, Serials (collections that are intended to be together), Monographic series (collections that are intended to be together)

Model 1: The whole is a manifestation that functions as the glue that holds a set of works together.

Model 2: The whole is a work in and of itself: a “work-of-works.”

Proposed activity: The group will collect examples of aggregates, whose relationships will be described using each of the two models under review.

This is important work and I look forward to the results. I’ll have to think over some examples and see which model works best. It’s a knotty issue.

