It’s the last two weeks in FRBR, actually. And it’s a lot of Karen Coyle. She’s busy!
Karen Coyle’s Livescribe videos
In the message Expressions and Manifestations sent to the rda-l mailing list, Coyle linked to two Livescribe recordings she did. Livescribe is a sort of computerized pen that lets you record what you’re saying, as well as writing and drawing, and turn it into a video recording. Watch the demo: it’s pretty neat.
Variations/FRBR project releases FRBR XML Schemas
The Variations/FRBR project at Indiana University (http://vfrbr.info) is pleased to announce the release of an initial set of XML Schemas for the encoding of FRBRized bibliographic data. The Variations/FRBR project aims to provide a concrete testbed for the FRBR conceptual model, and these XML Schemas represent one step towards that goal by prescribing a concrete data format that instantiates the conceptual model. Our project has been watching recent work to represent the FRBR-based Resource Description and Access (RDA) element vocabulary in RDF; however, due to the fact that this work represents RDA data rather than FRBR data directly, and that much metadata work in libraries currently (though perhaps not permanently) operates in an XML rather than an RDF environment, we concluded an XML-based format for FRBR data directly was needed at this time. We view XML conforming to these Schemas to be one possible external representation of FRBRized data, and will be exploring other! [sic] representations (including RDF) in the future. We define "implementing FRBR," as the conceptual models described in the companion FRBR and FRAD reports; at this time we are not actively working on the model defined in the draft FRSAD report. Perhaps the most notable feature of the Variations/FRBR XML Schemas is their existence at three "levels": frbr, which embodies faithfully only those features defined by the FRBR and FRAD reports; efrbr, which adds additional features we hope will make the data format more "useful"; and vfrbr, which both contracts and extends the FRBR and FRAD models to create a data representation optimized for the description of musical materials and we hope provides a model for other domain-specific applications of FRBR.
A User Guide with details on the structure of the Schemas and how they relate to one another may be found at http://vfrbr.info/schemas/1.0/UserGuide.pdf, and links to all Schemas and documentation may be found at http://vfrbr.info/schemas/1.0. We hope this Schema release will lead to further discussion of FRBR implementation issues within the community. Comments and questions on the Variations/FRBR Schema release may be sent to email@example.com.
That’s big stuff. Congratulations to Riley and everyone else on the project. I haven’t had time to digest it all, but I’ll keep an eye out for comments and follow-ups and link to them. I think examples will help people understand all this, so try your hand at one.
Discussion on code4lib and rda-l; Coyle and Rochkind
Karen Coyle responded on the code4lib list, saying:
[A]re you on the RDA-L list? Because we just went through a very long discussion there in which we concluded that a text aggregate (possibly analogous to a sound recording aggregate) is an expression, not a “set” of separate work/expression entities. Your example implies the latter, with the aggregate being described only at the manifestation level. (And now I’m confused as to what the work would be in something like a text collection, such as an anthology of poems. Would the anthology be a work?)
There was some back and forth between a few people. Coyle later said:
What the RDA folks (that is, the folks who have created RDA, the JSC members) said (some of them off-list to me), is that if your manifestation is an aggregate, then your Expression must be an equal aggregate. So the Expression is pretty much one-to-one with the Manifestation. (And I think we were all seeing a many-to-many.)
This is what I was told (off-list):
“the additional bibliographies or other intellectual or artistic content are viewed as parts of a new expression – not just new pieces for the manifestation … – it’s useful to declare expression level changes to facilitate collocation and make distinctions, but sometimes such distinctions aren’t necessary and we can collocate at the work level. Please don’t start people getting confused with throwing in expression level elements at the manifestation level.”
So those were my marching orders! (And I don’t see how anyone could be more confused than I am.) But a reprint of Moby Dick with a new preface or bibliography becomes a new expression. In crude MARC terms, every time the 245 $c changes, you’ve got a new expression, unless you determine that it’s something really insignificant. And I would guess that you can link the Expression to one or more Works, as you wish, except that the FRBR diagram shows that expressions can only relate to one Work. (See, no one could be more confused than I am!)
Jonathan Rochkind responded at length and turned his e-mail into a blog post: Notes FRBR WEMI entities, physicality, interchangeability, merging. He and Coyle post lengthy comments (with more about Moby Dick) and you should go read it.
The consensus on rda-l about aggregates being expressions
The discussion on rda-l that Coyle mentions above is, I think, in the Contents of Manifestations as Entities thread that she started on 10 March. She’d been looking at Chapter 25: Related Works and didn’t see how entity relationships could be set up from what’s described there. When all that was going on I had some other things on my mind so I was only skimming the list traffic, and I can’t pick out one representative e-mail to sum it all up. Perhaps someone can leave a link in the comments?
The FRBR Working Group on Aggregates still doesn’t have any answers about any of this.
Hugh Cayless points out some very cool stuff in Making a new Numbers Server for papyri.info. papyri.info is brand new to me but instantly fascinating: it is “dedicated to the study of ancient papyrological documents. It offers links to papyrological resources and a customized search engine (called the Papyrological Navigator) capable of retrieving information from multiple related sites. The Papyrological Navigator currently retrieves and displays information from the Advanced Papyrological Information System (APIS), the Duke Databank of Documentary Papyri (DDbDP) and the Heidelberger Gesamtverzeichnis (HGV).” The Heidelberger Gesamtverzeichnis! You know you want to be involved.
Cayless describes a rich set of relationships involving all the various parts of the study of ancient manuscripts. The old system wasn’t good enough to support that, but the new system, which uses RDF to model things, can handle it. Part of the model is FRBR:
FRBR: there’s a Work (the ancient document itself), which has expression in a scholarly publication, from which the DDbDP transcription, HGV records and translations, and APIS records and translations are derived; these may be made manifest in a variety of ways, including EpiDoc XML, an HTML view, etc. The scholarly work has bibliography, which is surfaced in the HGV records. There is the possibility of attaching bibliography at the volume level as well (since these are actual books, sitting in libraries). Libraries may have series-level catalog records too.
(Thanks to Jodi Schneider for pointing this out.)
The CastAlbumCollector website (http://www.castalbumcollector.com) is a very good example of how a database (in this case of recordings of musicals) based on FRBR entities can help users navigate large results sets and clarify what they’re looking for. The website lets you browse by show (=Work), by recording (=Expression) or by release (=Manifestation); you can also do a keyword search by show or recording. If you search for the work “Les misérables”, you will be presented with only two results: the musical by Schönberg or the one by Spencer (who knew there was another musical on the same subject?). Once you select the Schönberg work, you can pick among the 55 recordings of the musical in the database. Only after you selected a recording are you offered with specific releases. At this stage, the details pertaining to the work and expression entities are neatly grouped together under “Show details” and “Recording details.” I was not surprised to read on his personal website that the creator of this database has an MLS…
Check out The Music Man, for example. You can see the original cast recording, the soundtrack of the film, different releases of each, etc.
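For anyone who wants to see that Work, Expression, Manifestation drill-down in miniature, here is a rough Python sketch. It is not how CastAlbumCollector is actually built (I have no idea what is behind the site); the class names, the helper functions and the release labels are all made up for illustration, and only the show and the two kinds of recording come from the example above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Manifestation:
    """A specific release of a recording."""
    label: str


@dataclass
class Expression:
    """A recording of a show."""
    label: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """The show itself."""
    title: str
    expressions: List[Expression] = field(default_factory=list)


# Hypothetical catalogue: the release labels are placeholders, not real data.
catalog = [
    Work("The Music Man", [
        Expression("Original cast recording", [
            Manifestation("hypothetical release A"),
            Manifestation("hypothetical release B"),
        ]),
        Expression("Film soundtrack", [
            Manifestation("hypothetical release C"),
        ]),
    ]),
]


def find_works(works: List[Work], query: str) -> List[Work]:
    """Step 1: search at the Work level only, so results stay short."""
    return [w for w in works if query.lower() in w.title.lower()]


def list_recordings(work: Work) -> List[str]:
    """Step 2: once a Work is chosen, show its Expressions."""
    return [e.label for e in work.expressions]


def list_releases(work: Work, recording: str) -> List[str]:
    """Step 3: only after picking a recording, show its Manifestations."""
    for e in work.expressions:
        if e.label == recording:
            return [m.label for m in e.manifestations]
    return []


show = find_works(catalog, "music man")[0]
print(show.title)
print(list_recordings(show))
print(list_releases(show, "Original cast recording"))
```

The point of the nesting is that a search only ever returns Works, so the user is not shown every release up front.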
Percentage of holdings for multi-manifestation works in WorldCat
On our latest run against WorldCat we found an average of just under 1.4 records/work. 85% of the works just have one manifestation.
38% of WorldCat records are in the 15% of works with multiple manifestations.
Timothy Faile did some calculations, and I hope he won’t mind if I quote them fully:
Of the total number of records (i.e., manifestations or expressions) in WC: single-manifestation works (62% of records); multi-manifestation works (38% of records)
Of the total number of works in WC: single-manifestation works (85% of works); multi-manifestation works (15% of works)
Averages to just under 1.4 records per work.
So…in trying to get my head around these numbers…
Imagine that there are only 1,000 records (manifestations/expressions of works) in WorldCat. Of these 1,000 records, there are: 620 records with a one-to-one relationship to 620 works; 380 records with a many-to-one relationship to the remainder of works (note: the “remainder of works” is calculated below)
If 620 = 85% of works, then 620 / x = 850 / 1000, so x = (620 * 1000) / 850 = 729.4 (total number of works)
Total number of works = 730
x – 620 = total number of works with multiple manifestations = 730-620 = 110
Remainder of works = 110
Therefore, of the total number of works (730 works): 620 works have exactly one manifestation; 110 works have multiple manifestations/expressions
Considering all 730 works (single- and multi-manifestation), there are on average 1.37 manifestations/expressions for each work.
Calculation: [total records / total works] 1000 / 730 = 1.36986
But this stat is rather useless (from my understanding), since it mixes the categories which should be distinct. Considering only the 110 multi-manifestation/expression works, there are on average 3.46 manifestations(expressions) for each work.
Calculation: [total many-to-one manifestations(expressions) / total multi-manifestation works] 380 / 110 = 3.4545
620 / 730 = 0.8493 (or 85% of works have only one manifestation)
110 / 730 = 0.1507 (or 15% of works have multiple manifestations/expressions)
Hickey confirmed that looked right. Interesting.
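If you want to play with the numbers yourself, the arithmetic above fits in a few lines of Python. This is only a sketch: the function name and its parameters are mine, not anything from OCLC, and it takes as given the 1,000-record thought experiment and the 62% and 85% figures quoted above.

```python
def work_stats(total_records, single_record_share, single_work_share):
    """Redo the arithmetic above for an imaginary catalogue.

    single_record_share: fraction of records belonging to one-record works
    single_work_share:   fraction of works that have exactly one record
    """
    # Each single-manifestation work has exactly one record, so the
    # number of such works equals the number of such records.
    single_works = total_records * single_record_share
    multi_records = total_records - single_works

    # If single_works is single_work_share of all works, solve for the total.
    total_works = single_works / single_work_share
    multi_works = total_works - single_works

    return {
        "total_works": total_works,
        "multi_works": multi_works,
        "records_per_work": total_records / total_works,
        "records_per_multi_work": multi_records / multi_works,
    }


stats = work_stats(1000, 0.62, 0.85)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

One thing to notice: this version does not round the work count up to 730, so the average for multi-manifestation works comes out near 3.47 rather than the 3.45 you get from 380 / 110. The overall average is still about 1.37.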