CNI Structured Content Workshop
Workshop Presentations
Workshop Notes

On Dec. 5, 2004, a workshop was held in conjunction with the Fall 2004 CNI Task Force Meeting to discuss emerging standards for structured content in research and higher education, including the IMS Global Learning Consortium's Content Packaging Specification (IMS-CP), the MPEG 21 Digital Item Declaration Language (DIDL) and the Metadata Encoding and Transmission Syntax (METS). This workshop brought together individuals involved in various structured content standards efforts as well as those working to build repository software that may need to import, export or maintain structured content conforming to these standards. A list of the meeting participants and the initial agenda for the meeting can be found here.

The meeting opened with a review of the MPEG 21 DIDL, IMS-CP and METS standards. While a number of similarities were noted between the standards (e.g., all use hierarchical structuring, all allow metadata of various kinds to be associated with different nodes within that structure, all allow for both internal and external metadata records), there are also differences between them. MPEG 21 DIDL, for example, provides facilities for communicating a series of choices, which IMS-CP and METS lack. It was noted that, due to the flexibility of all of these standards, some details essential to any implementation are left undefined, and these gaps can cause a certain amount of grief. The selection and implementation of persistent identifiers for content and metadata, and of resolution mechanisms for those identifiers, quickly came up as a potential headache for implementors of any of these standards.

Tyde Richards then provided an overview of the IEEE Learning Technology Standards Committee Working Group 11 (Computer Managed Instruction) Project (the Resource Aggregations Reference Model).
Briefly, those working in the area of structured content for Computer Managed Instruction became aware that other groups were also struggling to define standards for structured content (the "resource aggregation" problem). They decided that a reference model that could encompass a variety of different approaches to structured content (including IMS-CP, MPEG 21 DIDL, METS, and others) would be useful for implementors of Computer Managed Instruction systems attempting to understand how to deal with and/or crosswalk non-SCORM/IMS-CP content. This effort is ongoing, and the participants in IEEE LTSC WG 11 have already begun reaching out to other communities, including METS, to try to ensure the broad applicability of the reference model they are defining.

Individuals representing various institutions developing repositories that deal with structured content then discussed their efforts and their experiences with the various structured content standards. Brian Tingle discussed the California Digital Library's preservation repository, their efforts to support the METS standard, and their growing awareness that support for learning objects might be necessary. Herbert Van de Sompel discussed the MPEG 21 DIDL-based repository that LANL has built and their efforts to achieve interoperability with other repository systems (noting that LANL, as a research institution, has not had to worry about exchanging learning objects, but only research materials such as electronic journals and pre-prints).
He noted numerous areas of potential difficulty in exchanging content: different identifier systems; ambiguity in how identifiers are used (is this for an object or a bitstream or metadata?); inconsistent application of digests; and the unclear meaning of terms such as 'creation date' (creation date of what exactly?). He suggested that a reference model, along with a common vocabulary for relationships between data and metadata, would be very helpful in beginning to clear up some of this mess.

Carl Lagoze discussed Fedora, an open-source content management/repository system, and noted that any discussion of interoperability needs to cover not only document models but also protocols of interoperation. He raised the question of whether object requests between systems should be parameterized, so that systems can negotiate based on shared features and needs. An e-mail from Robert Tansley, who is working on the DSpace institutional repository system, noted that if these various standards could agree on the use of shared components, this might help interoperability along.

Jerome McDonough then discussed the METS Profile schema, and demonstrated the Telcert schema profiling tool. While the Telcert project is focused on eLearning issues, the schema profiling tool can be applied to any XML schema, such as METS, to create a specific application profile of that schema which may tighten or loosen constraints imposed by the original schema. The METS Profile schema is not intended to allow the specification of an application profile instance of the main METS schema, but rather to allow institutions to record and disseminate all of their local practices with respect to construction of a particular class of METS documents (including practices which may not be subject to machine validation).
Some people expressed concerns that a proliferation of profiles might actually impede interoperability by limiting incentives to force broader agreements on object encoding practices. There was interest expressed in trying to develop a central registry of profiles (perhaps in conjunction with a global format registry), and in trying to develop some public list of who is working on different classes/subclasses of objects.

Discussion then turned to extension schema for descriptive, technical, rights and digital preservation metadata. None of the structured content standards mandates the use of a particular metadata schema (although there is a strong predisposition to the use of some version of LOM within the IMS-CP community for descriptive metadata). There was again discussion of the need for some common agreements about persistent identifiers and their application with respect to metadata records within a content object. Somewhat related to this (and perhaps to Dr. Van de Sompel's earlier comments about agreeing on terms to describe relationships in objects) was a discussion about the need for agreements on anchoring mechanisms to be used in specifying the locations of bitstreams and bitstream fragments. There was also a surprising degree of agreement on the need for a digital preservation/provenance metadata schema, and in particular, something that allows for the specification of a series of transformations needed to 'unpack' bitstreams to access content. This has come up in the OCLC/RLG PREMIS metadata work with respect to dealing with tar'ed, gzipped archive files.

Discussion then turned to next steps. There was universal agreement that the IEEE LTSC WG 11 work on a common reference model for content objects is extremely valuable for both standards developers and repository developers; several of those present at the meeting were willing to volunteer to assist the working group if it needs additional input from other perspectives/standards communities.
In addition to the IEEE work, Prof. Lagoze suggested that we lack a good framework for understanding what we're trying to achieve with respect to interoperability of repositories dealing with complex structured content, and that a white paper trying to establish such a framework might be a valuable counterpoint to the reference model work. Funding for such a white paper might be available from a variety of sources. A survey of the various standards efforts for structured content, trying to obtain a better indication of what communities they thought they were designing for and what problems they thought they were trying to solve, might be a good first step towards such a white paper. With a white paper framing interoperability issues and a common reference model in hand, some of the institutions present might be able to start exploring the possibility of establishing some interoperability testbed projects.

Specific next steps following from the workshop are: