Fedora Admin Client Batch Utility

Software Release 1.1

Fedora Development Team

$Id: batchclient.dbx,v 1.13 2003/07/23 17:33:28 rag9b Exp $


Table of Contents

Introduction
Building Fedora objects in batch
Ingesting Fedora objects in batch
Building and ingesting Fedora objects in batch
Object Processing Map
object-specifics

Introduction

The fedora-admin client utility includes tools to create and ingest multiple Fedora objects, which are Fedora-specific METS XML documents contained in files outside the repository.

Objects created one at a time by hand editing, or generated by custom scripting, can be ingested directly.

fedora-admin also supports building objects. This takes a general template common to all objects in a batch and makes object-specific substitutions into the template. It also substitutes a common datetime stamp for all date attributes. The template is a Fedora METS XML document, with data common to the objects of the batch. Separate XML documents hold the per-object substitution values.

The relatedness of objects in a batch is defined by what fedora-admin allows to be substituted and by which substitutions you choose to make. Data from the template, including XML comments, are retained unless replaced for an individual object.

fedora-admin provides for three modes of object batch processing: batch build, batch ingest, and a combined batch build and ingest.

This phased processing is shown in the following diagram.

Building Fedora objects in batch

Build a set of Fedora METS XML files from a common Fedora METS template and simple (non-METS) XML object-specs. The resulting objects are then ready for ingesting into Fedora.

fedora-admin instructions

Select Tools on the fedora-admin menu bar, and select item Build Batch.

This will open a Batch Build window. You may need to adjust this window’s size to see its controls. Use the browse buttons to enter the four required settings. Clicking on a browse button opens a standard directory/file selection dialog.

Then click the Build this batch button to build the batch of Fedora METS XML documents.

A second (output-only) window will open to show progress. You can build multiple different batches before closing the Batch Build window.

You can then ingest the created batch as described elsewhere in this document.

No subdirectories or files are deleted by fedora-admin. Setup and cleanup of the files in the batch must be done by you using standard operating system facilities.
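
Because fedora-admin leaves cleanup to you, an output directory can be emptied with a short script before a new build. The following Python sketch shows one way to do that. It is not part of fedora-admin, the function name is hypothetical, and it assumes you want to remove only the regular files directly inside the directory, leaving any subdirectories alone.

```python
from pathlib import Path


def clear_output_directory(directory):
    """Delete the regular files directly inside `directory`.

    Subdirectories and their contents are left untouched.
    Returns the names of the files removed, sorted.
    """
    removed = []
    for entry in Path(directory).iterdir():
        if entry.is_file():
            entry.unlink()
            removed.append(entry.name)
    return sorted(removed)
```

Point it only at a directory reserved for built objects, since the deletion is not reversible.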

to demo

You can use files and subdirectories of directory client/demo/batch-demo, relative to your FEDORA_HOME environment variable. (When you create your own batches, the needed directories and files can be anywhere in the file space of the system on which you are running fedora-admin or command-line BatchTool.)

Use file mets-template.xml for METS template (input file).

Use subdirectory object-specifics for XML specs (input directory); this is a directory holding (all and only) per-object data.

Use subdirectory objects for METS objects (output directory); this is a directory to hold (all and only) Fedora METS files built by fedora-admin.

Specify a file path of your choice for object processing map (output file); this is a file which maps object-specs to objects built. See the section on object processing maps, elsewhere in this documentation. Note that PIDs cannot be reported in this (Batch Build) mode, as they have not yet been assigned.

Optionally select the output format for object processing map, either xml or text (xml is the default format).

Ingesting Fedora objects in batch

Create a set of Fedora objects in your repository from a corresponding set of Fedora METS XML files.

fedora-admin instructions

Select Tools on the fedora-admin menu bar, and select item Ingest Batch.

This will open a Batch Ingest window. You may need to adjust this window’s size to see its controls. Use the browse buttons to enter the two required settings. Clicking on a browse button opens a standard directory/file selection dialog.

Then click the Ingest this batch button to ingest the batch into your Fedora repository.

A second (output-only) window will open to show progress. You can ingest multiple different batches before closing the Batch Ingest window.

No subdirectories or files are deleted by fedora-admin. Setup and cleanup of the files in the batch must be done by you using standard operating system facilities. fedora-admin does not itself validate on Batch Build, but batch ingest into Fedora does. The batch fails on the first individual object ingest failure.

Fedora will not ingest a METS file whose METS:xmlData elements are empty or contain non-tagged character data.
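
Since the batch stops at the first failed object, it can help to check files for this condition before ingesting. The following Python sketch approximates the rule for a single METS document held in a string. It is not Fedora's validation, which remains authoritative and checks more than this. The function name is hypothetical, and the sketch assumes the METS namespace URI http://www.loc.gov/METS/ and treats whitespace-only content as empty.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"  # assumed METS namespace URI


def find_bad_xmldata(mets_xml):
    """Return a list of messages, one per problematic METS:xmlData element."""
    root = ET.fromstring(mets_xml)
    problems = []
    for number, elem in enumerate(root.iter("{%s}xmlData" % METS_NS), start=1):
        children = list(elem)
        # Character data sitting directly inside xmlData: the text before the
        # first child element, plus the text following each child element.
        loose_text = [elem.text] + [child.tail for child in children]
        if any(text and text.strip() for text in loose_text):
            problems.append("xmlData #%d contains non-tagged character data" % number)
        elif not children:
            problems.append("xmlData #%d is empty" % number)
    return problems
```

An empty list means no xmlData element tripped either check.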

to demo

You can use files and subdirectories of directory client/demo/batch-demo, relative to your FEDORA_HOME environment variable. (When you create your own batches, the needed directories and files can be anywhere in the file space of the system on which you are running fedora-admin or command-line BatchTool.)

You will need to have already done a Build Batch demo, explained elsewhere in this document, to populate the objects directory needed in this current demo. If you have ingested these objects before, either in this Ingest Batch mode following a separate Build Batch mode, or in a Build and Ingest Batch mode, you will first need to edit OBJIDs in the object-spec files, or to remove the corresponding objects from your Fedora repository.

Use subdirectory objects for METS objects (input directory); this is a directory holding (all and only) Fedora METS files to ingest.

Specify a file path of your choice for object processing map (output file); this is a file which maps objects to their assigned PIDs. See the section on object processing maps, elsewhere in this documentation. Note that object-specs of objects previously built by fedora-admin cannot be reported in this (Batch Ingest) mode, as they (as source documents) are no longer known.

Optionally select the output format for object processing map, either xml or text (xml is the default format).

Building and ingesting Fedora objects in batch

This process builds a set of Fedora METS XML files from a common Fedora METS template and simple (non-METS) XML object-specs, then ingests the resulting batch into Fedora.

fedora-admin instructions

Select Tools on the fedora-admin menu bar, and select item Build and Ingest Batch.

This will open a Batch Build and Ingest window. You may need to adjust this window’s size to see its controls. Use the browse buttons to enter the four required settings. Clicking on a browse button opens a standard directory/file selection dialog.

Then click the Build and Ingest this batch button to build the batch of Fedora METS XML documents and then ingest them into Fedora.

A second (output-only) window will open to show progress. You can build and ingest multiple different batches before closing the Batch Build and Ingest window.

There is then no need to separately ingest the created batch.

No subdirectories or files are deleted by fedora-admin. Setup and cleanup of the files in the batch must be done by you using standard operating system facilities.

fedora-admin does not itself validate on Batch Build, but batch ingest into Fedora does. The batch fails on the first individual object ingest failure.

Fedora will not ingest a METS file whose METS:xmlData elements are empty or contain non-tagged character data.

to demo

You can use files and subdirectories of directory client/demo/batch-demo, relative to your FEDORA_HOME environment variable. (When you create your own batches, the needed directories and files can be anywhere in the file space of the system on which you are running fedora-admin or command-line BatchTool.)

If you have ingested these objects before, either in this Build and Ingest Batch mode or in separate sequential Build Batch and Ingest Batch modes, you will first need to edit OBJIDs in the object-spec files, or to remove the corresponding objects from your Fedora repository.

Use file mets-template.xml for METS template (input file).

Use subdirectory object-specifics for XML specs (input directory); this is a directory holding (all and only) per-object data.

Use subdirectory objects for METS objects (output directory); this is a directory to hold (all and only) Fedora METS files built by fedora-admin.

Specify a file path of your choice for object processing map (output file); this is a file which maps object-specs through objects built and on to PIDs assigned. See the section on object processing maps, elsewhere in this documentation. Unlike separate Batch Build and Batch Ingest modes, the complete triple is reported in this Batch Build and Ingest mode.

Optionally select the output format for object processing map, either xml or text (xml is the default format).

Object Processing Map

The object-processing-map file has one of the following formats, depending on the choice of xml or text in fedora-admin. Batch Build processing results in an object processing map whose individual maps have only path2spec and path2object attributes or fields. Batch Ingest processing results in an object processing map whose individual maps have only path2object and pid attributes or fields. Batch Build and Ingest processing results in an object processing map whose individual maps have all three: path2spec, path2object, and pid.

xml format

<object-processing-map>
	<map
	path2spec="/mellon/dist/client/demo/batch-demo/object-specifics/americanacademy.xml"
	path2object="/mellon/dist/client/demo/batch-demo/objects/americanacademy.xml"
	pid="demo:3010" />
	. . .
	<map
	path2spec="/mellon/dist/client/demo/batch-demo/object-specifics/vaticanlibrary.xml" 
	path2object="/mellon/dist/client/demo/batch-demo/objects/vaticanlibrary.xml" 
	pid="demo:3019" />
</object-processing-map>
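
A map in xml format is easy to read back with a script, for example to collect the PIDs assigned in a batch. The following Python sketch is one way to do it and is not part of fedora-admin. The function name is hypothetical. Attributes that a given mode does not report come back as None.

```python
import xml.etree.ElementTree as ET

MAP_ATTRIBUTES = ("path2spec", "path2object", "pid")


def read_xml_map(xml_text):
    """Return one dict per <map> element of an object processing map.

    Each dict has the keys path2spec, path2object and pid; an attribute
    missing from the element is reported as None.
    """
    root = ET.fromstring(xml_text)
    return [
        {name: element.get(name) for name in MAP_ATTRIBUTES}
        for element in root.iter("map")
    ]
```

To read a map file, pass the file's contents, for example the result of open(path).read().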
			

text format

(field separator is tab; relative paths used for practical illustration)

object-specifics/americanacademy.xml	objects/americanacademy.xml	demo:3010
. . .
object-specifics/vaticanlibrary.xml	objects/vaticanlibrary.xml	demo:3019
			

object-specifics

Object-specifics are coded in XML files. These data include: object ID, label, and comment; datastream and object metadata and accompanying label; datastream URLs, titles, and labels; disseminator-specific datastream labels. Where possible, attribute names are the same as in the Fedora METS schema, and so correspond to like-named attributes in the Fedora METS template. The mapping is described below; running the demo and viewing the results for one of the objects also illustrates it.

Any individual substitution is optional. When absent as a substitution, the value in the template will be used for the resulting Fedora METS object. (Demo template and object-specific contents are chosen instructively to highlight substitutions made.) Datastream URLs will generally be specific to an object; practice will show which other substitutions are generally made.

All non-METS namespaces used in your own metadata must be declared, as in xmlns:uvalibadmin in the demo.

Metadata IDs here map to those found in the Fedora METS:amdSec and Fedora METS:dmdSecFedora elements. The associated metadata is substituted as the content of the METS:xmlData element, which is nested within that Fedora METS:amdSec or Fedora METS:dmdSecFedora element.

Datastream IDs here map to those found in the Fedora METS:fileGrp element (the nested, not the nesting, one). The associated xlink:href and xlink:title attributes are substituted into the Fedora METS:FLocat element, which is nested within that Fedora METS:fileGrp element.

Datastream labels substitute instead into METS:structMap. If a datastream label is given specific to a disseminator, that label is substituted; otherwise the general datastream label is used.

Case matters in attribute and element names.

Fedora will retain as PIDs only OBJIDs prefixed "test:" or "demo:". Other OBJIDs will be replaced by Fedora-generated PIDs.
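
When preparing object-specs, this rule can be applied to each OBJID ahead of time to see which ones will survive as PIDs. The following Python sketch states the rule as a function. The function name is hypothetical, and the sketch assumes the prefix comparison is case-sensitive, which this document does not say.

```python
RETAINED_PREFIXES = ("test:", "demo:")


def objid_is_retained(objid):
    """Return True if an OBJID starts with one of the retained prefixes."""
    return objid.startswith(RETAINED_PREFIXES)
```

For example, objid_is_retained("test:2800") is true, while an OBJID with any other prefix gives false, meaning Fedora would generate the PID.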

object-specs in a given batch should meet the structural requirements of that batch’s template: same number and tagging of datastreams, same number and tagging of metadata elements. Since substitutions are optional, individual object-specs cannot have "missing" data: the resulting object simply retains the template's value. Neither can object-specs have "extra" data: the resulting object simply lacks the object-spec's data -- because the template isn't designed to use it. In either case, the batch goes on.
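
The effect of "missing" and "extra" data can be pictured with a simplified model in which the template and the object-spec are each reduced to a mapping from ID to value. The following Python sketch illustrates the rule only. It is not how fedora-admin is implemented, and the function name and the flat ID-to-value model are assumptions made for the example.

```python
def apply_substitutions(template_values, spec_values):
    """Combine template and object-spec values under the batch rules.

    Returns (result, ignored): `result` holds the value each template ID
    ends up with, and `ignored` lists spec IDs the template has no place for.
    """
    result = dict(template_values)  # "missing" spec data: template value is kept
    ignored = []
    for key, value in spec_values.items():
        if key in result:
            result[key] = value     # substitution made
        else:
            ignored.append(key)     # "extra" spec data: silently unused
    return result, sorted(ignored)
```

With a template of {"DS1": "a", "DS2": "b"} and a spec of {"DS2": "x", "DS9": "y"}, DS1 keeps the template value, DS2 is replaced, and DS9 is ignored.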

The following object-spec fragment illustrates some of this.

<?xml version="1.0" encoding="ISO-8859-1"?>
<input OBJID="test:2800" LABEL="my object" xmlns:METS="http://www.loc.gov/METS/" 
xmlns:uvalibadmin="http://www.lib.virginia.edu/uvalibadmin/" >
    <metadata>
        <metadata ID="RIGHTS1" LABEL="">
            <!-- include comment optionally -->
            <uvalibadmin:admin>
                <uvalibadmin:adminrights>
                    <uvalibadmin:policy>
                        <uvalibadmin:access>unrestricted</uvalibadmin:access>
                        <uvalibadmin:use>educational</uvalibadmin:use>
                    </uvalibadmin:policy>
                </uvalibadmin:adminrights>
            </uvalibadmin:admin>
        </metadata>
        . . .
        other metadata
        . . .
    </metadata>
    <datastreams>
        <datastream ID="DS1" xlink:href="http://localhost:8080/demo/batch-demo/thumb/4868090.jpg" xlink:title="" LABEL="copied into every disseminator's label for this datastream"/>
        <datastream ID="DS2" xlink:href="http://localhost:8080/demo/batch-demo/thumb/4868090.jpg" xlink:title="" LABEL="copied into a disseminator's label for this datastream, unless the disseminator has a nested node with a label herein">
            <disseminator ID="" LABEL="copied into only this disseminator's label for this datastream" />
        </datastream>
        . . .
        other datastreams
        . . .
    </datastreams>
</input>
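
The label rule for datastreams and disseminators can be traced with a short script. The following Python sketch reads a cut-down object-spec and reports which label a given disseminator would receive for a given datastream. It is an illustration, not fedora-admin's code. The function names and the disseminator ID DISS1 are hypothetical, and the sample omits the xlink attributes and metadata to stay small.

```python
import xml.etree.ElementTree as ET

SPEC = """<input OBJID="test:2800" LABEL="my object">
    <datastreams>
        <datastream ID="DS1" LABEL="general DS1 label"/>
        <datastream ID="DS2" LABEL="general DS2 label">
            <disseminator ID="DISS1" LABEL="DS2 label for DISS1 only"/>
        </datastream>
    </datastreams>
</input>"""


def collect_labels(spec_xml):
    """Map (datastream ID, disseminator ID or None) to the label given."""
    labels = {}
    for datastream in ET.fromstring(spec_xml).iter("datastream"):
        ds_id = datastream.get("ID")
        if datastream.get("LABEL") is not None:
            labels[(ds_id, None)] = datastream.get("LABEL")  # general label
        for disseminator in datastream.findall("disseminator"):
            if disseminator.get("LABEL") is not None:
                labels[(ds_id, disseminator.get("ID"))] = disseminator.get("LABEL")
    return labels


def label_for(labels, ds_id, diss_id):
    """Prefer the disseminator-specific label; fall back to the general one."""
    return labels.get((ds_id, diss_id), labels.get((ds_id, None)))
```

With this sample, DISS1 receives its own label for DS2, while any other disseminator receives the general DS2 label.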