Krang::DataSet - Krang interface to XML data sets
Creating data sets:
# create a new data set
my $set = pkg('DataSet')->new();
# add an object to it
$set->add(object => $story);
# add an object linked from another object
$set->add(object => $media, from => $story);
# add a file (used by media to include their files)
$set->add(file => $file, path => $path, from => $media);
# write it out to a kds file
$set->write(path => "foo.kds");
Loading data sets:
# load a data set from a file on disk
my $set = pkg('DataSet')->new(path => "foo.kds");
# get a list of objects in the set
my @objects = $set->list();
# import objects from the set, solving dependencies and updating links
$set->import_all();
Utility methods:
# get list of all classes available for import/export
my @classes = $set->classes;
# get list of all valid ID method names (story_id, media_id, etc.)
my @id_names = $set->id_meths;
# find the class for a given ID method name
my $class = $set->id_meth_to_class('story_id');
This module manages the export and import of XML data sets for Krang. It is used by krang_export and krang_import, and it uses Krang::XML to serialize and deserialize individual objects.
$set = Krang::DataSet->new(...)
Creates a new set object, either empty or by loading an existing data set previously created with write().
May throw a Krang::DataSet::ValidationFailed exception if the archive is found to contain errors. See EXCEPTIONS below for details.
Available parameters:
Specify the path of an existing .kds or .kds.gz file to open.
Specify a subroutine to be called when objects are added to the data set. The callback will receive the same arguments as are passed to add(). This is useful if you need to provide progress messages for the user. Note that the callback will only be called once for each object.
Specify a subroutine to be called when objects are imported from the data set. The callback will receive a single named parameter called object, which contains the imported object.
If set to 1, this will override the normal behavior of adding related/linked objects to the dataset. Use this with caution! In many situations it will omit objects that are essential to have.
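As a rough sketch of what the two callbacks described above might look like, the following builds one subroutine for additions and one for imports and then calls them by hand with arguments in the documented shape. The variable names, the My::Story class and the file paths are hypothetical stand-ins so the sketch runs without Krang; in real use the subroutines would be passed to new() instead of being called directly.

```perl
use strict;
use warnings;

# Collected progress messages; a real script might print them as they arrive.
my @progress;

# Callback for additions: receives the same named arguments as add(),
# so it sees either object => ... or file => ..., path => ...
my $on_add = sub {
    my %args = @_;
    if (defined $args{object}) {
        push @progress, "added object of class " . ref($args{object});
    }
    elsif (defined $args{file}) {
        push @progress, "added file $args{path}";
    }
};

# Callback for imports: receives a single named parameter, object.
my $on_import = sub {
    my %args = @_;
    push @progress, "imported object of class " . ref($args{object});
};

# Hypothetical stand-in object and paths, used only to exercise the callbacks.
my $story = bless {}, 'My::Story';
$on_add->(object => $story);
$on_add->(file => '/tmp/photo.jpg', path => 'media_1/photo.jpg');
$on_import->(object => $story);

print "$_\n" for @progress;
```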
$set->add(object => $story, from => $self)
Adds an object to the data-set. This operation will also add any linked objects necessary to later load the object. If an object already exists in the data set then this call does nothing.
The from parameter must contain the object calling add() when add() is called
from within serialize_xml(). This is used by Krang::DataSet to
include link information in the index.xml file.
Objects added to data-sets with add() must support serialize_xml() and
deserialize_xml(). For details, see REQUIRED METHODS below.
$set->add(file => $file, path => $path)
Adds a file to a data-set. This is used by media to store media files in the data set. The file argument must be the full path to the file on disk. The path argument must be the destination path of the file within the archive.
@objects = $set->list()
This returns a list of objects in the data set. The list is composed of two-element arrays listing the class of the object and its id. For example:
@objects = ( [ Krang::Story    => 1 ],
             [ Krang::Story    => 2 ],
             [ Krang::Category => 1 ],
             [ Krang::Site     => 5 ] );
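To illustrate how a caller might consume this structure, here is a minimal sketch that groups the IDs by class. The sample data simply repeats the example above with the class names quoted; in real use the list would come from $set->list(). The %ids_by_class variable is a name chosen for this example.

```perl
use strict;
use warnings;

# Sample data in the documented shape: each entry is [ class, id ].
my @objects = (
    [ 'Krang::Story'    => 1 ],
    [ 'Krang::Story'    => 2 ],
    [ 'Krang::Category' => 1 ],
    [ 'Krang::Site'     => 5 ],
);

# Group the IDs under their class name.
my %ids_by_class;
for my $entry (@objects) {
    my ($class, $id) = @$entry;
    push @{ $ids_by_class{$class} }, $id;
}

# Print one line per class, listing the IDs found for it.
for my $class (sort keys %ids_by_class) {
    printf "%-16s %s\n", $class, join(', ', @{ $ids_by_class{$class} });
}
```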
$set->write(path => "foo.kds")
$set->write(path => "foo.kds.gz", compress => 1)
Writes the set out to a kds file at the path provided. Pass compress => 1, as in the second form, to write a compressed .kds.gz file.
May throw a Krang::DataSet::ValidationFailed exception if the archive is found to contain errors. See EXCEPTIONS below for details.
$set->import_all(...)
This method tells the set to deserialize all objects in the set and save them into the current system. The following optional parameters are available:
Normally import will attempt to update objects when creating a new
object would create an invalid duplicate. Set this parameter to 1 and
duplicates will cause the object to fail to import. (Note that the
exact policy on updates is decided by the individual class'
deserialize_xml() method.)
Ignore UUIDs for the purpose of finding matches to update. This essentially reverts Krang to its behavior before v2.008.
Only use UUIDs for the purpose of finding matches to update. Matches using other fields (URL, name, etc.) will be treated as errors.
Set this option to an array of class names and content for these classes will not be used to update existing objects. This is useful in cases where you wish to update an object without updating objects it must point to. For example, to load stories from a set without altering existing categories:
$set->import_all(skip_classes => [ 'Krang::Category' ]);
This is currently implemented only for Krang::Category and Krang::Site.
May throw a Krang::DataSet::ValidationFailed exception if the archive is found to contain errors. May also throw a Krang::DataSet::ImportRejects exception if one or more objects failed to import. See EXCEPTIONS below for details.
$real_id = $set->map_id(class => "Krang::Foo", id => $id)
This call is used during import to return the mapping from an ID in
the import data to an ID on the target system. This method will croak
if called outside of an import_all() run or if the object can't be
found in the data set.
This call will trigger a deserialization if the object has not yet been deserialized.
$set->register_id(class => $class, id => $id, import_id => $import_id)
An object which points to objects which may contain circular
references must call register_id() before calling map_id() on those
objects. For example, Krang::Story::deserialize_xml() calls
register_id() before deserializing its element tree since those
elements might point to stories which may point back to the original
story.
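The reason for this ordering can be shown with a toy model. The sketch below is not Krang code: Toy::Set is a hypothetical stand-in whose map_id() triggers deserialization on demand, and whose deserialize() plays the role of a class's deserialize_xml(). Because each record registers its new ID before following its links, a link that leads back to a record still being deserialized resolves immediately instead of recursing forever.

```perl
use strict;
use warnings;

{
    # Hypothetical stand-in for a data set; links maps import ID => linked IDs.
    package Toy::Set;

    sub new {
        my ($class, %links) = @_;
        return bless({ links => { %links }, map => {}, next_id => 100 }, $class);
    }

    # Record the mapping from an import ID to the ID on the target system.
    sub register_id {
        my ($self, %args) = @_;
        $self->{map}{"$args{class}/$args{import_id}"} = $args{id};
    }

    # Return the target ID, deserializing the record first if necessary.
    sub map_id {
        my ($self, %args) = @_;
        my $key = "$args{class}/$args{id}";
        $self->deserialize($args{class}, $args{id})
            unless exists $self->{map}{$key};
        return $self->{map}{$key};
    }

    # Plays the role of deserialize_xml(): register first, then follow links.
    sub deserialize {
        my ($self, $class, $import_id) = @_;
        my $real_id = $self->{next_id}++;
        $self->register_id(class => $class, id => $real_id,
                           import_id => $import_id);
        $self->map_id(class => $class, id => $_)
            for @{ $self->{links}{$import_id} || [] };
    }
}

# Story 1 links to story 2, and story 2 links back to story 1.
my $set = Toy::Set->new(1 => [2], 2 => [1]);
my $first  = $set->map_id(class => 'Toy::Story', id => 1);
my $second = $set->map_id(class => 'Toy::Story', id => 2);
print "story 1 => $first, story 2 => $second\n";
```

If deserialize() called register_id() only after following its links, the two stories above would keep triggering each other's deserialization.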
$full_path = $set->map_file(class => $class, id => $id)
Returns the full path to a file in the set that was previously added with add().
$tmp_dir = $set->dir
Get the full path to the directory where the dataset is being worked on.
As documented above, the methods in this class may throw the following exceptions:
This exception indicates that the data set failed schema validation
against the XML Schema files in schema/. This exception contains a
single field, errors, which is a hash mapping filenames inside the
data set to error messages. Note that message already contains a
reasonable textual representation of the error report.
If basic sanity checks on the archive fail then this exception will be
thrown with message set to an explanation of what went wrong.
Modules implementing deserialize_xml() can use this exception to
communicate the fact that the import didn't work. The 'message' field
must be set to a description of why the import failed.
This exception communicates to the caller of import_all() that one or
more objects failed to import. The 'message' field will describe the
problems. The 'set' field will contain a Krang::DataSet object
containing the failed objects and their dependencies. This can be
written out to a file, repaired and then reimported.
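A caller might separate these cases roughly as follows. This is a sketch under stated assumptions: the two exception packages are defined here as minimal stand-ins so the example runs without Krang, Toy::ExceptionBase and describe_failure() are hypothetical names, and reading the fields through accessor methods such as message() is assumed rather than confirmed by the text above. In real use the eval would wrap a call to $set->import_all().

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

{
    # Minimal stand-in exception classes, defined only for this sketch.
    package Toy::ExceptionBase;
    sub new     { my ($class, %fields) = @_; return bless({ %fields }, $class) }
    sub throw   { my ($class, %fields) = @_; die $class->new(%fields) }
    sub message { return $_[0]{message} }

    package Krang::DataSet::ValidationFailed;
    use parent -norequire, 'Toy::ExceptionBase';
    sub errors { return $_[0]{errors} }

    package Krang::DataSet::ImportRejects;
    use parent -norequire, 'Toy::ExceptionBase';
    sub set { return $_[0]{set} }
}

# Turn whatever was thrown into a short report for the user.
sub describe_failure {
    my ($err) = @_;
    if (blessed($err) && $err->isa('Krang::DataSet::ValidationFailed')) {
        return "invalid data set: " . $err->message;
    }
    if (blessed($err) && $err->isa('Krang::DataSet::ImportRejects')) {
        return "some objects were rejected: " . $err->message;
    }
    return "unexpected error: $err";
}

# Simulate a failing import; a real caller would run import_all() here.
my $ok = eval {
    Krang::DataSet::ImportRejects->throw(message => '2 objects failed',
                                         set     => undef);
    1;
};
my $report = $ok ? 'import succeeded' : describe_failure($@);
print "$report\n";
```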
Objects which are serialized in data-sets must support three methods:
$object->id_meth
Must return the name of the ID-returning method for the object. For example, Krang::Story returns "story_id". You must also implement the returned method, of course.
Note that id_meth() should return a value of the form "NAME_id" where
"NAME" will be used as a short-hand name for your class and must be
unique.
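A minimal sketch of this contract, using a hypothetical My::Widget class that is not part of Krang: id_meth() names the ID accessor, the accessor itself is implemented, and a caller can derive the short-hand name by stripping the "_id" suffix.

```perl
use strict;
use warnings;

{
    # Hypothetical class, used only to illustrate the id_meth() contract.
    package My::Widget;

    sub new { my ($class, %args) = @_; return bless({ %args }, $class) }

    # Name of the ID-returning method, in the form NAME_id.
    sub id_meth { return 'widget_id' }

    # The method named by id_meth() must exist as well.
    sub widget_id { return $_[0]{widget_id} }
}

my $widget = My::Widget->new(widget_id => 42);

# Look up the ID without knowing the class-specific method name in advance.
my $meth = $widget->id_meth;
my $id   = $widget->$meth;

# Derive the short-hand class name from the method name.
(my $short_name = $meth) =~ s/_id$//;

print "$short_name $id\n";
```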
$object->serialize_xml(writer => $writer, set => $set)
This call must write XML data representing the object using the
provided XML::Writer, or croak on error. This call should not write
the XML declaration or call $writer->end().
The set parameter includes the Krang::DataSet object where the
serialized object will be packaged. The object is responsible for
calling $set->add() on any objects referenced by ID in the
output XML.
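The two responsibilities described above, writing the object's XML and adding referenced objects to the set, might be sketched like this. Everything here is illustrative: Toy::Writer imitates only the startTag(), endTag() and dataElement() methods of XML::Writer (without escaping), Toy::Set merely records add() calls, and My::Widget, My::Category and the element names are hypothetical.

```perl
use strict;
use warnings;

{
    # Stand-in for XML::Writer with only the methods used below.
    package Toy::Writer;
    sub new      { my ($class) = @_; return bless({ out => '' }, $class) }
    sub startTag { my ($self, $name) = @_; $self->{out} .= "<$name>" }
    sub endTag   { my ($self, $name) = @_; $self->{out} .= "</$name>" }
    sub dataElement {
        my ($self, $name, $data) = @_;
        $self->{out} .= "<$name>$data</$name>";
    }

    # Stand-in for the data set that only records what was added.
    package Toy::Set;
    sub new { my ($class) = @_; return bless({ added => [] }, $class) }
    sub add { my ($self, %args) = @_; push @{ $self->{added} }, { %args } }

    package My::Category;
    sub new { my ($class, %args) = @_; return bless({ %args }, $class) }

    package My::Widget;
    sub new { my ($class, %args) = @_; return bless({ %args }, $class) }

    sub serialize_xml {
        my ($self, %args) = @_;
        my ($writer, $set) = @args{qw(writer set)};

        # Write the object's own data; no XML declaration, no $writer->end().
        $writer->startTag('widget');
        $writer->dataElement(widget_id   => $self->{widget_id});
        $writer->dataElement(category_id => $self->{category}{category_id});
        $writer->endTag('widget');

        # The category is referenced by ID above, so add it to the set,
        # naming this object as the source of the link.
        $set->add(object => $self->{category}, from => $self);
    }
}

my $category = My::Category->new(category_id => 3);
my $widget   = My::Widget->new(widget_id => 7, category => $category);
my $writer   = Toy::Writer->new;
my $set      = Toy::Set->new;

$widget->serialize_xml(writer => $writer, set => $set);
print $writer->{out}, "\n";
```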
$object = Krang::Foo->deserialize_xml(xml => $xml, set => $set, no_update => 0, no_uuid => 0, uuid_only => 0, skip_update => 0);
This call must instantiate a new object using the XML provided. If
no_update is false then the method should make an effort to use the
data to update an existing record if creating it as a new record would
result in an invalid duplicate.
If skip_update is true then the method should not make changes to
an existing object. Instead, it should return the object unchanged.
New objects should still be created as usual.
If no_uuid is true then UUIDs should not be used to match objects
for update. If uuid_only is true then only UUIDs should be used
to match. The default should be to prefer UUID matches and fall-back
to pre-existing keys.
This call must use $set->map_id() to request ID mappings for
linked objects (the same ones the object calls $set->add() on during
serialize_xml()). For example, Krang::Media would use this call to
translate from the category_id in the XML file into the ID to be used
by the media object:
$category_id = $set->map_id(class => "Krang::Category",
                            id    => $xml->{category_id});