ocfl.Object
OCFL Object Implementation.
This code uses PyFilesystem2 (import fs) exclusively for access to files,
with some convenience functions in ocfl.pyfs. This enables application
beyond the operating system filesystem to include mem://, zip:// and
s3:// filesystems.
- class ocfl.Object(*, identifier=None, content_directory='content', digest_algorithm='sha512', content_path_normalization='uri', spec_version='1.1', forward_delta=True, dedupe=True, lax_digests=False, fixity=None, obj_fs=None, path=None, create=False)
Class for handling OCFL Object data and operations.
Operations supported include building, updating and inspecting OCFL Objects. Also provides support for generating and updating OCFL inventories implemented via the ocfl.Inventory class.
Example use:
>>> import ocfl
>>> object = ocfl.Object(path="fixtures/1.1/good-objects/spec-ex-full")
>>> passed, validator = object.validate()
>>> passed
True
>>> validator.spec_version
"1.1"
>>> inv = object.parse_inventory()
>>> inv.digest_algorithm
'sha512'
>>> inv.version_directories
['v1', 'v2', 'v3']
>>> inv.version("v1").created
'2018-01-01T01:01:01Z'
>>> inv.version("v1").message
'Initial import'
>>> inv.version("v1").user_name
'Alice'
- identifier
id for this object
- Type:
str
- content_directory
the content directory used within this object (default “content”)
- Type:
str
- digest_algorithm
the digest algorithm used for content addressing within this object (default “sha512”)
- Type:
str
- content_path_normalization
the filepath normalization strategy to use when files are added to this object (default “uri”)
- Type:
str
- spec_version
OCFL specification version of this object
- Type:
str
- forward_delta
if True then indicates that forward delta file versioning should be used when files are added, not if False
- Type:
bool
- dedupe
if True then indicates that files are deduped within a version when files are added, not if False
- Type:
bool
- lax_digests
if True then allow digests beyond those included in the specification for fixity, and allow non-preferred digest algorithms for content references in the object. Defaults to False
- Type:
bool
- fixity
list of fixity types to add as fixity section
- Type:
list
- obj_fs
a pyfs filesystem reference for the root of this object
- Type:
io.IOBase
- __init__(*, identifier=None, content_directory='content', digest_algorithm='sha512', content_path_normalization='uri', spec_version='1.1', forward_delta=True, dedupe=True, lax_digests=False, fixity=None, obj_fs=None, path=None, create=False)
Initialize OCFL object.
- Parameters:
identifier (str) – id for this object
content_directory (str) – allow override of the default “content”
digest_algorithm (str) – allow override of the default “sha512”
content_path_normalization (str) – allow override of default “uri”
spec_version (str) – OCFL specification version
forward_delta (bool) – set False to turn off forward delta. With forward delta turned off, the same content will be repeated in a new version rather than simply being included by reference through the digest linking to the copy in the previous version
dedupe (bool) – set False to turn off dedupe within versions. With dedupe turned off, the same content will be repeated within a given version rather than one copy being included and then a reference being used from the multiple logical files
lax_digests (bool) – set True to allow digests beyond those included in the specification for fixity and to allow non-preferred digest algorithms for content references in the object
fixity (list of str) – list of fixity types to add as fixity section
obj_fs (str) – a pyfs filesystem for the root of this object
path (str) – if set then open a pyfs filesystem at path (alternative to obj_fs)
create (bool) – set True to allow opening filesystem at path to create a directory
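The effect of forward_delta and dedupe on what gets stored can be illustrated with a small standard-library sketch. This is not the library's implementation: the helper plan_version and its data shapes are hypothetical, and it assumes sha512 digests and the default "content" directory. It decides, per logical file, whether a new copy is stored or an existing copy is referenced through its digest.

```python
import hashlib


def plan_version(vdir, files, manifest, forward_delta=True, dedupe=True):
    """Hypothetical helper: record which files need a new stored copy.

    files:    dict of logical path -> bytes content for the new version
    manifest: dict of digest -> list of stored content paths (updated in place)
    Returns the version state as dict of digest -> list of logical paths.
    """
    state = {}
    prefix = vdir + "/"
    for logical_path, data in sorted(files.items()):
        digest = hashlib.sha512(data).hexdigest()
        state.setdefault(digest, []).append(logical_path)
        paths = manifest.setdefault(digest, [])
        in_earlier = any(not p.startswith(prefix) for p in paths)
        in_this = any(p.startswith(prefix) for p in paths)
        if (forward_delta and in_earlier) or (dedupe and in_this):
            continue  # refer to the existing copy instead of storing another
        paths.append(prefix + "content/" + logical_path)
    return state


# Defaults: one stored copy serves three logical files across two versions
manifest = {}
plan_version("v1", {"a.txt": b"same", "b.txt": b"same"}, manifest)
plan_version("v2", {"c.txt": b"same"}, manifest)
stored_default = sorted(p for paths in manifest.values() for p in paths)

# Both features off where they apply: every logical file gets its own copy
manifest_off = {}
plan_version("v1", {"a.txt": b"same", "b.txt": b"same"}, manifest_off, dedupe=False)
plan_version("v2", {"c.txt": b"same"}, manifest_off, forward_delta=False)
stored_off = sorted(p for paths in manifest_off.values() for p in paths)
```

With the defaults a single content path is recorded; with the features turned off three copies are recorded, which matches the parameter descriptions above.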
- add_version_with_content(objdir='', srcdir=None, metadata=None, abort_if_no_difference=False)
Update object by adding a new version with content matching srcdir.
- Parameters:
objdir (str) – sub-directory of the object filesystem that contains the object to be updated. The default is “” in which case the object is assumed to be at the filesystem root.
srcdir (str) – source directory with version sub-directories
metadata (ocfl.VersionMetadata) – object applied to all versions
abort_if_no_difference (bool) – if True, do not create a new version if the content of srcdir is the same as the latest version
- Returns:
inventory of updated object or None if no new version was created.
- Return type:
ocfl.Inventory or None
As a first step the object is validated.
If srcdir is None then the update will be just of metadata and any settings (such as using a new digest). There will be no content change between versions.
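How the library decides that srcdir is "the same as the latest version" is not spelled out here. One plausible reading, sketched below with only the standard library, is to compare a digest-to-logical-paths state built from srcdir with the state of the latest version. The helpers source_state and differs_from are hypothetical and operate on a local directory rather than a pyfs filesystem.

```python
import hashlib
import tempfile
from pathlib import Path


def source_state(srcdir, digest_algorithm="sha512"):
    """Build dict of digest -> list of logical paths for files under srcdir."""
    root = Path(srcdir)
    state = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.new(digest_algorithm, path.read_bytes()).hexdigest()
            state.setdefault(digest, []).append(path.relative_to(root).as_posix())
    return state


def differs_from(latest_state, srcdir, digest_algorithm="sha512"):
    """True if srcdir content differs from latest_state (same dict shape)."""
    def normalize(state):
        return {digest: sorted(paths) for digest, paths in state.items()}
    return normalize(source_state(srcdir, digest_algorithm)) != normalize(latest_state)


# Demonstration in a scratch directory
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "a.txt").write_bytes(b"hello")
    latest = source_state(tmp)           # pretend this is the latest version state
    unchanged = differs_from(latest, tmp)
    (Path(tmp) / "b.txt").write_bytes(b"world")
    changed = differs_from(latest, tmp)
```

Under this reading, abort_if_no_difference=True would skip creating a version whenever the comparison reports no difference.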
- build(srcdir, versions_metadata=None, objdir=None)
Build an OCFL object with multiple versions.
Will write the object to objdir if set, else just build inventory.
- Parameters:
srcdir (str) – source directory with version sub-directories.
versions_metadata (dict) – dict of VersionMetadata objects for each version, key is the integer version number. Default is None in which case no metadata is added.
objdir (str) – output directory for object (must not already exist), if not set then will just return head inventory that would have been created as a dry-run.
- Returns:
Inventory object for the last version.
- Return type:
ocfl.Inventory
See also create(…) for creating a new object with one version.
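Since srcdir holds one sub-directory per version and versions_metadata is keyed by integer version number, the source directories need to be put in numeric rather than lexical order. The sketch below assumes the sub-directories follow OCFL version naming (v1, v2, ...); the helper ordered_version_dirs is hypothetical and whether the library accepts other layouts is not stated here.

```python
import re


def ordered_version_dirs(names):
    """Return (number, name) pairs for names like v1, v2, ... in numeric order."""
    versions = []
    for name in names:
        match = re.fullmatch(r"v(\d+)", name)
        if match and int(match.group(1)) > 0:
            versions.append((int(match.group(1)), name))
    return sorted(versions)


# "notes" is ignored, and v10 sorts after v2 because the sort is numeric
ordered = ordered_version_dirs(["v10", "notes", "v2", "v1"])
```

The integer in each pair is the kind of key that versions_metadata would use for the matching version.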
- copy_into_object(src_fs, srcfile, filepath, create_dirs=False)
Copy from srcfile to filepath in object.
- create(srcdir, metadata=None, objdir=None)
Create a new OCFL object with v1 content from srcdir.
- Parameters:
srcdir (str) – source directory with content for v1
metadata (ocfl.VersionMetadata) – metadata object for v1
objdir (str) – output directory for object (must not already exist), if not set then will just return inventory for object that would have been created
- Returns:
Inventory object for the last version.
- Return type:
ocfl.Inventory
See also build(…) for building a new object with multiple versions.
- extract(objdir, version, dstdir)
Extract version from object at objdir into dstdir.
- Parameters:
objdir (str) – directory for the object
version (str) – version to be extracted (“v1”, etc.) or “head” for latest
dstdir (str) – directory to create with extracted version
- Returns:
metadata object for the version extracted.
- Return type:
ocfl.VersionMetadata
The dstdir itself may exist, but if it does then it must be empty. The parent directory of dstdir must exist.
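The destination rules for extract can be checked before calling it. The following is a minimal sketch for local filesystem paths only; check_extract_destination is a hypothetical helper, not part of the library, and the messages it returns are illustrative.

```python
import tempfile
from pathlib import Path


def check_extract_destination(dstdir):
    """Return None if dstdir satisfies the documented rules, else a reason."""
    dst = Path(dstdir)
    if not dst.parent.is_dir():
        return "parent directory of dstdir does not exist"
    if dst.exists():
        if not dst.is_dir():
            return "dstdir exists but is not a directory"
        if any(dst.iterdir()):
            return "dstdir exists but is not empty"
    return None


# Demonstration in a scratch directory
with tempfile.TemporaryDirectory() as tmp:
    fresh = check_extract_destination(Path(tmp) / "out")          # not yet created
    empty = check_extract_destination(tmp)                        # exists and is empty
    (Path(tmp) / "file.txt").write_text("x")
    occupied = check_extract_destination(tmp)                     # exists, not empty
    orphan = check_extract_destination(Path(tmp) / "no" / "out")  # parent missing
```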
- extract_file(objdir, version, dstdir, logical_path)
Extract one file from version from object at objdir into dstdir.
- Parameters:
objdir (str) – directory for the object
version (str) – version to be extracted (“v1”, etc.) or “head” for latest
dstdir (str) – directory to extract the file into
logical_path (str) – extract just one logical path into dstdir, without any path segments below dstdir
- Returns:
metadata object for the version extracted.
- Return type:
ocfl.VersionMetadata
If dstdir does not exist then it is created. The parent directory of dstdir must exist. If dstdir exists, then a file of the same name must not exist.
- id_from_inventory(failure_value='UNKNOWN-ID')
Read JSON root inventory file for this object and extract id.
- Parameters:
failure_value (str or None) – value to return if no id can be extracted. Default is “UNKNOWN-ID”
- Returns:
the id from the inventory or failure_value if none can be extracted.
- Return type:
str
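The fallback behaviour can be sketched with the standard library. This stand-in reads inventory.json from a local directory instead of going through obj_fs, and the function name read_object_id is hypothetical; it only illustrates returning failure_value when the file is missing, is not valid JSON, or has no id.

```python
import json
import tempfile
from pathlib import Path


def read_object_id(objdir, failure_value="UNKNOWN-ID"):
    """Return the "id" from objdir/inventory.json, or failure_value."""
    try:
        with open(Path(objdir) / "inventory.json", encoding="utf-8") as fh:
            inventory = json.load(fh)
        return inventory["id"]
    except (OSError, ValueError, KeyError, TypeError):
        # missing file, invalid JSON, no "id" key, or JSON that is not an object
        return failure_value


# Demonstration in a scratch directory (the id value is made up)
with tempfile.TemporaryDirectory() as tmp:
    missing = read_object_id(tmp)
    inv_file = Path(tmp) / "inventory.json"
    inv_file.write_text('{"id": "info:example/obj1"}', encoding="utf-8")
    found = read_object_id(tmp)
    inv_file.write_text("not json", encoding="utf-8")
    broken = read_object_id(tmp, failure_value=None)
```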
- object_declaration_object()
Namaste object representing the NAMASTE object declaration.
- open_obj_fs(objdir, create=False)
Open an fs filesystem for this object.
- Parameters:
objdir (str) – path string to either regular filesystem directory or else to a fs filesystem string (e.g. may be zip://.../zipfile.zip or mem://)
create (bool) – True to create path/filesystem as needed, defaults to False
- Raises:
ocfl.ObjectException – on failure to open filesystem
Sets obj_fs attribute with the filesystem instance
- parse_inventory()
Read JSON root inventory file for this object.
Will validate the inventory and normalize the digests so that the rest of the Object methods can assume correctness and matching string digests between state and manifest blocks.
- Returns:
new Inventory object for the parsed inventory.
- Return type:
ocfl.Inventory
- start_inventory()
Create inventory start with metadata from self.
- Returns:
the start of an Inventory object based on the instance data in this object
- Return type:
ocfl.Inventory
- start_new_version(*, objdir=None, srcdir='', digest_algorithm=None, fixity=None, metadata=None, carry_content_forward=True)
Start a new version to be added to this object.
- Parameters:
objdir (str or None) – sub-directory of the object filesystem that contains the object to be updated. The default is None in which case the object is assumed to be at the filesystem root of the currently open object filesystem.
srcdir (str) – the source directory
digest_algorithm (str or None) – the digest algorithm used for content addressing within the new version of this object. Default None which means use same digest algorithm as the last version
fixity (list or None) – list of fixity types to use for the fixity section of the new version. Default None which means to use the same fixity digests as the last version
carry_content_forward (bool) – True to carry forward the state from the latest version as a starting point. False to start with empty version state.
- Returns:
object where the new version will be built before finally being added with write_new_version()
- Return type:
ocfl.NewVersion
- tree(objdir)
Build human readable tree showing OCFL object at objdir.
- Parameters:
objdir (str) – object directory to examine
- Returns:
human readable string with tree of object structure
- Return type:
str
- validate(objdir=None, log_warnings=True, log_errors=True, check_digests=True)
Validate OCFL object at objdir.
- Parameters:
objdir (str) – path to object to validate
log_warnings (bool) – True (default) to include warnings in the validation log
log_errors (bool) – True (default) to include errors in the validation log
check_digests (bool) – True (default) to check content file digests in the validation process
- Returns:
(passed, validator) where passed is True if validation passed, False otherwise. validator is the ocfl.Validator object used for validation. The state in validator records validation history including validator.messages
- Return type:
tuple
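A caller typically unpacks the returned tuple and then inspects the validator. The sketch below wraps that pattern in a hypothetical helper, validation_report, which works with anything exposing the documented validate() signature. To stay self-contained it is exercised with stub classes whose message content is a placeholder, not real validator output.

```python
def validation_report(obj, objdir=None, check_digests=True):
    """Call validate() as documented and return (status, messages)."""
    passed, validator = obj.validate(objdir=objdir, log_warnings=True,
                                     log_errors=True, check_digests=check_digests)
    status = "VALID" if passed else "INVALID"
    return status, validator.messages


class StubValidator:
    """Stand-in for ocfl.Validator; messages content is a placeholder."""
    messages = "placeholder validation message"


class StubObject:
    """Stand-in for ocfl.Object that always fails validation."""
    def validate(self, objdir=None, log_warnings=True, log_errors=True,
                 check_digests=True):
        return False, StubValidator()


status, messages = validation_report(StubObject(), check_digests=False)
```

With the real library the same helper would presumably be called as validation_report(ocfl.Object(path=...)), following the example use at the top of this page.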
- validate_inventory(path, log_warnings=True, log_errors=True, force_spec_version=None)
Validate just an OCFL Object inventory at path.
- Parameters:
path (str) – path of inventory file
log_warnings (bool) – True to log warnings
log_errors (bool) – True to log errors
force_spec_version (str or None) – None to read specification version from inventory; or specific number to force validation against that specification version
- Returns:
(passed, validator) where passed is True if validation passed, False otherwise. validator is the ocfl.Validator object used for validation. The state in validator records validation history including validator.messages
- Return type:
tuple
- version_dirs_and_metadata(src_fs, versions_metadata=None)
Generate an OCFL inventory from a set of source files.
- Parameters:
src_fs (str) – pyfs filesystem of source files.
versions_metadata (dict) – dict of VersionMetadata objects for each version, key is the integer version number. Default is None in which case no metadata is added.
- Yields:
tuple – (vdir, inventory, manifest_to_srcfile) for each version in sequence, where vdir is the version directory name, inventory is the Inventory object for that version, and manifest_to_srcfile is a dictionary that maps filepaths in the manifest to actual source filepaths.
- write_inventory_and_sidecar(inventory=None, vdir='')
Write inventory and sidecar to vdir in the current object.
- Parameters:
inventory – an Inventory object to write the inventory, else None if only the sidecar should be written (default)
vdir – string of the directory name within self.obj_fs that the inventory and sidecar should be written to. Default is “”
- Returns:
the inventory sidecar filename
- Return type:
str
Assumes self.obj_fs is open for this object. Will create vdir if that does not exist. If vdir is not specified then will write to root of the object filesystem.
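For orientation, the OCFL specification defines the sidecar as a file named inventory.json.<algorithm> that holds the digest of inventory.json, whitespace, and the name inventory.json. The sketch below writes such a file with the standard library on a local directory; write_sidecar is a hypothetical helper and does not reproduce how this method works with self.obj_fs.

```python
import hashlib
import tempfile
from pathlib import Path


def write_sidecar(directory, digest_algorithm="sha512"):
    """Write the sidecar for directory/inventory.json and return its filename."""
    inv_bytes = (Path(directory) / "inventory.json").read_bytes()
    digest = hashlib.new(digest_algorithm, inv_bytes).hexdigest()
    sidecar = "inventory.json." + digest_algorithm
    line = digest + " inventory.json\n"
    (Path(directory) / sidecar).write_bytes(line.encode("utf-8"))
    return sidecar


# Demonstration with a tiny stand-in inventory (not a valid OCFL inventory)
inventory_bytes = b'{"id": "example"}'
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "inventory.json").write_bytes(inventory_bytes)
    sidecar_name = write_sidecar(tmp)
    sidecar_text = (Path(tmp) / sidecar_name).read_text(encoding="utf-8")
```

The digest must be computed over the final bytes of inventory.json, so the inventory is written first and the sidecar second.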
- write_inventory_sidecar()
Write just sidecar for this object’s already existing root inventory file.
- Returns:
the inventory sidecar filename
- Return type:
str
- write_new_version(new_version)
Update this object with the specified new version.
- Parameters:
new_version (ocfl.NewVersion) – object with new version information to be added
- Returns:
inventory of the latest version just written
- Return type:
ocfl.Inventory
- write_object_declaration()
Write NAMASTE object declaration.
Assumes self.obj_fs is open for this object and writes into the root directory of that filesystem.
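In the OCFL specification the object declaration is an empty-looking marker file in the NAMASTE style: for version 1.1 it is named 0=ocfl_object_1.1 and contains that same value followed by a newline. A minimal standard-library sketch of writing it to a local directory follows; write_declaration is a hypothetical helper, not the method documented here.

```python
import tempfile
from pathlib import Path


def write_declaration(directory, spec_version="1.1"):
    """Write the NAMASTE object declaration file and return its filename."""
    dvalue = "ocfl_object_" + spec_version
    filename = "0=" + dvalue
    # Content is the declaration value followed by a single newline
    (Path(directory) / filename).write_bytes((dvalue + "\n").encode("utf-8"))
    return filename


# Demonstration in a scratch directory
with tempfile.TemporaryDirectory() as tmp:
    declaration_name = write_declaration(tmp)
    declaration_bytes = (Path(tmp) / declaration_name).read_bytes()
```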
- class ocfl.ObjectException
Exception class for OCFL Object.