aiida.repository.backend package#
Module for file repository backend implementations.
Submodules#
Class that defines the abstract interface for an object repository.
The scope of this class is intentionally very narrow. Any backend implementation should merely provide the methods to store binary blobs, or “objects”, and return a string-based key that uniquely identifies the object that was just created. This key can then be used to retrieve the bytes of the corresponding object or to delete it.
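The contract described above can be illustrated with a minimal sketch. The class `InMemoryStore` and its method names are hypothetical stand-ins invented for this example; they are not part of AiiDA and only mirror the store, retrieve, and delete cycle.

```python
import uuid


class InMemoryStore:
    """Hypothetical flat object store, used only to illustrate the contract."""

    def __init__(self) -> None:
        self._objects = {}

    def put(self, content: bytes) -> str:
        """Store the bytes and return a newly generated string key."""
        key = uuid.uuid4().hex
        self._objects[key] = content
        return key

    def get(self, key: str) -> bytes:
        """Return the bytes stored under ``key``."""
        if key not in self._objects:
            raise FileNotFoundError(f'object with key `{key}` does not exist.')
        return self._objects[key]

    def delete(self, key: str) -> None:
        """Remove the object stored under ``key``."""
        if key not in self._objects:
            raise FileNotFoundError(f'object with key `{key}` does not exist.')
        del self._objects[key]


store = InMemoryStore()
key = store.put(b'some content')   # the key is all the caller needs to keep
content = store.get(key)           # retrieve the bytes with the key
store.delete(key)                  # and delete the object with the same key
```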
- class aiida.repository.backend.abstract.AbstractRepositoryBackend[source]#
Bases:
object
Class that defines the abstract interface for an object repository.
The repository backend only deals with raw bytes, both when creating new objects as well as when returning a stream or the content of an existing object. The encoding and decoding of the byte content should be done by the client upstream. The file repository backend is also not expected to keep any kind of file hierarchy but must be assumed to be a simple flat data store. When files are created in the file object repository, the implementation will return a string-based key with which the content of the stored object can be addressed. This key is guaranteed to be unique and persistent. Persisting the key or mapping it onto a virtual file hierarchy is again up to the client upstream.
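As a rough illustration of this division of labour, the sketch below lets the client handle text encoding and the virtual file hierarchy, while the store itself stays flat. The names `flat_store`, `put` and `virtual_tree` are assumptions made for this example and do not correspond to AiiDA objects.

```python
import uuid

# Stand-in for a flat backend: string keys map to raw bytes, with no hierarchy.
flat_store = {}


def put(content: bytes) -> str:
    """Store raw bytes and return the generated key."""
    key = uuid.uuid4().hex
    flat_store[key] = content
    return key


# The client encodes text to bytes itself and keeps its own mapping
# from virtual paths to the keys returned by the backend.
virtual_tree = {
    'inputs/parameters.txt': put('cutoff = 30'.encode('utf-8')),
    'outputs/log.txt': put('finished'.encode('utf-8')),
}

# Reading goes the other way: resolve the path to a key, then decode the bytes.
text = flat_store[virtual_tree['inputs/parameters.txt']].decode('utf-8')
```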
- __abstractmethods__ = frozenset({'_put_object_from_filelike', 'delete_objects', 'erase', 'get_info', 'has_objects', 'initialise', 'is_initialised', 'iter_object_streams', 'key_format', 'list_objects', 'maintain', 'uuid'})#
- __module__ = 'aiida.repository.backend.abstract'#
- __weakref__#
list of weak references to the object
- _abc_impl = <_abc._abc_data object>#
- delete_object(key: str) None [source]#
Delete the object from the repository.
- Parameters:
key – fully qualified identifier for the object within the repository.
- Raises:
FileNotFoundError – if the file does not exist.
OSError – if the file could not be deleted.
- abstract delete_objects(keys: List[str]) None [source]#
Delete the objects from the repository.
- Parameters:
keys – list of fully qualified identifiers for the objects within the repository.
- Raises:
FileNotFoundError – if any of the files does not exist.
OSError – if any of the files could not be deleted.
- abstract erase() None [source]#
Delete the repository itself and all its contents.
Note
This should not merely delete the contents of the repository but any resources it created. For example, if the repository is essentially a folder on disk, the folder itself should also be deleted, not just its contents.
- abstract get_info(detailed: bool = False, **kwargs) dict [source]#
Returns relevant information about the content of the repository.
- Parameters:
detailed – flag to enable extra information (detailed=False by default, only returns basic information).
- Returns:
a dictionary with the information.
- get_object_content(key: str) bytes [source]#
Return the content of an object identified by key.
- Parameters:
key – fully qualified identifier for the object within the repository.
- Raises:
FileNotFoundError – if the file does not exist.
OSError – if the file could not be opened.
- get_object_hash(key: str) str [source]#
Return the SHA-256 hash of an object stored under the given key.
Important
A SHA-256 hash should always be returned, to ensure consistency across different repository implementations.
- Parameters:
key – fully qualified identifier for the object within the repository.
- Raises:
FileNotFoundError – if the file does not exist.
OSError – if the file could not be opened.
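One way a SHA-256 digest can be computed from a byte stream without loading the whole object into memory is sketched below. The helper name `sha256_of_stream` and the chunk size are assumptions for this example; the actual backends may compute or look up the hash differently.

```python
import hashlib
import io
from typing import BinaryIO


def sha256_of_stream(handle: BinaryIO, chunk_size: int = 64 * 1024) -> str:
    """Return the hexadecimal SHA-256 digest of the bytes read from ``handle``."""
    hasher = hashlib.sha256()
    # Read fixed-size chunks until ``read`` returns an empty bytes object.
    for chunk in iter(lambda: handle.read(chunk_size), b''):
        hasher.update(chunk)
    return hasher.hexdigest()


digest = sha256_of_stream(io.BytesIO(b'abc'))
```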
- has_object(key: str) bool [source]#
Return whether the repository has an object with the given key.
- Parameters:
key – fully qualified identifier for the object within the repository.
- Returns:
True if the object exists, False otherwise.
- abstract has_objects(keys: List[str]) List[bool] [source]#
Return, for each of the given keys, whether the repository has an object with that key.
- Parameters:
keys – list of fully qualified identifiers for objects within the repository.
- Returns:
list of logicals, in the same order as the keys provided, with value True if the respective object exists and False otherwise.
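Since has_objects is abstract while has_object is not, a natural arrangement is for the single-key method to delegate to the bulk method. The sketch below shows that pattern with hypothetical classes `ToyBackend` and `DictBackend`; it is an assumption about the design, not a copy of the AiiDA implementation.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class ToyBackend(ABC):
    """Hypothetical base class: only the bulk method has to be implemented."""

    @abstractmethod
    def has_objects(self, keys: List[str]) -> List[bool]:
        """Return one boolean per key, in the same order as ``keys``."""

    def has_object(self, key: str) -> bool:
        """Convenience wrapper that delegates to the bulk method."""
        return self.has_objects([key])[0]


class DictBackend(ToyBackend):
    """Hypothetical concrete backend backed by a plain dictionary."""

    def __init__(self, objects: Dict[str, bytes]) -> None:
        self._objects = objects

    def has_objects(self, keys: List[str]) -> List[bool]:
        return [key in self._objects for key in keys]


backend = DictBackend({'a': b'1', 'c': b'3'})
flags = backend.has_objects(['a', 'b', 'c'])  # order follows the input keys
single = backend.has_object('b')
```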
- abstract initialise(**kwargs) None [source]#
Initialise the repository if it hasn’t already been initialised.
- Parameters:
kwargs – parameters for the initialisation.
- abstract iter_object_streams(keys: List[str]) Iterator[Tuple[str, BinaryIO]] [source]#
Return an iterator over the (read-only) byte streams of objects identified by key.
Note
handles should only be read within the context of this iterator.
- Parameters:
keys – fully qualified identifiers for the objects within the repository.
- Returns:
an iterator over the object byte streams.
- Raises:
FileNotFoundError – if the file does not exist.
OSError – if a file could not be opened.
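The note about reading handles only within the context of the iterator can be illustrated with a generator that closes each stream as soon as iteration moves on. The dictionary `OBJECTS` and this standalone function are assumptions for the example; real backends open their streams differently.

```python
import io
from typing import BinaryIO, Dict, Iterator, List, Tuple

# Hypothetical object store contents for this example.
OBJECTS: Dict[str, bytes] = {'k1': b'first', 'k2': b'second'}


def iter_object_streams(keys: List[str]) -> Iterator[Tuple[str, BinaryIO]]:
    """Yield ``(key, stream)`` pairs; each stream is closed after its turn."""
    for key in keys:
        if key not in OBJECTS:
            raise FileNotFoundError(f'object with key `{key}` does not exist.')
        with io.BytesIO(OBJECTS[key]) as handle:
            yield key, handle
        # The handle is closed here, once the consumer asks for the next item.


# Correct: read each handle while the iterator is positioned on it.
contents = {key: handle.read() for key, handle in iter_object_streams(['k1', 'k2'])}

# Incorrect: handles kept beyond their iteration step are already closed.
stale = [handle for _, handle in iter_object_streams(['k1'])]
```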
- abstract property key_format: str | None#
Return the format for the keys of the repository.
This is important when migrating between backends (e.g. archive -> main): if the key formats are not equal, all the Node.base.repository.metadata must be re-computed before importing, otherwise they will not match the repository.
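Why differing key formats force the metadata to be re-computed can be sketched as follows. The format names `'uuid4'` and `'sha256'`, the short keys, and the `metadata` layout are all illustrative assumptions; the real node metadata has a richer structure.

```python
import hashlib
from typing import Dict


def rekey(objects: Dict[str, bytes]) -> Dict[str, str]:
    """Map each old key to the key a content-hash-keyed store would assign."""
    return {old: hashlib.sha256(content).hexdigest() for old, content in objects.items()}


# Hypothetical source repository whose keys are opaque identifiers.
source_objects = {'3f2a': b'alpha', '9c1e': b'beta'}
source_format, target_format = 'uuid4', 'sha256'

# Hypothetical client-side metadata: virtual file name -> repository key.
metadata = {'file_a.txt': '3f2a', 'file_b.txt': '9c1e'}

if source_format != target_format:
    # The old keys mean nothing in the target store, so every reference
    # held by the client has to be rewritten to the new key.
    key_map = rekey(source_objects)
    metadata = {path: key_map[key] for path, key in metadata.items()}
```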
- abstract list_objects() Iterable[str] [source]#
Return iterable that yields all available objects by key.
- Returns:
An iterable for all the available object keys.
- abstract maintain(dry_run: bool = False, live: bool = True, **kwargs) None [source]#
Performs maintenance operations.
- Parameters:
dry_run – flag to only print the actions that would be taken without actually executing them.
live – flag to indicate to the backend whether AiiDA is live or not (i.e. if the profile of the backend is currently being used/accessed). The backend is expected then to only allow (and thus set by default) the operations that are safe to perform in this state.
- open(key: str) Iterator[BinaryIO] [source]#
Open a file handle to an object stored under the given key.
Note
this should only be used to open a handle to read an existing file. To write a new file use the method put_object_from_filelike instead.
- Parameters:
key – fully qualified identifier for the object within the repository.
- Returns:
yield a byte stream object.
- Raises:
FileNotFoundError – if the file does not exist.
OSError – if the file could not be opened.
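The `Iterator[BinaryIO]` return annotation together with "yield a byte stream object" suggests a generator-based context manager. A minimal sketch of that pattern, assuming a hypothetical `OBJECTS` dictionary and a standalone function `open_object`, could look like this:

```python
import contextlib
import io
from typing import BinaryIO, Dict, Iterator

# Hypothetical object store contents for this example.
OBJECTS: Dict[str, bytes] = {'k1': b'payload'}


@contextlib.contextmanager
def open_object(key: str) -> Iterator[BinaryIO]:
    """Yield a read-only byte stream and close it when the block exits."""
    if key not in OBJECTS:
        raise FileNotFoundError(f'object with key `{key}` does not exist.')
    handle = io.BytesIO(OBJECTS[key])
    try:
        yield handle
    finally:
        handle.close()


with open_object('k1') as handle:
    data = handle.read()
# Outside the ``with`` block the handle is closed and must not be read.
```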
- put_object_from_file(filepath: str | Path) str [source]#
Store a new object with contents of the file located at filepath on this file system.
- Parameters:
filepath – absolute path of file whose contents to copy to the repository.
- Returns:
the generated fully qualified identifier for the object within the repository.
- Raises:
TypeError – if the handle is not a byte stream.
- put_object_from_filelike(handle: BinaryIO) str [source]#
Store the byte contents of a file in the repository.
- Parameters:
handle – filelike object with the byte content to be stored.
- Returns:
the generated fully qualified identifier for the object within the repository.
- Raises:
TypeError – if the handle is not a byte stream.
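The split between the public put_object_from_filelike and the abstract _put_object_from_filelike suggests that the public method validates the handle and then delegates the actual storage. The sketch below shows that arrangement; the class `ToyBackend` and the `isinstance` check are assumptions for illustration and may differ from the check AiiDA performs.

```python
import io
import uuid
from typing import BinaryIO, Dict


class ToyBackend:
    """Hypothetical backend illustrating validate-then-delegate."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put_object_from_filelike(self, handle: BinaryIO) -> str:
        """Validate that ``handle`` is a byte stream, then store its content."""
        # Assumed check: accept only binary streams from the ``io`` hierarchy.
        if not isinstance(handle, (io.RawIOBase, io.BufferedIOBase)):
            raise TypeError(f'handle does not seem to be a byte stream: {type(handle)}.')
        return self._put_object_from_filelike(handle)

    def _put_object_from_filelike(self, handle: BinaryIO) -> str:
        """Backend-specific storage; here the bytes go into a dictionary."""
        key = uuid.uuid4().hex
        self._objects[key] = handle.read()
        return key


backend = ToyBackend()
key = backend.put_object_from_filelike(io.BytesIO(b'binary'))

rejected = False
try:
    backend.put_object_from_filelike(io.StringIO('text'))  # text stream, not bytes
except TypeError:
    rejected = True
```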
Implementation of the AbstractRepositoryBackend using the disk-objectstore as the backend.
- class aiida.repository.backend.disk_object_store.DiskObjectStoreRepositoryBackend(container: Container)[source]#
Bases:
AbstractRepositoryBackend
Implementation of the AbstractRepositoryBackend using the disk-object-store as the backend.
Note
For certain methods, the container may create a session, which should be closed after the operation is done to make sure the connection to the underlying sqlite database is closed. The best way to accomplish this is by using the container as a context manager, which automatically calls the close method on exit and thereby closes the session. Not all methods open a session and so need to close it, but to be on the safe side, every use of the container is put in a context manager. If no session is created, the close method is essentially a no-op.
- __abstractmethods__ = frozenset({})#
- __module__ = 'aiida.repository.backend.disk_object_store'#
- _abc_impl = <_abc._abc_data object>#
- _put_object_from_filelike(handle: BinaryIO) str [source]#
Store the byte contents of a file in the repository.
- Parameters:
handle – filelike object with the byte content to be stored.
- Returns:
the generated fully qualified identifier for the object within the repository.
- Raises:
TypeError – if the handle is not a byte stream.
- delete_objects(keys: List[str]) None [source]#
Delete the objects from the repository.
- Parameters:
keys – list of fully qualified identifiers for the objects within the repository.
- Raises:
FileNotFoundError – if any of the files does not exist.
OSError – if any of the files could not be deleted.
- get_info(detailed=False) Dict[str, int | str | Dict[str, int] | Dict[str, float]] [source]#
Return information on configuration and content of the repository.
- get_object_hash(key: str) str [source]#
Return the SHA-256 hash of an object stored under the given key.
Important
A SHA-256 hash should always be returned, to ensure consistency across different repository implementations.
- Parameters:
key – fully qualified identifier for the object within the repository.
- Raises:
FileNotFoundError – if the file does not exist.
- has_objects(keys: List[str]) List[bool] [source]#
Return, for each of the given keys, whether the repository has an object with that key.
- Parameters:
keys – list of fully qualified identifiers for objects within the repository.
- Returns:
list of logicals, in the same order as the keys provided, with value True if the respective object exists and False otherwise.
- initialise(**kwargs) None [source]#
Initialise the repository if it hasn’t already been initialised.
- Parameters:
kwargs – parameters for the initialisation.
- iter_object_streams(keys: List[str]) Iterator[Tuple[str, BinaryIO]] [source]#
Return an iterator over the (read-only) byte streams of objects identified by key.
Note
handles should only be read within the context of this iterator.
- Parameters:
keys – fully qualified identifiers for the objects within the repository.
- Returns:
an iterator over the object byte streams.
- Raises:
FileNotFoundError – if the file does not exist.
OSError – if a file could not be opened.
- property key_format: str | None#
Return the format for the keys of the repository.
This is important when migrating between backends (e.g. archive -> main): if the key formats are not equal, all the Node.base.repository.metadata must be re-computed before importing, otherwise they will not match the repository.
- list_objects() Iterable[str] [source]#
Return iterable that yields all available objects by key.
- Returns:
An iterable for all the available object keys.
- maintain(dry_run: bool = False, live: bool = True, pack_loose: bool | None = None, do_repack: bool | None = None, clean_storage: bool | None = None, do_vacuum: bool | None = None, compress: bool = False) dict [source]#
Performs maintenance operations.
- Parameters:
live – if True, will only perform operations that are safe to do while the repository is in use.
pack_loose – flag for forcing the packing of loose files.
do_repack – flag for forcing the re-packing of already packed files.
clean_storage – flag for forcing the cleaning of soft-deleted files from the repository.
do_vacuum – flag for forcing the vacuuming of the internal database when cleaning the repository.
compress – flag for compressing the data when packing loose files. Set to Compress.AUTO if True.
- Returns:
a dictionary with information on the operations performed.
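The `bool | None` defaults suggest that unset flags are resolved according to `live`. The sketch below shows one plausible resolution scheme, assuming purely for illustration that only packing loose files is safe while the repository is in use; the helper `resolve_flags` is hypothetical and the actual rules of the backend may differ.

```python
from typing import Dict, Optional


def resolve_flags(
    live: bool = True,
    pack_loose: Optional[bool] = None,
    do_repack: Optional[bool] = None,
    clean_storage: Optional[bool] = None,
    do_vacuum: Optional[bool] = None,
) -> Dict[str, bool]:
    """Turn ``None`` flags into concrete booleans depending on ``live``."""
    # Assumption: these operations are unsafe while the repository is in use.
    unsafe = {'do_repack': do_repack, 'clean_storage': clean_storage, 'do_vacuum': do_vacuum}

    if live:
        forced = [name for name, value in unsafe.items() if value is True]
        if forced:
            raise ValueError(f'cannot enable {", ".join(forced)} when `live=True`.')
        resolved = {name: False for name in unsafe}
    else:
        # Offline: run everything unless the caller explicitly disabled it.
        resolved = {name: True if value is None else value for name, value in unsafe.items()}

    resolved['pack_loose'] = True if pack_loose is None else pack_loose
    return resolved


live_flags = resolve_flags(live=True)
offline_flags = resolve_flags(live=False, do_vacuum=False)
```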
- open(key: str) Iterator[BinaryIO] [source]#
Open a file handle to an object stored under the given key.
Note
this should only be used to open a handle to read an existing file. To write a new file use the method put_object_from_filelike instead.
- Parameters:
key – fully qualified identifier for the object within the repository.
- Returns:
yield a byte stream object.
- Raises:
FileNotFoundError – if the file does not exist.
OSError – if the file could not be opened.
Implementation of the AbstractRepositoryBackend using a sandbox folder on disk as the backend.
- class aiida.repository.backend.sandbox.SandboxRepositoryBackend(filepath: str | None = None)[source]#
Bases:
AbstractRepositoryBackend
Implementation of the AbstractRepositoryBackend using a sandbox folder on disk as the backend.
- __abstractmethods__ = frozenset({})#
- __init__(filepath: str | None = None)[source]#
Construct a new instance.
- Parameters:
filepath – The path to the directory in which the sandbox folder should be created.
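The idea of a sandbox folder created inside a given directory can be sketched with the standard library. The class `ToySandbox`, its method names, and the use of `uuid4` keys are assumptions made for this example and are not the AiiDA implementation.

```python
import tempfile
import uuid
from pathlib import Path
from typing import Optional


class ToySandbox:
    """Hypothetical sandbox store: one file per object in a temporary folder."""

    def __init__(self, filepath: Optional[str] = None) -> None:
        # With ``filepath=None`` the system default temporary location is used.
        self._tmp = tempfile.TemporaryDirectory(dir=filepath)
        self.path = Path(self._tmp.name)

    def put(self, content: bytes) -> str:
        """Write the bytes to a new file and return its name as the key."""
        key = uuid.uuid4().hex
        (self.path / key).write_bytes(content)
        return key

    def get(self, key: str) -> bytes:
        """Return the bytes of the file stored under ``key``."""
        return (self.path / key).read_bytes()

    def erase(self) -> None:
        """Delete the sandbox folder itself, not just its contents."""
        self._tmp.cleanup()


sandbox = ToySandbox()
key = sandbox.put(b'scratch data')
content = sandbox.get(key)
existed = sandbox.path.exists()
sandbox.erase()
```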
- __module__ = 'aiida.repository.backend.sandbox'#
- _abc_impl = <_abc._abc_data object>#
- _put_object_from_filelike(handle: BinaryIO) str [source]#
Store the byte contents of a file in the repository.
- Parameters:
handle – filelike object with the byte content to be stored.
- Returns:
the generated fully qualified identifier for the object within the repository.
- Raises:
TypeError – if the handle is not a byte stream.
- delete_objects(keys: list[str]) None [source]#
Delete the objects from the repository.
- Parameters:
keys – list of fully qualified identifiers for the objects within the repository.
- Raises:
FileNotFoundError – if any of the files does not exist.
OSError – if any of the files could not be deleted.
- get_info(detailed: bool = False, **kwargs) dict [source]#
Returns relevant information about the content of the repository.
- Parameters:
detailed – flag to enable extra information (detailed=False by default, only returns basic information).
- Returns:
a dictionary with the information.
- has_objects(keys: list[str]) list[bool] [source]#
Return, for each of the given keys, whether the repository has an object with that key.
- Parameters:
keys – list of fully qualified identifiers for objects within the repository.
- Returns:
list of logicals, in the same order as the keys provided, with value True if the respective object exists and False otherwise.
- initialise(**kwargs) None [source]#
Initialise the repository if it hasn’t already been initialised.
- Parameters:
kwargs – parameters for the initialisation.
- iter_object_streams(keys: list[str]) Iterator[tuple[str, BinaryIO]] [source]#
Return an iterator over the (read-only) byte streams of objects identified by key.
Note
handles should only be read within the context of this iterator.
- Parameters:
keys – fully qualified identifiers for the objects within the repository.
- Returns:
an iterator over the object byte streams.
- Raises:
FileNotFoundError – if the file does not exist.
OSError – if a file could not be opened.
- property key_format: str | None#
Return the format for the keys of the repository.
This is important when migrating between backends (e.g. archive -> main): if the key formats are not equal, all the Node.base.repository.metadata must be re-computed before importing, otherwise they will not match the repository.
- list_objects() Iterable[str] [source]#
Return iterable that yields all available objects by key.
- Returns:
An iterable for all the available object keys.
- maintain(dry_run: bool = False, live: bool = True, **kwargs) None [source]#
Performs maintenance operations.
- Parameters:
dry_run – flag to only print the actions that would be taken without actually executing them.
live – flag to indicate to the backend whether AiiDA is live or not (i.e. if the profile of the backend is currently being used/accessed). The backend is expected then to only allow (and thus set by default) the operations that are safe to perform in this state.
- open(key: str) Iterator[BinaryIO] [source]#
Open a file handle to an object stored under the given key.
Note
this should only be used to open a handle to read an existing file. To write a new file use the method put_object_from_filelike instead.
- Parameters:
key – fully qualified identifier for the object within the repository.
- Returns:
yield a byte stream object.
- Raises:
FileNotFoundError – if the file does not exist.
OSError – if the file could not be opened.
- property sandbox#
Return the sandbox instance of this repository.