aiida.engine.processes.calcjobs package¶
Module for the CalcJob process and related utilities.
-
class
aiida.engine.processes.calcjobs.
CalcJob
(*args, **kwargs)[source]¶ Bases:
aiida.engine.processes.process.Process
Implementation of the CalcJob process.
-
_Process__called
= True¶
-
__abstractmethods__
= frozenset({})¶
-
__init__
(*args, **kwargs)[source]¶ Construct a CalcJob instance.
Construct the instance only if it is a subclass of CalcJob, otherwise raise InvalidOperation.
See documentation of
aiida.engine.Process
.
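The subclass check described above can be illustrated with a minimal sketch. It does not reproduce the AiiDA implementation: `BaseJob` and `MyJob` are hypothetical names, and `InvalidOperation` is a local stand-in for the exception named in the text.

```python
class InvalidOperation(Exception):
    """Local stand-in for the InvalidOperation exception named above."""


class BaseJob:
    """Hypothetical base class that may only be constructed through a subclass."""

    def __init__(self, *args, **kwargs):
        # Refuse to construct the base class itself; subclasses pass this check.
        if self.__class__ is BaseJob:
            raise InvalidOperation('cannot construct the base class directly; use a subclass')
        self.args = args
        self.kwargs = kwargs


class MyJob(BaseJob):
    """Concrete subclass, which is allowed to be constructed."""


job = MyJob(1, x=2)

try:
    BaseJob()
    base_rejected = False
except InvalidOperation:
    base_rejected = True
```

Checking `self.__class__` in the constructor keeps the guard in one place, so subclasses do not need to repeat it.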
-
__module__
= 'aiida.engine.processes.calcjobs.calcjob'¶
-
_abc_impl
= <_abc_data object>¶
-
_node_class
¶ alias of
aiida.orm.nodes.process.calculation.calcjob.CalcJobNode
-
_spec
= <aiida.engine.processes.process_spec.CalcJobProcessSpec object>¶
-
_spec_class
¶ alias of
aiida.engine.processes.process_spec.CalcJobProcessSpec
-
link_label_retrieved
= 'retrieved'¶
-
on_terminated
()[source]¶ Clean up the node by deleting the calculation job state.
Note
This has to be done before calling the super, because that will seal the node, after which it can no longer be changed.
-
property
options
¶ Return the options of the metadata that were specified when this process instance was launched.
- Returns
options dictionary
- Return type
dict
-
parse
(retrieved_temporary_folder=None)[source]¶ Parse a retrieved job calculation.
This is called once the process has finished waiting for the calculation to complete and the data has been retrieved.
-
presubmit
(folder)[source]¶ Prepares the calculation folder with all inputs, ready to be copied to the cluster.
- Parameters
folder (
aiida.common.folders.Folder
) – a SandboxFolder that can be used to write calculation input files and the scheduling script.
- Returns
the CalcInfo object containing the information needed by the daemon to handle operations.
- Return type
CalcInfo
-
run
()[source]¶ Run the calculation job.
This means invoking the presubmit and storing the temporary folder in the node’s repository. Then we move the process into the Wait state, waiting for the UPLOAD transport task to be started.
-
spec_options
= <aiida.engine.processes.ports.PortNamespace object>¶
-
Submodules¶
Implementation of the CalcJob process.
-
class
aiida.engine.processes.calcjobs.calcjob.
CalcJob
(*args, **kwargs)[source]¶ Bases:
aiida.engine.processes.process.Process
Implementation of the CalcJob process.
-
_Process__called
= True¶
-
__abstractmethods__
= frozenset({})¶
-
__init__
(*args, **kwargs)[source]¶ Construct a CalcJob instance.
Construct the instance only if it is a subclass of CalcJob, otherwise raise InvalidOperation.
See documentation of
aiida.engine.Process
.
-
__module__
= 'aiida.engine.processes.calcjobs.calcjob'¶
-
_abc_impl
= <_abc_data object>¶
-
_node_class
¶ alias of
aiida.orm.nodes.process.calculation.calcjob.CalcJobNode
-
_spec
= <aiida.engine.processes.process_spec.CalcJobProcessSpec object>¶
-
_spec_class
¶ alias of
aiida.engine.processes.process_spec.CalcJobProcessSpec
-
link_label_retrieved
= 'retrieved'¶
-
on_terminated
()[source]¶ Clean up the node by deleting the calculation job state.
Note
This has to be done before calling the super, because that will seal the node, after which it can no longer be changed.
-
property
options
¶ Return the options of the metadata that were specified when this process instance was launched.
- Returns
options dictionary
- Return type
dict
-
parse
(retrieved_temporary_folder=None)[source]¶ Parse a retrieved job calculation.
This is called once the process has finished waiting for the calculation to complete and the data has been retrieved.
-
presubmit
(folder)[source]¶ Prepares the calculation folder with all inputs, ready to be copied to the cluster.
- Parameters
folder (
aiida.common.folders.Folder
) – a SandboxFolder that can be used to write calculation input files and the scheduling script.
- Returns
the CalcInfo object containing the information needed by the daemon to handle operations.
- Return type
CalcInfo
-
run
()[source]¶ Run the calculation job.
This means invoking the presubmit and storing the temporary folder in the node’s repository. Then we move the process into the Wait state, waiting for the UPLOAD transport task to be started.
-
spec_options
= <aiida.engine.processes.ports.PortNamespace object>¶
-
Module containing utilities and classes relating to job calculations running on systems that require transport.
-
class
aiida.engine.processes.calcjobs.manager.
JobsList
(authinfo, transport_queue, last_updated=None)[source]¶ Bases:
object
Manager of calculation jobs submitted with a specific AuthInfo, i.e. a computer configured for a specific user.
This container of active calculation jobs is used to update their status periodically in batches, ensuring that even when a lot of jobs are running, the scheduler update command is not triggered for each job individually.
In addition, the Computer for which the AuthInfo is configured can define a minimum polling interval. This class guarantees that the time between update calls to the scheduler is larger than or equal to that minimum interval.
Note that since each instance operates on a specific authinfo, the guarantees of batching scheduler update calls and of limiting the number of calls per unit time, through the minimum polling interval, apply only to jobs launched with that particular authinfo. If multiple authinfo instances for the same computer have active jobs, these limitations are not respected between them, since there is no communication between JobsList instances. See the JobManager for example usage.
-
__init__
(authinfo, transport_queue, last_updated=None)[source]¶ Construct an instance for the given authinfo and transport queue.
- Parameters
authinfo (aiida.orm.AuthInfo) – The authinfo used to check the jobs list
transport_queue – A transport queue
last_updated (float) – initialize the last updated timestamp
-
__module__
= 'aiida.engine.processes.calcjobs.manager'¶
-
__weakref__
¶ list of weak references to the object (if defined)
-
_ensure_updating
()[source]¶ Ensure that we are updating the job list from the remote resource.
This will automatically stop if there are no outstanding requests.
-
_get_jobs_from_scheduler
()[source]¶ Get the current jobs list from the scheduler.
- Returns
a mapping of job ids to JobInfo instances
- Return type
dict
-
_get_jobs_with_scheduler
()[source]¶ Get all the jobs that are currently with the scheduler.
- Returns
the list of jobs with the scheduler
- Return type
list
-
_get_next_update_delay
()[source]¶ Calculate when we are next allowed to poll the scheduler.
This delay is calculated as the minimum polling interval defined by the authentication info for this instance, minus time elapsed since the last update.
- Returns
delay (in seconds) after which the scheduler may be polled again
- Return type
float
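The delay calculation described above amounts to a single subtraction. The following is a minimal sketch, not the library's code: the function name, its explicit `now` parameter, and the clamping of negative values to zero are assumptions made for illustration.

```python
import time


def next_update_delay(minimum_interval, last_updated, now=None):
    """Return the number of seconds to wait before polling the scheduler again.

    ``minimum_interval`` and ``last_updated`` are in seconds, the latter as
    produced by ``time.time()``. ``now`` can be passed explicitly for testing.
    """
    if now is None:
        now = time.time()
    elapsed = now - last_updated
    # Assumption: never return a negative delay once the interval has passed.
    return max(0.0, minimum_interval - elapsed)


# Four seconds after the last update, with a 10 second minimum interval.
delay_early = next_update_delay(10.0, last_updated=100.0, now=104.0)  # 6.0
# Thirty seconds after the last update, polling is allowed immediately.
delay_late = next_update_delay(10.0, last_updated=100.0, now=130.0)   # 0.0
```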
-
static
_has_job_state_changed
(old, new)[source]¶ Return whether the states old and new are different.
- Return type
bool
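One plausible way to compare two job records is sketched below. This is an assumption about the comparison, not the documented behaviour: `JobRecord` is a hypothetical stand-in with only two fields, and treating a missing record as a change is a choice made for this example.

```python
from collections import namedtuple

# Hypothetical stand-in for a scheduler job record; the real JobInfo has more fields.
JobRecord = namedtuple('JobRecord', ['job_id', 'job_state'])


def has_job_state_changed(old, new):
    """Return True when ``old`` and ``new`` describe different job states."""
    if old is None and new is None:
        return False
    if old is None or new is None:
        # The job appeared in, or disappeared from, the scheduler listing.
        return True
    return old.job_state != new.job_state


queued = JobRecord('1', 'QUEUED')
running = JobRecord('1', 'RUNNING')
```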
-
_update_job_info
()[source]¶ Update all of the job information objects.
This will set the futures for all pending update requests where the corresponding job has a new status compared to the last update.
-
get_minimum_update_interval
()[source]¶ Get the minimum interval that should be respected between updates of the list.
- Returns
the minimum interval
- Return type
float
-
property
last_updated
¶ Get the timestamp of when the list was last updated as produced by time.time()
- Returns
The last update point
- Return type
float
-
property
logger
¶ Return the logger configured for this instance.
- Returns
the logger
-
request_job_info_update
(job_id)[source]¶ Request job info about a job when the job next changes state.
If the job is not found in the jobs list at the update, the future will resolve to None.
- Parameters
job_id – job identifier
- Returns
future that will resolve to a JobInfo object when the job changes state
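The batching idea behind request_job_info_update can be sketched with plain futures. This is a simplified illustration under stated assumptions, not the AiiDA implementation: `TinyJobsList`, its `update` method, and the `poll_scheduler` callable are hypothetical, and `concurrent.futures.Future` is used instead of the event-loop futures of the real class.

```python
from concurrent.futures import Future


class TinyJobsList:
    """Hypothetical, simplified stand-in for a per-authinfo jobs list."""

    def __init__(self, poll_scheduler):
        self._poll_scheduler = poll_scheduler  # callable returning {job_id: state}
        self._known = {}     # states seen at the previous poll
        self._requests = {}  # job_id -> Future awaiting a state change
        self.poll_count = 0

    def request_job_info_update(self, job_id):
        """Return a future that is resolved when the job next changes state."""
        return self._requests.setdefault(job_id, Future())

    def update(self):
        """Poll the scheduler once and resolve the futures of all changed jobs."""
        self.poll_count += 1
        current = self._poll_scheduler()
        for job_id in list(self._requests):
            new = current.get(job_id)  # None when the scheduler no longer lists the job
            if new != self._known.get(job_id):
                self._requests.pop(job_id).set_result(new)
        self._known = current


states = {'1': 'QUEUED', '2': 'QUEUED'}
jobs = TinyJobsList(lambda: dict(states))
jobs.update()                      # first poll, no requests yet

fut1 = jobs.request_job_info_update('1')
fut2 = jobs.request_job_info_update('2')
states['1'] = 'RUNNING'
jobs.update()                      # one poll serves both requests
fut1_done_after_second_poll = fut1.done()
fut2_done_after_second_poll = fut2.done()

del states['2']                    # job 2 disappears from the scheduler listing
jobs.update()
```

A single call to `update` serves every outstanding request, which is the point of batching: the number of scheduler calls does not grow with the number of jobs being watched.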
-
-
class
aiida.engine.processes.calcjobs.manager.
JobManager
(transport_queue)[source]¶ Bases:
object
A manager for CalcJob processes submitted to Computer instances.
When a calculation job is submitted to a Computer, it actually uses a specific AuthInfo, which is a computer configured for a User. The JobManager maintains a mapping of JobsList instances for each authinfo that has active calculation jobs. These JobsList instances are then responsible for bundling scheduler updates for all the jobs they maintain (i.e. that all share the same authinfo) and updating their status.
As long as a Runner creates a single JobManager instance and uses it for its lifetime, the guarantees made by the JobsList about respecting the minimum polling interval of the scheduler will be maintained. Note, however, that since each Runner creates its own job manager, these guarantees only hold per runner.
-
__module__
= 'aiida.engine.processes.calcjobs.manager'¶
-
__weakref__
¶ list of weak references to the object (if defined)
-
get_jobs_list
(authinfo)[source]¶ Get or create a JobsList instance for the given authinfo.
- Parameters
authinfo – the AuthInfo
- Returns
a JobsList instance
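The get-or-create behaviour can be sketched with a dictionary. The names below are hypothetical, and keying the mapping by an integer identifier of the authinfo is an assumption of this example rather than a statement about the library.

```python
class JobsListStub:
    """Hypothetical placeholder for a per-authinfo jobs list."""

    def __init__(self, authinfo_id):
        self.authinfo_id = authinfo_id


class TinyJobManager:
    """Hypothetical manager keeping one jobs list per authinfo."""

    def __init__(self):
        self._jobs_lists = {}

    def get_jobs_list(self, authinfo_id):
        """Return the jobs list for ``authinfo_id``, creating it on first use."""
        if authinfo_id not in self._jobs_lists:
            self._jobs_lists[authinfo_id] = JobsListStub(authinfo_id)
        return self._jobs_lists[authinfo_id]


manager = TinyJobManager()
first = manager.get_jobs_list(1)
again = manager.get_jobs_list(1)
other = manager.get_jobs_list(2)
```

Returning the same instance for repeated lookups is what lets all jobs of one authinfo share a single polling schedule.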
-
request_job_info_update
(authinfo, job_id)[source]¶ Get a future that will resolve to information about a given job.
This is a context manager so that if the user leaves the context the request is automatically cancelled.
- Returns
A tuple containing the JobInfo object and detailed job info. Both can be None.
- Return type
tornado.concurrent.Future
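The cancel-on-exit behaviour of such a context manager can be sketched as follows. This is an illustration only: `request_update` and the `pending` dictionary are hypothetical, and a `concurrent.futures.Future` stands in for the tornado future mentioned above.

```python
import contextlib
from concurrent.futures import Future


@contextlib.contextmanager
def request_update(pending, job_id):
    """Register a pending request and cancel it when the context is left."""
    future = Future()
    pending[job_id] = future
    try:
        yield future
    finally:
        # Leaving the context always withdraws the request.
        pending.pop(job_id, None)
        if not future.done():
            future.cancel()


pending = {}

# Leaving the context before a result arrives cancels the request.
with request_update(pending, '42') as abandoned:
    registered_inside = '42' in pending

# A request that was resolved inside the context keeps its result.
with request_update(pending, '43') as resolved:
    resolved.set_result('DONE')
```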
-
-
class
aiida.engine.processes.calcjobs.tasks.
Waiting
(process, done_callback, msg=None, data=None)[source]¶ Bases:
plumpy.process_states.Waiting
The waiting state for the CalcJob process.
-
__init__
(process, done_callback, msg=None, data=None)[source]¶ - Parameters
state_machine (
StateMachine
) – The process this state belongs to
-
__module__
= 'aiida.engine.processes.calcjobs.tasks'¶
-
execute
()[source]¶ Execute the state, performing the actions that this state is responsible for. Return a state to transition to or None if finished.
-
interrupt
(reason)[source]¶ Interrupt the Waiting state by calling interrupt on the transport task InterruptableFuture.
-
-
aiida.engine.processes.calcjobs.tasks.
task_kill_job
(node, transport_queue, cancellable)[source]¶ Transport task that will attempt to kill a job calculation.
The task will first request a transport from the queue. Once the transport is yielded, the relevant execmanager function is called, wrapped in the exponential_backoff_retry coroutine. If an exception is caught, the coroutine retries after an interval that increases exponentially with the number of retries, up to a maximum number of retries. If all retries fail, the task raises a TransportTaskException.
- Parameters
node – the node that represents the job calculation
transport_queue – the TransportQueue from which to request a Transport
cancellable (
aiida.engine.utils.InterruptableFuture
) – the cancelled flag that will be queried to determine whether the task was cancelled
- Raises
Return if the task was successfully completed
- Raises
TransportTaskException if after the maximum number of retries the transport task still excepted
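The retry behaviour shared by the transport tasks can be sketched as below. This is not the library's `exponential_backoff_retry`, whose signature is not reproduced here: `retry_with_backoff`, its parameters, and the use of `asyncio` are assumptions for illustration, and `TransportTaskException` is a local stand-in.

```python
import asyncio


class TransportTaskException(Exception):
    """Local stand-in for the exception raised when all retries have failed."""


async def retry_with_backoff(coro_factory, initial_interval=0.01, max_attempts=3):
    """Await ``coro_factory()`` until it succeeds, doubling the wait after each failure."""
    interval = initial_interval
    for attempt in range(1, max_attempts + 1):
        try:
            return await coro_factory()
        except Exception as exception:
            if attempt == max_attempts:
                raise TransportTaskException(
                    f'failed after {max_attempts} attempts') from exception
            await asyncio.sleep(interval)
            interval *= 2  # exponential growth of the waiting time


attempts = []


async def flaky_kill():
    """Hypothetical transport operation that fails twice before succeeding."""
    attempts.append(len(attempts) + 1)
    if len(attempts) < 3:
        raise OSError('connection dropped')
    return True


async def always_failing():
    """Hypothetical transport operation that never succeeds."""
    raise OSError('host unreachable')


result = asyncio.run(retry_with_backoff(flaky_kill))
```

Wrapping the final failure in a dedicated exception lets the caller distinguish "gave up after retrying" from an error raised on the first attempt.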
-
aiida.engine.processes.calcjobs.tasks.
task_retrieve_job
(node, transport_queue, retrieved_temporary_folder, cancellable)[source]¶ Transport task that will attempt to retrieve all files of a completed job calculation.
The task will first request a transport from the queue. Once the transport is yielded, the relevant execmanager function is called, wrapped in the exponential_backoff_retry coroutine. If an exception is caught, the coroutine retries after an interval that increases exponentially with the number of retries, up to a maximum number of retries. If all retries fail, the task raises a TransportTaskException.
- Parameters
node – the node that represents the job calculation
transport_queue – the TransportQueue from which to request a Transport
cancellable (
aiida.engine.utils.InterruptableFuture
) – the cancelled flag that will be queried to determine whether the task was cancelled
- Raises
Return if the task was successfully completed
- Raises
TransportTaskException if after the maximum number of retries the transport task still excepted
-
aiida.engine.processes.calcjobs.tasks.
task_submit_job
(node, transport_queue, calc_info, script_filename, cancellable)[source]¶ Transport task that will attempt to submit a job calculation.
The task will first request a transport from the queue. Once the transport is yielded, the relevant execmanager function is called, wrapped in the exponential_backoff_retry coroutine. If an exception is caught, the coroutine retries after an interval that increases exponentially with the number of retries, up to a maximum number of retries. If all retries fail, the task raises a TransportTaskException.
- Parameters
node – the node that represents the job calculation
transport_queue – the TransportQueue from which to request a Transport
calc_info – the calculation info datastructure returned by CalcJobNode._presubmit
script_filename – the job launch script returned by CalcJobNode._presubmit
cancellable (
aiida.engine.utils.InterruptableFuture
) – the cancelled flag that will be queried to determine whether the task was cancelled
- Raises
Return if the task was successfully completed
- Raises
TransportTaskException if after the maximum number of retries the transport task still excepted
-
aiida.engine.processes.calcjobs.tasks.
task_update_job
(node, job_manager, cancellable)[source]¶ Transport task that will attempt to update the scheduler status of the job calculation.
The task will first request a transport from the queue. Once the transport is yielded, the relevant execmanager function is called, wrapped in the exponential_backoff_retry coroutine. If an exception is caught, the coroutine retries after an interval that increases exponentially with the number of retries, up to a maximum number of retries. If all retries fail, the task raises a TransportTaskException.
- Parameters
node (aiida.orm.nodes.process.calculation.calcjob.CalcJobNode) – the node that represents the job calculation
job_manager (aiida.engine.processes.calcjobs.manager.JobManager) – The job manager
cancellable (aiida.engine.utils.InterruptableFuture) – A cancel flag
- Raises
Return containing True if the task was successfully completed, False otherwise
-
aiida.engine.processes.calcjobs.tasks.
task_upload_job
(node, transport_queue, calc_info, script_filename, cancellable)[source]¶ Transport task that will attempt to upload the files of a job calculation to the remote.
The task will first request a transport from the queue. Once the transport is yielded, the relevant execmanager function is called, wrapped in the exponential_backoff_retry coroutine. If an exception is caught, the coroutine retries after an interval that increases exponentially with the number of retries, up to a maximum number of retries. If all retries fail, the task raises a TransportTaskException.
- Parameters
node – the node that represents the job calculation
transport_queue – the TransportQueue from which to request a Transport
calc_info – the calculation info datastructure returned by CalcJobNode._presubmit
script_filename – the job launch script returned by CalcJobNode._presubmit
cancellable (
aiida.engine.utils.InterruptableFuture
) – the cancelled flag that will be queried to determine whether the task was cancelled
- Raises
Return if the task was successfully completed
- Raises
TransportTaskException if after the maximum number of retries the transport task still excepted