aiida.schedulers.plugins package#

Submodules#

Job scheduler that is interacted with through a CLI in bash.

class aiida.schedulers.plugins.bash.BashCliScheduler[source]#

Bases: Scheduler

Job scheduler that is interacted with through a CLI in bash.

__abstractmethods__ = frozenset({'_get_joblist_command', '_get_kill_command', '_get_submit_command', '_get_submit_script_header', '_parse_joblist_output', '_parse_kill_output', '_parse_submit_output'})#
__module__ = 'aiida.schedulers.plugins.bash'#
_abc_impl = <_abc._abc_data object>#
abstract _get_joblist_command(jobs: list[str] | None = None, user: str | None = None) → str[source]#

Return the command to get the most complete description possible of currently active jobs.

Note

Typically one can pass only either jobs or user, depending on the specific plugin. The choice can be made according to the value returned by self.get_feature('can_query_by_user').

Parameters:
  • jobs – either None to get a list of all jobs on the machine, or a list of jobs to query.

  • user – either None, or a string with the username (to show only jobs of the specific user).

abstract _get_kill_command(jobid: str) → str[source]#

Return the command to kill the job with specified jobid.

abstract _get_submit_command(submit_script: str) → str[source]#

Return the string to execute to submit a given script.

Warning

the submit_script should already have been bash-escaped

Parameters:

submit_script – the path of the submit script relative to the working directory.

Returns:

the string to execute to submit a given script.

abstract _parse_joblist_output(retval: int, stdout: str, stderr: str) → list[JobInfo][source]#

Parse the joblist output as returned by executing the command returned by the _get_joblist_command method.

Returns:

list of JobInfo objects, one for each job, each with at least its default parameters implemented.

abstract _parse_kill_output(retval: int, stdout: str, stderr: str) → bool[source]#

Parse the output of the kill command.

Returns:

True if everything seems ok, False otherwise.

abstract _parse_submit_output(retval: int, stdout: str, stderr: str) → str | ExitCode[source]#

Parse the output of the submit command returned by calling the _get_submit_command method.

Returns:

a string with the job ID or an exit code if the submission failed because the submission script is invalid and the job should be terminated.

get_jobs(jobs: list[str] | None = None, user: str | None = None, as_dict: bool = False) → list[JobInfo] | dict[str, JobInfo][source]#

Return the list of currently active jobs.

Parameters:
  • jobs – A list of jobs to check; only these are checked.

  • user – A string with a user: only jobs of this user are checked.

  • as_dict – If False (default), a list of JobInfo objects is returned. If True, a dictionary is returned, where the job_id is the key and the values are the JobInfo objects.

Returns:

List of active jobs.

kill_job(jobid: str) → bool[source]#

Kill a remote job and parse the return value of the scheduler to check if the command succeeded.

Note

On some schedulers, even if the command is accepted, it may take some seconds for the job to actually disappear from the queue.

Parameters:

jobid – the job ID to be killed

Returns:

True if everything seems ok, False otherwise.

submit_job(working_directory: str, filename: str) → str | ExitCode[source]#

Submit a job.

Parameters:
  • working_directory – The absolute filepath to the working directory where the job is to be executed.

  • filename – The filename of the submission script relative to the working directory.

Plugin for direct execution.

class aiida.schedulers.plugins.direct.DirectJobResource(**kwargs)[source]#

Bases: NodeNumberJobResource

An implementation of JobResource for direct execution, bypassing schedulers.

__abstractmethods__ = frozenset({})#
__module__ = 'aiida.schedulers.plugins.direct'#
_abc_impl = <_abc._abc_data object>#
classmethod accepts_default_memory_per_machine()[source]#

Return True if this subclass accepts a default_memory_per_machine key, False otherwise.

num_cores_per_machine: int#
num_cores_per_mpiproc: int#
num_machines: int#
num_mpiprocs_per_machine: int#
class aiida.schedulers.plugins.direct.DirectScheduler[source]#

Bases: BashCliScheduler

Support for direct execution, bypassing schedulers.

__abstractmethods__ = frozenset({})#
__module__ = 'aiida.schedulers.plugins.direct'#
_abc_impl = <_abc._abc_data object>#
_convert_time(string)[source]#

Convert a string in the format HH:MM:SS to a number of seconds.
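
The conversion this method performs can be sketched with plain Python (an illustration of the semantics, not the plugin's actual implementation):

```python
def hhmmss_to_seconds(value: str) -> int:
    """Convert a wallclock string in HH:MM:SS format to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in value.split(':'))
    return hours * 3600 + minutes * 60 + seconds

print(hhmmss_to_seconds('01:30:00'))  # 5400
```
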

_features: dict[str, bool] = {'can_query_by_user': True}#
_get_joblist_command(jobs=None, user=None)[source]#

The command to report full information on existing jobs.

TODO: in the case of job arrays, decide what to do (i.e., whether we want to pass the -t option to list each subjob).

_get_kill_command(jobid)[source]#

Return the command to kill the job with specified jobid.

_get_submit_command(submit_script)[source]#

Return the string to execute to submit a given script.

Note

One needs to redirect stdout and stderr to /dev/null, otherwise the daemon remains hanging while waiting for the script to run.

Parameters:

submit_script – the path of the submit script relative to the working directory. IMPORTANT: submit_script should already be escaped.
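
To make the note above concrete, here is a sketch of the kind of command string a direct-execution submit could produce; the exact command is an assumption for illustration, not necessarily what the plugin emits:

```python
def build_direct_submit_command(submit_script: str) -> str:
    """Background the (already escaped) script, redirect its output to
    /dev/null so the daemon does not hang, and echo the PID as the job id."""
    return f'bash -e {submit_script} > /dev/null 2>&1 & echo $!'

print(build_direct_submit_command('submit.sh'))
```
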

_get_submit_script_header(job_tmpl)[source]#

Return the submit script header, using the parameters from the job_tmpl.

Args:#

job_tmpl: a JobTemplate instance with relevant parameters set.

_job_resource_class#

alias of DirectJobResource

_logger = <Logger aiida.scheduler.direct (WARNING)>#
_parse_joblist_output(retval, stdout, stderr)[source]#

Parse the queue output string, as returned by executing the command returned by the _get_joblist_command method (qstat -f).

Return a list of JobInfo objects, one for each job, with the relevant parameters implemented.

Note

Depending on the scheduler configuration, finished jobs may either appear here, or not. This function will only return one element for each job found in the qstat output; missing jobs (for whatever reason) simply will not appear here.

_parse_kill_output(retval, stdout, stderr)[source]#

Parse the output of the kill command.

To be implemented by the plugin.

Returns:

True if everything seems ok, False otherwise.

_parse_submit_output(retval, stdout, stderr)[source]#

Parse the output of the submit command, as returned by executing the command returned by the _get_submit_command method.

To be implemented by the plugin.

Return a string with the JobID.

get_jobs(jobs=None, user=None, as_dict=False)[source]#

Overrides the parent method in order to list missing processes as DONE.

Plugin for LSF. This has been tested on the CERN lxplus cluster (LSF 9.1.3)

class aiida.schedulers.plugins.lsf.LsfJobResource(**kwargs)[source]#

Bases: JobResource

An implementation of JobResource for LSF, that supports the OPTIONAL specification of a parallel environment (a string) + the total number of processors.

‘parallel_env’ should contain a string of the form “host1 host2! hostgroupA! host3 host4” where the “!” symbol indicates the first execution host candidates. Other hosts are added only if the number of processors asked is more than those of the first execution host. See https://www-01.ibm.com/support/knowledgecenter/SSETD4_9.1.2/lsf_command_ref/bsub.1.dita?lang=en for more details about the parallel environment definition (the -m option of bsub).
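
To make the 'parallel_env' syntax concrete, the following sketch extracts the first-execution-host candidates (the entries flagged with '!'); the helper name is hypothetical:

```python
def first_execution_candidates(parallel_env: str) -> list[str]:
    """Return hosts/groups marked with '!' (first execution host candidates)."""
    return [tok.rstrip('!') for tok in parallel_env.split() if tok.endswith('!')]

print(first_execution_candidates('host1 host2! hostgroupA! host3 host4'))  # ['host2', 'hostgroupA']
```
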

__abstractmethods__ = frozenset({})#
__init__(**kwargs)[source]#

Initialize the job resources from the passed arguments (the valid keys can be obtained with the function self.get_valid_keys()).

Raises:
__module__ = 'aiida.schedulers.plugins.lsf'#
_abc_impl = <_abc._abc_data object>#
_default_fields = ('parallel_env', 'tot_num_mpiprocs', 'default_mpiprocs_per_machine', 'num_machines')#
classmethod accepts_default_mpiprocs_per_machine()[source]#

Return True if this JobResource accepts a ‘default_mpiprocs_per_machine’ key, False otherwise.

get_tot_num_mpiprocs()[source]#

Return the total number of CPUs of this job resource.

classmethod validate_resources(**kwargs)[source]#

Validate the resources against the job resource class of this scheduler.

Parameters:

kwargs – dictionary of values to define the job resources

Returns:

attribute dictionary with the parsed parameters populated

Raises:

ValueError – if the resources are invalid or incomplete

class aiida.schedulers.plugins.lsf.LsfScheduler[source]#

Bases: BashCliScheduler

Support for the IBM LSF scheduler

https://www-01.ibm.com/support/knowledgecenter/SSETD4_9.1.2/lsf_welcome.html

__abstractmethods__ = frozenset({})#
__module__ = 'aiida.schedulers.plugins.lsf'#
_abc_impl = <_abc._abc_data object>#
_features: dict[str, bool] = {'can_query_by_user': False}#
_get_detailed_job_info_command(job_id)[source]#

Return the command to run to get the detailed information on a job, even after the job has finished.

The output text is just retrieved, and returned for logging purposes.

_get_joblist_command(jobs=None, user=None)[source]#

The command to report full information on existing jobs.

Separates the fields with the _field_separator string. Order: jobnum, state, walltime, queue[=partition], user, numnodes, numcores, title.

_get_kill_command(jobid)[source]#

Return the command to kill the job with specified jobid.

_get_submit_command(submit_script)[source]#

Return the string to execute to submit a given script.

Parameters:

submit_script – the path of the submit script relative to the working directory. IMPORTANT: submit_script should already be escaped.

Return the submit script final part, using the parameters from the job_tmpl.

Parameters:

job_tmpl – a JobTemplate instance with relevant parameters set.

_get_submit_script_header(job_tmpl)[source]#

Return the submit script header, using the parameters from the job_tmpl. See the following manual https://www-01.ibm.com/support/knowledgecenter/SSETD4_9.1.2/lsf_command_ref/bsub.1.dita?lang=en for more details about the possible options to bsub, in particular for the parallel environment definition (with the -m option).

Parameters:

job_tmpl – a JobTemplate instance with relevant parameters set.

_job_resource_class#

alias of LsfJobResource

_joblist_fields = ['id', 'stat', 'exit_reason', 'exec_host', 'user', 'slots', 'max_req_proc', 'exec_host', 'queue', 'finish_time', 'start_time', '%complete', 'submit_time', 'name']#
_logger = <Logger aiida.scheduler.lsf (WARNING)>#
_parse_joblist_output(retval, stdout, stderr)[source]#

Parse the queue output string, as returned by executing the command returned by the _get_joblist_command method, which is here implemented as a list of lines, one for each job, with _field_separator as separator. The order is described in the _get_joblist_command function.

Return a list of JobInfo objects, one for each job, with the relevant parameters implemented.

Note

Depending on the scheduler configuration, finished jobs may either appear here, or not. This function will only return one element for each job found in the output; missing jobs (for whatever reason) simply will not appear here.

_parse_kill_output(retval, stdout, stderr)[source]#

Parse the output of the kill command.

Returns:

True if everything seems ok, False otherwise.

_parse_submit_output(retval, stdout, stderr)[source]#

Parse the output of the submit command, as returned by executing the command returned by the _get_submit_command method.

To be implemented by the plugin.

Return a string with the JobID.

_parse_time_string(string, fmt='%b %d %H:%M')[source]#

Parse a time string and return a datetime object. Example format: 'Feb 2 07:39' or 'Feb 2 07:39 L'.
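
The format string above can be exercised directly with the standard library. Note that '%b %d %H:%M' carries no year, so strptime falls back to 1900 (a sketch; the real method also deals with the trailing 'L' variant):

```python
from datetime import datetime

# Parse an LSF-style timestamp such as 'Feb 2 07:39'.
parsed = datetime.strptime('Feb 2 07:39', '%b %d %H:%M')
print(parsed.month, parsed.day, parsed.hour, parsed.minute)  # 2 2 7 39
print(parsed.year)  # 1900, since the format contains no year
```
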

Base classes for PBSPro and PBS/Torque plugins.

class aiida.schedulers.plugins.pbsbaseclasses.PbsBaseClass[source]#

Bases: BashCliScheduler

Base class with support for the PBSPro scheduler (http://www.pbsworks.com/) and for PBS/Torque (http://www.adaptivecomputing.com/products/open-source/torque/).

Only a few properties need to be redefined; see the pbspro and torque plugins for examples.

__abstractmethods__ = frozenset({})#
__module__ = 'aiida.schedulers.plugins.pbsbaseclasses'#
_abc_impl = <_abc._abc_data object>#
static _convert_time(string)[source]#

Convert a string in the format HH:MM:SS to a number of seconds.

_features: dict[str, bool] = {'can_query_by_user': False}#
_get_detailed_job_info_command(job_id)[source]#

Return the command to run to get the detailed information on a job, even after the job has finished.

The output text is just retrieved, and returned for logging purposes.

_get_joblist_command(jobs=None, user=None)[source]#

The command to report full information on existing jobs.

TODO: in the case of job arrays, decide what to do (i.e., whether we want to pass the -t option to list each subjob).

_get_kill_command(jobid)[source]#

Return the command to kill the job with specified jobid.

_get_resource_lines(num_machines, num_mpiprocs_per_machine, num_cores_per_machine, max_memory_kb, max_wallclock_seconds)[source]#

Return a list of lines (possibly empty) with the header lines relative to:

  • num_machines

  • num_mpiprocs_per_machine

  • num_cores_per_machine

  • max_memory_kb

  • max_wallclock_seconds

This is done in an external function because it may change in different subclasses.

_get_submit_command(submit_script)[source]#

Return the string to execute to submit a given script.

Args:#

submit_script: the path of the submit script relative to the working directory. IMPORTANT: submit_script should already be escaped.

_get_submit_script_header(job_tmpl)[source]#

Return the submit script header, using the parameters from the job_tmpl.

Args:#

job_tmpl: a JobTemplate instance with relevant parameters set.

TODO: truncate the title if too long

_job_resource_class#

alias of PbsJobResource

_map_status = {'B': JobState.RUNNING, 'C': JobState.DONE, 'E': JobState.RUNNING, 'F': JobState.DONE, 'H': JobState.QUEUED_HELD, 'M': JobState.UNDETERMINED, 'Q': JobState.QUEUED, 'R': JobState.RUNNING, 'S': JobState.SUSPENDED, 'T': JobState.QUEUED, 'U': JobState.SUSPENDED, 'W': JobState.QUEUED, 'X': JobState.DONE}#
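
The _map_status table can be read as plain data; here is a sketch using strings in place of aiida's JobState enum members:

```python
# PBS single-letter job states -> scheduler-agnostic states (strings stand in
# for the aiida.schedulers JobState enum used by the real _map_status).
PBS_STATE_MAP = {
    'B': 'RUNNING', 'C': 'DONE', 'E': 'RUNNING', 'F': 'DONE',
    'H': 'QUEUED_HELD', 'M': 'UNDETERMINED', 'Q': 'QUEUED', 'R': 'RUNNING',
    'S': 'SUSPENDED', 'T': 'QUEUED', 'U': 'SUSPENDED', 'W': 'QUEUED', 'X': 'DONE',
}

def map_pbs_state(letter: str) -> str:
    """Translate a raw qstat state letter, defaulting to UNDETERMINED."""
    return PBS_STATE_MAP.get(letter, 'UNDETERMINED')

print(map_pbs_state('R'))  # RUNNING
```
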
_parse_joblist_output(retval, stdout, stderr)[source]#

Parse the queue output string, as returned by executing the command returned by the _get_joblist_command method (qstat -f).

Return a list of JobInfo objects, one for each job, with the relevant parameters implemented.

Note

Depending on the scheduler configuration, finished jobs may either appear here, or not. This function will only return one element for each job found in the qstat output; missing jobs (for whatever reason) simply will not appear here.

_parse_kill_output(retval, stdout, stderr)[source]#

Parse the output of the kill command.

To be implemented by the plugin.

Returns:

True if everything seems ok, False otherwise.

_parse_submit_output(retval, stdout, stderr)[source]#

Parse the output of the submit command, as returned by executing the command returned by the _get_submit_command method.

To be implemented by the plugin.

Return a string with the JobID.

static _parse_time_string(string, fmt='%a %b %d %H:%M:%S %Y')[source]#

Parse a time string in the format returned from qstat -f and return a datetime object.

class aiida.schedulers.plugins.pbsbaseclasses.PbsJobResource(**kwargs)[source]#

Bases: NodeNumberJobResource

Class for PBS job resources.

__abstractmethods__ = frozenset({})#
__module__ = 'aiida.schedulers.plugins.pbsbaseclasses'#
_abc_impl = <_abc._abc_data object>#
num_cores_per_machine: int#
num_cores_per_mpiproc: int#
num_machines: int#
num_mpiprocs_per_machine: int#
classmethod validate_resources(**kwargs)[source]#

Validate the resources against the job resource class of this scheduler.

This extends the base class validator and calculates the num_cores_per_machine field to pass to PBS-like schedulers. Checks that num_cores_per_machine is a multiple of num_cores_per_mpiproc and/or num_mpiprocs_per_machine.

Parameters:

kwargs – dictionary of values to define the job resources

Returns:

attribute dictionary with the parsed parameters populated

Raises:

ValueError – if the resources are invalid or incomplete
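
The multiple-of check described above can be sketched as follows (a simplification of the real validator; the helper name is illustrative):

```python
def check_cores(num_cores_per_machine: int, num_cores_per_mpiproc: int) -> int:
    """Check that num_cores_per_machine is a multiple of num_cores_per_mpiproc
    and return the implied num_mpiprocs_per_machine."""
    if num_cores_per_machine % num_cores_per_mpiproc != 0:
        raise ValueError('num_cores_per_machine must be a multiple of num_cores_per_mpiproc')
    return num_cores_per_machine // num_cores_per_mpiproc

print(check_cores(16, 2))  # 8
```
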

Plugin for PBSPro. This has been tested on PBSPro v. 12.

class aiida.schedulers.plugins.pbspro.PbsproScheduler[source]#

Bases: PbsBaseClass

Subclass to support the PBSPro scheduler (http://www.pbsworks.com/).

I redefine only what needs to change from the base class

__abstractmethods__ = frozenset({})#
__module__ = 'aiida.schedulers.plugins.pbspro'#
_abc_impl = <_abc._abc_data object>#
_get_resource_lines(num_machines, num_mpiprocs_per_machine, num_cores_per_machine, max_memory_kb, max_wallclock_seconds)[source]#

Return the lines for machines, memory and wallclock relative to pbspro.

Plugin for SGE. This has been tested on GE 6.2u3.

Plugin originally written by Marco Dorigo. Email: marco(DOT)dorigo(AT)rub(DOT)de

class aiida.schedulers.plugins.sge.SgeJobResource(**kwargs)[source]#

Bases: ParEnvJobResource

__abstractmethods__ = frozenset({})#
__module__ = 'aiida.schedulers.plugins.sge'#
_abc_impl = <_abc._abc_data object>#
parallel_env: str#
tot_num_mpiprocs: int#
class aiida.schedulers.plugins.sge.SgeScheduler[source]#

Bases: BashCliScheduler

Support for the Sun Grid Engine scheduler and its variants/forks (Son of Grid Engine, Oracle Grid Engine, …)

__abstractmethods__ = frozenset({})#
__module__ = 'aiida.schedulers.plugins.sge'#
_abc_impl = <_abc._abc_data object>#
_features: dict[str, bool] = {'can_query_by_user': True}#
_get_detailed_job_info_command(job_id)[source]#

Return the command to run to get detailed information for a given job.

This is typically called after the job has finished, to retrieve the most detailed information possible about the job. This is done because most schedulers just make finished jobs disappear from the qstat command, and instead sometimes it is useful to know some more detailed information about the job exit status, etc.

Raises:

aiida.common.exceptions.FeatureNotAvailable

_get_joblist_command(jobs=None, user=None)[source]#

The command to report full information on existing jobs.

TODO: in the case of job arrays, decide what to do (i.e., whether we want to pass the -t option to list each subjob).

!!!ALL COPIED FROM PBSPRO!!! TODO: understand whether it is worth escaping the username, or rather leaving it unescaped to allow passing $USER

_get_kill_command(jobid)[source]#

Return the command to kill the job with specified jobid.

_get_submit_command(submit_script)[source]#

Return the string to execute to submit a given script.

Args:#

submit_script: the path of the submit script relative to the working directory. IMPORTANT: submit_script should already be escaped.

_get_submit_script_header(job_tmpl)[source]#

Return the submit script header, using the parameters from the job_tmpl.

Args:#

job_tmpl: a JobTemplate instance with relevant parameters set.

TODO: truncate the title if too long

_job_resource_class#

alias of SgeJobResource

_logger = <Logger aiida.scheduler.sge (WARNING)>#
_parse_joblist_output(retval, stdout, stderr)[source]#

Parse the joblist output as returned by executing the command returned by the _get_joblist_command method.

Returns:

list of JobInfo objects, one for each job, each with at least its default parameters implemented.

_parse_kill_output(retval, stdout, stderr)[source]#

Parse the output of the kill command.

To be implemented by the plugin.

Returns:

True if everything seems ok, False otherwise.

_parse_submit_output(retval, stdout, stderr)[source]#

Parse the output of the submit command, as returned by executing the command returned by the _get_submit_command method.

To be implemented by the plugin.

Return a string with the JobID.

_parse_time_string(string, fmt='%Y-%m-%dT%H:%M:%S')[source]#

Parse a time string in the format returned from qstat -xml -ext and return a datetime object. Example format: 2013-06-13T11:53:11

Plugin for SLURM. This has been tested on SLURM 14.03.7 on the CSCS.ch machines.

class aiida.schedulers.plugins.slurm.SlurmJobResource(**kwargs)[source]#

Bases: NodeNumberJobResource

Class for SLURM job resources.

__abstractmethods__ = frozenset({})#
__module__ = 'aiida.schedulers.plugins.slurm'#
_abc_impl = <_abc._abc_data object>#
num_cores_per_machine: int#
num_cores_per_mpiproc: int#
num_machines: int#
num_mpiprocs_per_machine: int#
classmethod validate_resources(**kwargs)[source]#

Validate the resources against the job resource class of this scheduler.

This extends the base class validator to check that the num_cores_per_machine are a multiple of num_cores_per_mpiproc and/or num_mpiprocs_per_machine.

Parameters:

kwargs – dictionary of values to define the job resources

Returns:

attribute dictionary with the parsed parameters populated

Raises:

ValueError – if the resources are invalid or incomplete

class aiida.schedulers.plugins.slurm.SlurmScheduler[source]#

Bases: BashCliScheduler

Support for the SLURM scheduler (http://slurm.schedmd.com/).

__abstractmethods__ = frozenset({})#
__module__ = 'aiida.schedulers.plugins.slurm'#
_abc_impl = <_abc._abc_data object>#
_convert_time(string)[source]#

Convert a string in the format DD-HH:MM:SS to a number of seconds.
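
A sketch of the DD-HH:MM:SS conversion (the real method also accepts SLURM's shorter time forms):

```python
def slurm_time_to_seconds(value: str) -> int:
    """Convert 'DD-HH:MM:SS' (days optional) to a number of seconds."""
    days = 0
    if '-' in value:
        day_part, value = value.split('-', 1)
        days = int(day_part)
    hours, minutes, seconds = (int(part) for part in value.split(':'))
    return (days * 24 + hours) * 3600 + minutes * 60 + seconds

print(slurm_time_to_seconds('1-02:03:04'))  # 93784
```
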

_features: dict[str, bool] = {'can_query_by_user': False}#
_get_detailed_job_info_command(job_id)[source]#

Return the command to run to get the detailed information on a job, even after the job has finished.

The output text is just retrieved, and returned for logging purposes. The --parsable option separates the fields with a pipe (|), adding a pipe also at the end.

_get_joblist_command(jobs=None, user=None)[source]#

The command to report full information on existing jobs.

Separate the fields with the _field_separator string. Order: jobnum, state, walltime, queue[=partition], user, numnodes, numcores, title.

_get_kill_command(jobid)[source]#

Return the command to kill the job with specified jobid.

_get_submit_command(submit_script)[source]#

Return the string to execute to submit a given script.

Args:#

submit_script: the path of the submit script relative to the working directory. IMPORTANT: submit_script should already be escaped.

_get_submit_script_header(job_tmpl)[source]#

Return the submit script header, using the parameters from the job_tmpl.

Args:#

job_tmpl: a JobTemplate instance with relevant parameters set.

TODO: truncate the title if too long

_job_resource_class#

alias of SlurmJobResource

_logger = <Logger aiida.scheduler.slurm (WARNING)>#
_parse_joblist_output(retval, stdout, stderr)[source]#

Parse the queue output string, as returned by executing the command returned by the _get_joblist_command method, which is here implemented as a list of lines, one for each job, with _field_separator as separator. The order is described in the _get_joblist_command function.

Return a list of JobInfo objects, one for each job, with the relevant parameters implemented.

Note

Depending on the scheduler configuration, finished jobs may either appear here, or not. This function will only return one element for each job found in the output; missing jobs (for whatever reason) simply will not appear here.

_parse_kill_output(retval, stdout, stderr)[source]#

Parse the output of the kill command.

To be implemented by the plugin.

Returns:

True if everything seems ok, False otherwise.

_parse_submit_output(retval, stdout, stderr)[source]#

Parse the output of the submit command, as returned by executing the command returned by the _get_submit_command method.

To be implemented by the plugin.

Return a string with the JobID.

_parse_time_string(string, fmt='%Y-%m-%dT%H:%M:%S')[source]#

Parse a time string in the format returned from qstat -f and return a datetime object.

fields = [('%i', 'job_id'), ('%t', 'state_raw'), ('%r', 'annotation'), ('%B', 'executing_host'), ('%u', 'username'), ('%D', 'number_nodes'), ('%C', 'number_cpus'), ('%R', 'allocated_machines'), ('%P', 'partition'), ('%l', 'time_limit'), ('%M', 'time_used'), ('%S', 'dispatch_time'), ('%j', 'job_name'), ('%V', 'submission_time')]#
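
The fields table drives both the squeue format string and the parsing of each output line; a sketch of how the two sides fit together (the '|' separator and the sample line are illustrative, the plugin's _field_separator may differ):

```python
# Abridged version of the plugin's fields table: (squeue format code, name).
fields = [('%i', 'job_id'), ('%t', 'state_raw'), ('%u', 'username'),
          ('%D', 'number_nodes'), ('%C', 'number_cpus')]

separator = '|'  # stand-in for the plugin's _field_separator
format_string = separator.join(fmt for fmt, _ in fields)  # passed to squeue -o

def parse_line(line: str) -> dict[str, str]:
    """Zip one squeue output line back onto the field names."""
    return dict(zip((name for _, name in fields), line.split(separator)))

print(format_string)  # %i|%t|%u|%D|%C
print(parse_line('12345|R|alice|2|32'))
```
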
parse_output(detailed_job_info=None, stdout=None, stderr=None)[source]#

Parse the output of the scheduler.

Parameters:
  • detailed_job_info – dictionary with the output returned by the Scheduler.get_detailed_job_info command. This should contain the keys retval, stdout and stderr corresponding to the return value, stdout and stderr returned by the accounting command executed for a specific job id.

  • stdout – string with the output written by the scheduler to stdout.

  • stderr – string with the output written by the scheduler to stderr.

Returns:

None or an instance of aiida.engine.processes.exit_code.ExitCode.

Raises:

TypeError or ValueError – if the passed arguments have incorrect type or value.

Plugin for PBS/Torque. This has been tested on Torque v.2.4.16 (from Ubuntu).

class aiida.schedulers.plugins.torque.TorqueScheduler[source]#

Bases: PbsBaseClass

Subclass to support the Torque scheduler.

I redefine only what needs to change from the base class

__abstractmethods__ = frozenset({})#
__module__ = 'aiida.schedulers.plugins.torque'#
_abc_impl = <_abc._abc_data object>#
_get_resource_lines(num_machines, num_mpiprocs_per_machine, num_cores_per_machine, max_memory_kb, max_wallclock_seconds)[source]#

Return the lines for machines, memory and wallclock relative to Torque.