_images/AiiDA_transparent_logo.png

Automated Interactive Infrastructure and Database for Computational Science

Welcome to AiiDA’s documentation!

AiiDA is a sophisticated framework designed from scratch to be a flexible and scalable infrastructure for computational science. It stores the full data provenance of each simulation in a tailored database built for efficient data mining, and it lets the user interact seamlessly with any number of HPC machines and codes thanks to its flexible plugin interface and a powerful workflow engine for the automation of simulations.

The software is available at http://www.aiida.net.

If you use AiiDA for your research, please cite the following work:

Giovanni Pizzi, Andrea Cepellotti, Riccardo Sabatini, Nicola Marzari, and Boris Kozinsky, AiiDA: automated interactive infrastructure and database for computational science, Comp. Mat. Sci 111, 218-230 (2016); http://dx.doi.org/10.1016/j.commatsci.2015.09.013; http://www.aiida.net.

This is the documentation of the AiiDA framework. For the first setup, configuration and usage, refer to the user’s guide below.

If, instead, you plan to add new plugins, or you simply want to understand AiiDA internals, refer to the developer’s guide.

User’s guide

User’s guide

Installation

Quick install

This section of the manual will guide you through the process of installing AiiDA on your system as quickly as possible. For more detailed instructions and explanations, refer to the later sections. The installation procedure can generally be split into three separate steps:

  1. Install prerequisite software
  2. Install AiiDA and its python dependencies
  3. Configure the AiiDA installation
Install prerequisite software

The installation procedure itself requires certain software, which therefore will have to be installed first. The following software is required to continue with the installation:

  • git (To download the aiida package)
  • python-2.7.x (The programming language used for AiiDA)
  • python-pip (Python package manager)
  • virtualenv (Software to create a virtual python environment to install AiiDA in)
  • postgresql (Database software version 9.4 or higher)

Installation instructions will depend on your system. For Ubuntu and other Debian-derived distributions you can use:

$ sudo apt-get install git python2.7-dev python-pip virtualenv postgresql postgresql-server-dev-all postgresql-client

For MacOS X using Homebrew as the package manager:

$ brew install git python postgresql
$ pg_ctl -D /usr/local/var/postgres start

For a more detailed description of database requirements and usage see section database.

Install AiiDA and its python dependencies

With the prerequisites installed, we can now download AiiDA itself and install it along with all its python dependencies. Create a directory where you want to install AiiDA and clone the repository:

$ mkdir <your_directory>
$ cd <your_directory>
$ git clone https://github.com/aiidateam/aiida_core

To prevent the python packages that AiiDA depends on from clashing with the packages already installed on your system, we will install them in a virtual environment. For detailed information, see the section on virtual environments. To create a new virtual environment and activate it, run the following commands:

$ virtualenv ~/aiidapy
$ source ~/aiidapy/bin/activate

This will create a directory in your home directory named aiidapy where all the packages will be installed. After activation, your prompt should have (aiidapy) in front of it, indicating that you are working in the virtual environment.

Note

You may need to install pip and setuptools in your virtual environment in case the system or user version of these tools is old:

$ pip install -U setuptools pip

Finally, to install AiiDA, run the following command from the directory where you cloned the repository:

(aiidapy) $ pip install -e aiida_core

(In this example the AiiDA directory is in aiida_core)

There are additional optional packages that you may want to install, which are grouped in the following categories:

  • atomic_tools: packages that allow importing and manipulating crystal structure from various formats
  • ssh_kerberos: adds support for ssh transport authentication through Kerberos
  • REST: allows a REST server to be run locally to serve AiiDA data
  • docs: tools to build the documentation
  • advanced_plotting: tools for advanced plotting
  • notebook: jupyter notebook - to allow it to import AiiDA modules
  • testing: python modules required to run the automatic unit tests

In order to install any of these package groups, simply append them as a comma-separated list to the pip install command:

(aiidapy) $ pip install -e aiida_core[atomic_tools,docs,advanced_plotting]

Note

If you are installing the optional ssh_kerberos and you are on Ubuntu you might encounter an error related to the gss package. To fix this you need to install the libffi-dev and libkrb5-dev packages:

sudo apt-get install libffi-dev libkrb5-dev
Configure the AiiDA installation

After successful installation, AiiDA needs to be configured, i.e. a profile must be set up and a database created, which can be done through AiiDA’s command line interface verdi. For a fast, default setup use verdi quicksetup; for greater control use verdi setup (see verdi setup). Here we will use the quicksetup by executing:

(aiidapy) $ verdi quicksetup

You will be asked for your user information. Be aware that this information will be associated with your data if you decide later to share it. Alternatively, you can provide your information as command line options (use verdi quicksetup --help for a list of options).

Note

verdi setup used to be called verdi install, but the new name better reflects the command’s purpose.

Congratulations, you should now have a working installation of AiiDA. You can verify that the installation was successful by running:

$ verdi profile list

This should list the profile that was just created by the quicksetup:

> quicksetup (DEFAULT) (DAEMON PROFILE)

Remember that in order to work with AiiDA, for example through the verdi command, you need to be in your virtual environment. If you open a new terminal, be sure to activate it first with:

$ source ~/aiidapy/bin/activate

At this point, you can choose to read on for additional installation details and configuration options, or you can choose to start using AiiDA and go straight to the section Get started.

Installing plugins

The plugins available for AiiDA are listed on the AiiDA homepage

For a plugin aiida-plugin-template hosted on PyPI, simply do:

pip install aiida-plugin-template
reentry scan -r aiida   # notify aiida of new entry points

In case there is no PyPI package available, you can install the plugin from the python source, e.g.:

git clone https://github.com/aiidateam/aiida-plugin-template
pip install aiida-plugin-template
reentry scan -r aiida
Background

What does pip install aiida-plugin-template do?

  • resolves and installs the dependencies on other python packages as specified in setup.py
  • creates a folder aiida_plugin_template.egg-info/ with metadata about the package
  • if the -e option is given, creates a symbolic link from the python package search path to the aiida-plugin-template directory and puts the .egg-info folder there. Changes to the source code will be picked up by python without reinstalling, but changes to the metadata in setup.json will not.

For further details, see the Python packaging user guide.
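
For development work on a plugin, an editable install (the -e option mentioned above) of the same example plugin might look like this, assuming a local clone:

git clone https://github.com/aiidateam/aiida-plugin-template
pip install -e aiida-plugin-template   # creates a link instead of copying the package
reentry scan -r aiida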

Optional configuration
Verdi tab-completion

The verdi command line tool has many commands and options. To simplify its usage, there is a way to enable tab-completion for it in your bash shell. To do so, simply run the following command:

$ verdi completioncommand

and append the result to the activation script of your virtual environment (or to your bash config, e.g. .bashrc). Alternatively, you can accomplish the same by simply adding the following line to the activation script:

eval "$(verdi completioncommand)"

For the changes to apply to your current shell, make sure to source the activation script or .bashrc (depending on the approach you chose).
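
For example, to enable tab-completion permanently via your bash configuration (assuming you use ~/.bashrc), you can append the line and re-source the file:

$ echo 'eval "$(verdi completioncommand)"' >> ~/.bashrc
$ source ~/.bashrc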

Adding AiiDA to the PATH

If you used a virtual environment for the installation of AiiDA, the required commands such as verdi should have been added automatically to your PATH. Otherwise, you may have to add the install directory of AiiDA manually to your PATH so that the binaries are found.

For Linux systems, the path to add is usually ~/.local/bin:

export PATH=~/.local/bin:${PATH}

For Mac OS X systems, the path to add is usually ~/Library/Python/2.7/bin:

export PATH=~/Library/Python/2.7/bin:${PATH}

To verify that this is the correct path to add, navigate to this location; you should find the executables supervisord or celeryd in the directory.

After updating your PATH you can check if it worked in the following way:

  • type verdi on your terminal, and check if the program starts (it should provide a list of valid commands). If it doesn’t, check if you correctly set up the PATH environment variable above.

  • go into your home folder, or any other folder different from the AiiDA folder, run python or ipython and try to import a module, e.g. by typing:

    import aiida
    

    If the setup is ok, you shouldn’t get any error. If you do get an ImportError instead, check that you are in the correct virtual environment. If you did not install AiiDA within a virtual environment, you will have to set up the PYTHONPATH environment variable in your .bashrc:

    export PYTHONPATH="${PYTHONPATH}:<AiiDA_folder>"
    
Customizing the configuration directory location

By default, the AiiDA configuration is stored in the directory ~/.aiida. This can be changed by setting the AIIDA_PATH environment variable. The value of AIIDA_PATH can be a colon-separated list of paths. For each of the paths in the list, AiiDA will look for a .aiida directory in the given path and all of its parent folders. If no .aiida directory is found, ~/.aiida will be used.

For example, the directory structure in your home might look like this

.
├── .aiida
├── project_a
│   ├── .aiida
│   └── subfolder
└── project_b
    └── .aiida

If you set

export AIIDA_PATH='~/project_a:~/project_b'

the configuration directory used will be ~/project_a/.aiida. The same is true if you set AIIDA_PATH='~/project_a/subfolder', because subfolder itself does not contain a .aiida folder, so AiiDA will check its parent directories.

If you set AIIDA_PATH='.', the configuration directory used depends on the current working directory. Inside the project_a and project_b directories, their respective .aiida directory will be used. Outside of these directories, ~/.aiida is used.

This option can be useful, for example, when two different AiiDA versions are used simultaneously. Using two different .aiida directories also allows running two daemons concurrently. Note, however, that this option does not change the database cluster that is being used, so by default you still need to take care that the database names do not clash.

Using AiiDA in Jupyter

Jupyter is an open-source web application that allows you to create in-browser notebooks containing live code, visualizations and formatted text.

Originally born out of the IPython project, it now supports code written in many languages and customized IPython kernels.

If you did not already install AiiDA with the [notebook] option (during pip install), run pip install jupyter inside the virtualenv. Then, from within the virtualenv, run:

$ jupyter notebook

This will open a tab in your browser. Click on New -> Python 2 and type:

import aiida

followed by Shift-Enter. If no exception is thrown, you can use AiiDA in Jupyter.

If you want to set the same environment as in a verdi shell, add the following code in <your.home.folder>/.ipython/profile_default/ipython_config.py:

try:
    import aiida
except ImportError:
    pass
else:
    c = get_config()
    c.InteractiveShellApp.extensions = [
          'aiida.common.ipython.ipython_magics'
    ]

then open a Jupyter notebook as explained above and type in a cell:

%aiida

followed by Shift-Enter. You should receive the message “Loaded AiiDA DB environment.”

Virtual environment
Why a virtual environment?

AiiDA depends on third party python packages and very often on specific versions of those packages. If AiiDA were installed system wide, it might up- or downgrade third party packages used by other parts of the system and potentially leave them broken. Conversely, if an incompatible version of a package is later installed, AiiDA itself will break.

In short, installing AiiDA might interfere with installed python packages and installing other packages might interfere with AiiDA. Since your scientific data is important to you and to us, we strongly recommend isolating AiiDA in what is called a virtual environment.

For a single purpose machine, only meant to run AiiDA and nothing else, you may at your own risk opt to omit working in a virtual environment. In this case, you may want to install AiiDA and its dependencies in user space by using a --user flag, to avoid the need for administrative rights to install them system wide.
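
For example, a user-space install from the directory where you cloned the repository (see the quick install section) might look like this:

$ cd <your_directory>
$ pip install --user aiida_core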

What is a virtual environment?

A python virtual environment is essentially a folder that contains everything needed to run python programs, including:

  • python executable
  • python standard packages
  • package managers such as pip
  • an activation script that sets the PYTHONPATH and PATH variables

The python executable might be a link to an executable elsewhere, depending on the way the environment is created. The activation script ensures that the python executable of the virtualenv is the first in PATH, and that python programs have access only to packages installed inside the virtualenv (unless specified otherwise during creation). This provides an isolated environment for programs that rely on a specific version of python or specific versions of third party python packages.

A virtual environment, as well as the packages installed within it, is typically placed in the user’s home space so that administrative rights are not required, which also makes this technique very useful on machines where one has restricted access.

Creating a virtual environment

There are different programs that can create and work with virtual environments. One example for python virtual environments is virtualenv, which can be installed, for example with pip, by running:

$ pip install --user -U virtualenv

As explained before, a virtual environment is in essence little more than a directory containing everything it needs. In principle a virtual environment can thus be created anywhere where you can create a directory. You could for example opt to create a directory for all your virtual environments in your home folder:

$ mkdir ~/.virtualenvs

Using virtualenv you can then create a new virtual environment with python 2.7 by running:

$ virtualenv --python=<path/to/python2.7> ~/.virtualenvs/my_env

This will create the environment my_env. To activate it (for example after opening a new terminal, or after having deactivated it), run:

$ source ~/.virtualenvs/my_env/bin/activate

If it is activated successfully, you should see that your prompt is prefixed with the name of the environment:

(my_env) $

To leave or deactivate the environment and set all the settings back to default, simply run:

(my_env) $ deactivate
Creating a .aiida folder in your virtualenvironment

When you run AiiDA in multiple virtual environments, it can be convenient to use a separate .aiida folder for each virtualenv. To do this, you can use the AIIDA_PATH mechanism as follows:

  1. Create your virtualenv, as described above

  2. Create a .aiida directory in your virtualenv directory:

    $ mkdir ~/.virtualenvs/my_env/.aiida
    
  3. At the end of ~/.virtualenvs/my_env/bin/activate, add the following line:

    export AIIDA_PATH='~/.virtualenvs/my_env'
    
  4. Deactivate and re-activate the virtualenv

  5. You can test that everything is set up correctly if you can reproduce the following:

    (my_env)$ echo $AIIDA_PATH
    >>> ~/.virtualenvs/my_env
    
    (my_env)$ verdi profile list
    >>> Configuration folder: /home/my_username/.virtualenvs/my_env/.aiida
    >>> Stopping: No configuration file found
    >>> Note: if no configuration file was found, it means that you have not run
    >>> 'verdi setup' yet to configure at least one AiiDA profile.
    
  6. Continue setting up AiiDA with verdi setup or verdi quicksetup.

Database

AiiDA needs a database backend to store the nodes, node attributes and other information, allowing the end user to perform very fast queries of the results. Currently, only PostgreSQL is supported as a database backend.

Setup instructions

In order for AiiDA to be able to use PostgreSQL, it needs to be installed first. On Ubuntu and other Debian-derived distributions this can be accomplished with:

$ sudo apt-get install postgresql postgresql-server-dev-all postgresql-client

For Mac OS X, binary packages can be downloaded from the official website of postgresql or you can use brew:

$ brew install postgresql
$ pg_ctl -D /usr/local/var/postgres start

To manually create a database for AiiDA, to be used later in the configuration with verdi setup, follow these instructions. First, you need to run the program psql to interact with PostgreSQL, and you have to do so as the postgres user that was created when installing the software. To assume the role of postgres, run as root:

$ su - postgres

and launch psql:

$ psql

Create a new database user account for AiiDA by running:

CREATE USER aiida WITH PASSWORD '<password>';

replacing <password> with a password of your choice. Make sure to remember it, as you will need it again when you configure AiiDA to use this database through verdi setup. If you want to change the password you just created use the command:

ALTER USER aiida PASSWORD '<password>';

Next we create the database itself:

CREATE DATABASE aiidadb OWNER aiida;

and grant all privileges on this DB to the previously-created aiida user:

GRANT ALL PRIVILEGES ON DATABASE aiidadb to aiida;

You have now created a database for AiiDA and you can close the postgres shell by typing \q. To test if the database was created successfully, you can run the following command as a regular user in a bash terminal:

$ psql -h localhost -d aiidadb -U aiida -W

and type the password you inserted before, when prompted. If everything worked well, you should get no error and see the prompt of the psql shell.

If you used the same names as in the example commands above, you should use the following parameters during the verdi setup phase to connect to the database you just created:

Database engine: postgresql_psycopg2
PostgreSQL host: localhost
PostgreSQL port: 5432
AiiDA Database name: aiidadb
AiiDA Database user: aiida
AiiDA Database password: <password>

Note

Do not forget to backup your database (instructions here).

Note

If you want to move the physical location of the data files on your hard drive AFTER it has been created and filled, look at the instructions here.

Note

Due to the presence of a bug, PostgreSQL could refuse to restart after a crash, or after a restore from binary backup. The workaround given below is adapted from here. The error message would be something like:

* Starting PostgreSQL 9.1 database server
* The PostgreSQL server failed to start. Please check the log output:
2015-05-26 03:27:20 UTC [331-1] LOG:  database system was interrupted; last known up at 2015-05-21 19:56:58 UTC
2015-05-26 03:27:20 UTC [331-2] FATAL:  could not open file "/etc/ssl/certs/ssl-cert-snakeoil.pem": Permission denied
2015-05-26 03:27:20 UTC [330-1] LOG:  startup process (PID 331) exited with exit code 1
2015-05-26 03:27:20 UTC [330-2] LOG:  aborting startup due to startup process failure

If this happens, you should change the permissions on any symlinked files so that they are writable by the postgres user. For example, on Ubuntu, with PostgreSQL 9.1, the following should work (WARNING: make sure these configuration files are symbolic links before executing these commands! If someone has customized the server.crt or server.key file, you can erase them by following these steps. It’s a good idea to make a backup of the server.crt and server.key files before removing them):

(as root)
# go to PGDATA directory
cd /var/lib/postgresql/9.1/main
ls -l server.crt server.key
# confirm both of those files are symbolic links
# to files in /etc/ssl before going further
# remove symlinks to SSL certs
rm server.crt
rm server.key
# copy the SSL certs to the local directory
cp /etc/ssl/certs/ssl-cert-snakeoil.pem server.crt
cp /etc/ssl/private/ssl-cert-snakeoil.key server.key
# set permissions on ssl certs
# and postgres ownership on everything else
# just in case
chown postgres *
chmod 640 server.crt server.key

service postgresql start
Verdi setup

The quick install section detailed how verdi quicksetup can be used to quickly setup AiiDA by creating a profile and a database for you. If you want more control over this process, for example if you want to use a database that you created yourself, you can use verdi setup:

$ verdi setup <profile_name>

or equivalently:

$ verdi -p <profile_name> setup

The same commands can also be used to edit already existing profiles. The verdi setup command will guide you through the setup process through a series of prompts.

The first thing you will be asked is the timezone, which is extremely important to get correct dates and times for your calculations.

AiiDA will do its best to try and detect the local timezone (if properly configured on your machine), and will suggest a set of sensible values. Choose the timezone that best fits you (that is, the nearest city in your timezone; for Lausanne, for instance, we choose Europe/Zurich) and type it at the prompt.

The second parameter requested during the verdi setup phase is the “Default user email”. We suggest using your institutional email, which will be used to associate the calculations with you.

Note

In AiiDA, the user email is used as username, and also as unique identifier when importing/exporting data from AiiDA.

Note

Even if you choose an email different from the default one (aiida@localhost), a user with email aiida@localhost will be set up, with its password set to None (disabling access for this user via the API or web interface).

The existence of a default user is internally useful for multi-user setups, where only one user runs the daemon, even if many users can simultaneously access the DB. See the page on setting up AiiDA in multi-user mode for more details (only for advanced users).

Note

The password, in the current version of AiiDA, is not used (it will be used only in the REST API and in the web interface). If you leave the field empty, no password will be set and no access will be granted to the user via the REST API and the web interface.

Then, the following prompts will help you configure the database. Typical settings are:

Insert your timezone: Europe/Zurich
Default user email: richard.wagner@leipzig.de
Database engine: postgresql_psycopg2
PostgreSQL host: localhost
PostgreSQL port: 5432
AiiDA Database name: aiida_dev
AiiDA Database user: aiida
AiiDA Database password: <password>
AiiDA repository directory: /home/wagner/.aiida/repository/
[...]
Configuring a new user with email 'richard.wagner@leipzig.de'
First name: Richard
Last name: Wagner
Institution: BRUHL, LEIPZIG
The user has no password, do you want to set one? [y/N] y
Insert the new password:
Insert the new password (again):
Installation requirements

Read on for more information about the kind of operating system AiiDA can run on and what software needs to be installed before AiiDA can work.

Supported architecture

AiiDA is tested to run on:

  • Mac OS X
  • Ubuntu 14.04 & 16.04

AiiDA should run on:

  • Older / newer Ubuntu versions
  • Other Linux distributions
Troubleshooting
  • On a clean Ubuntu 16.04 install the pip install command pip install -e aiida_core may fail due to a problem with dependencies on the numpy package. In this case you may be presented with a message like the following:

    from numpy.distutils.misc_util import get_numpy_include_dirs
    ImportError: No module named numpy.distutils.misc_util
    

    To fix this, simply install numpy individually through pip in your virtual env, i.e.:

    pip install numpy
    

    followed by executing the original install command once more:

    pip install -e .
    

    This should fix the dependency error.

  • If the pip install command gives you an error that resembles the one shown below, you might need to downgrade to an older version of pip:

    Cannot fetch index base URL https://pypi.python.org/simple/
    

    To downgrade pip, use the following command:

    sudo easy_install pip==1.2.1
    
  • In order to use the AiiDA objects and functions in Jupyter, the latter has to be instructed to use the IPython kernel installed in the AiiDA virtual environment. This happens by default if you install AiiDA with pip including the notebook option, and run Jupyter from the AiiDA virtual environment.

    If, for any reason, you do not want to install Jupyter in the virtual environment, you can install it outside the virtual environment, if not already done:

    $ pip install jupyter
    

    Then, activate the AiiDA virtual environment:

    $ source ~/<aiida.virtualenv>/bin/activate
    

    and set up the AiiDA IPython kernel:

    $ pip install ipykernel
    $ python -m ipykernel install --user --name=<aiida.kernel.name>
    

    where you have chosen a meaningful name for the new kernel.

    Finally, start a Jupyter server:

    $ jupyter notebook
    

    and from the newly opened browser tab select New -> <aiida.kernel.name>

  • When installing the ssh_kerberos optional requirement through Anaconda you may encounter the following error on Ubuntu machines:

    version 'GFORTRAN_1.4' not found (required by /usr/lib/libblas.so.3)
    

    This is related to an open issue in anaconda ContinuumIO/anaconda-issues#686. A potential solution is to run the following command:

    export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libgfortran.so.3
    
  • Several users reported the need to also install libpq-dev (header files for libpq5, the PostgreSQL library):

    apt-get install libpq-dev
    

    But under Ubuntu 12.04 this is not needed.

  • If the installation fails while installing the packages related to the database, you may have not installed or set up the database libraries.

    In particular, on Mac OS X, if you installed the binary package of PostgreSQL, it is possible that the PATH environment variable is not set correctly, and you get an “Error: pg_config executable not found.” error. In this case, discover where the binary is located, then add a line to your ~/.bashrc file similar to the following:

    export PATH=/the/path/to/the/pg_config/file:${PATH}
    

    and then open a new bash shell. Some possible paths can be found at this Stackoverflow link and a non-exhaustive list of possible paths is the following (version number may change):

    • /Applications/Postgres93.app/Contents/MacOS/bin
    • /Applications/Postgres.app/Contents/Versions/9.3/bin
    • /Library/PostgreSQL/9.3/bin/pg_config

    Similarly, if the package installs but then errors occur during the first use of AiiDA (with Symbol not found errors or similar), you may need to point to the path where the dynamic libraries are. A way to do this is to add a line similar to the following to your ~/.bashrc and then open a new shell:

    export DYLD_FALLBACK_LIBRARY_PATH=/Library/PostgreSQL/9.3/lib:$DYLD_FALLBACK_LIBRARY_PATH
    

    (you should of course adapt the path to the PostgreSQL libraries).

  • For some reason, on some machines (notably often on Mac OS X) there is no default locale defined, and when you run verdi setup for the first time it fails (see also this issue of django). Run in your terminal (or, even better, add to your .bashrc, but then remember to open a new shell window!):

    export LANG="en_US.UTF-8"
    export LC_ALL="en_US.UTF-8"
    

    and then run verdi setup again.

  • Within a virtual environment, an attempt to visualize a structure with ase (either from the shell, or using the command verdi data structure show --format=ase <PK>) might end with the following error message:

    ImportError: No module named pygtk
    

    The issue is that pygtk is currently not pip-installable. One has to install it separately and create the appropriate bindings manually in the virtual environment. You can follow this procedure to get around the issue:

    • Install the python-gtk2 package. Under Ubuntu, do:

      sudo apt-get install python-gtk2
      
    • Create the lib/python2.7/dist-packages folder within your virtual environment:

      mkdir <AIIDA_VENV_FOLDER>/lib/python2.7/dist-packages
      chmod 755 <AIIDA_VENV_FOLDER>/lib/python2.7/dist-packages
      

      where <AIIDA_VENV_FOLDER> is the virtual environment folder you have created during the installation process.

    • Create several symbolic links from this folder, pointing to a number of files in /usr/lib/python2.7/dist-packages/:

      cd <AIIDA_VENV_FOLDER>/lib/python2.7/dist-packages
      ln -s /usr/lib/python2.7/dist-packages/glib glib
      ln -s /usr/lib/python2.7/dist-packages/gobject gobject
      ln -s /usr/lib/python2.7/dist-packages/gtk-2.0 gtk-2.0
      ln -s /usr/lib/python2.7/dist-packages/pygtk.pth pygtk.pth
      ln -s /usr/lib/python2.7/dist-packages/pygtk.py pygtk.py
      ln -s /usr/lib/python2.7/dist-packages/cairo cairo
      

    After that, verdi data structure show --format=ase <PK> should work.

  • [Only for developers] The developer tests of the SSH transport plugin are performed connecting to localhost. The tests will fail if a passwordless ssh connection is not set up. Therefore, if you want to run the tests:

    • make sure you have an ssh server. On Ubuntu, for instance, you can install it using:

      sudo apt-get install openssh-server
      
    • Configure an ssh key for your user on your machine, and then add your public key to the authorized keys of localhost. The easiest way to achieve this is to run:

      ssh-copy-id localhost
      

      (it will ask for your password, because it connects via ssh to localhost to install your public key inside ~/.ssh/authorized_keys).
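
      You can then verify that the passwordless connection works; the following should print the message without asking for a password:

      ssh localhost 'echo Connection ok'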

Updating AiiDA

Before you update your AiiDA installation, first make sure that you do the following:

  • Stop your daemon by executing verdi daemon stop
  • Create a backup of your database(s) by following the guidelines in the backup section
  • Create a backup of the ~/.aiida folder (where configuration files are stored)

If you have installed AiiDA manually from a local clone of the aiida_core repository, skip to the instructions for developers. Otherwise, if you have installed AiiDA through pip, you can also update your installation through pip. If you installed aiida_core in a virtual environment, make sure to load it first. Now you are ready to update your AiiDA installation through pip:

pip install --upgrade aiida_core

After upgrading your AiiDA installation you may have to perform version specific migrations. When all necessary migrations are completed, finalize the update by executing:

verdi setup

This updates your daemon profile and related files. Do not do this if another version of AiiDA is meant to be used productively on the same machine and user account.

Updating AiiDA for developers

After you have performed all the steps in the checklist described in the previous section, go to your local clone of the aiida_core repository and check out the desired branch or tag. If you installed aiida_core in a virtual environment, make sure that you have loaded it.
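
For example, assuming your clone lives in the folder aiida_core and you want to check out the (illustrative) tag v0.10.0:

cd aiida_core
git fetch --all --tags
git checkout v0.10.0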

Each version increase may come with its own necessary migrations, and you should only ever update by one version at a time. Therefore, first make sure you know the version number of the currently installed version, by opening a verdi shell and typing:

import aiida
aiida.__version__

Now you can install the updated version of aiida_core by simply executing:

pip install -e .

After upgrading your AiiDA installation you may have to perform version specific migrations based on the version of your previous installation. When all necessary migrations are completed, finalize the update by executing:

verdi setup

This updates your daemon profile and related files.

Note

A few general remarks:

  • If you want to update the code in the same folder, but modified some files locally, you can stash them (git stash) before cloning or pulling the new code. Then put them back with git stash pop (note that conflicts might appear).
  • If you encounter any problems and/or inconsistencies, delete any .pyc files that may have remained from the previous version. E.g. If you are in your AiiDA folder you can type find . -name "*.pyc" -type f -delete.

Note

Since AiiDA 0.9.0, we use Alembic for the database migrations of the SQLAlchemy backend. In case you were using SQLAlchemy before the introduction of Alembic, you may experience problems during your first migration. If this is the case, please have a look at the section Debugging Alembic.

Version migration instructions
Updating from 0.9.* to 0.10.0

Multiple things have changed in AiiDA v0.10.0 that require some manual attention when upgrading the aiida_core code base. There have been changes to the:

  1. Database schema
  2. Export archive format
  3. Plugins for Quantum ESPRESSO, ASE, COD tools and NWChem

For each of these three points, you will find instructions on how to perform the necessary migration below.

Database migration

The exact migration procedure will differ slightly depending on which backend the profile uses, but for both Django and SqlAlchemy the procedure starts as follows.

  • Backup your database
  • Upgrade AiiDA within the virtual environment

After having performed these steps, the remainder of the migration can be triggered by executing any verdi command. For example you can execute verdi calculation list and you should be prompted with an exception for Django or a message for SqlAlchemy. Depending on your backend, follow the instructions below.

Django

When the profile that you want to migrate uses Django for the backend you will get an exception and instructions to run a command that looks like the following:

python aiida_core/aiida/backends/djsite/manage.py --aiida-profile=PROFILENAME migrate

After you execute the migration command, a warning will appear since we are deleting a table:

The following content types are stale and need to be deleted:

    db | dbpath

Any objects related to these content types by a foreign key will also
be deleted. Are you sure you want to delete these content types?
If you're unsure, answer 'no'.

    Type 'yes' to continue, or 'no' to cancel:

Have faith in your AiiDA team and type yes!

Note

If you tuned your AiiDA database by dropping the path table and the corresponding triggers, the migration will fail because the table db_dbpath does not exist. In that case, you have to recreate the table manually in the database of your profile (which we call AIIDADB in the demonstration):

> psql AIIDADB
AIIDADB=# CREATE TABLE db_dbpath (
id integer NOT NULL,
depth integer NOT NULL,
entry_edge_id integer,
direct_edge_id integer,
exit_edge_id integer,
child_id integer NOT NULL,
parent_id integer NOT NULL
);
SqlAlchemy

When the profile that you want to migrate uses SqlAlchemy for the backend you will get a message that looks like the following:

It is time to perform your first SQLAlchemy migration.
Would you like to migrate to the latest version? [Y/n]

Simply type Y and hit enter, and the database migration will be applied automatically.

Export archive file migration

The format of the export archives created with verdi export has changed in aiida_core v0.10.0, and in order to import older archives, they have to be migrated. To do this you can use the verdi export migrate command. The archive format version was 0.2 up to aiida_core 0.10.0, and starting from 0.10.0 it is 0.3.
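
For example, to migrate an existing archive you could run something like the following (the file names are illustrative; check verdi export migrate --help for the exact arguments accepted by your version):

verdi export migrate old_archive.aiida migrated_archive.aiida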

Plugin migration

In v0.10.0 the plugins for Quantum ESPRESSO, ASE, COD tools and NWChem that used to be included in aiida_core have been moved to separate plugin repositories which can be found here:

With the new plugin system introduced in aiida_core v0.9.0, all you have to do to install a plugin for AiiDA is to install it with pip. For example, to install all four original plugins you can execute:

pip install aiida-quantumespresso aiida-ase aiida-codtools aiida-nwchem

Note, however, that if you are upgrading an existing manual installation of aiida_core, you first need to make sure that your code base is cleaned. After you have upgraded your local repository to v0.10.0 by checking out the relevant branch or tag, before you run pip install, make sure that all old *pyc files are removed, by running the following command from your local checked out repository:

find . -name "*pyc" -type f -delete

Now you can install the new version of aiida_core with any of the optional extra dependencies that you might need:

pip install -e .[<EXTRAS>]

and make sure to refresh the plugin cache by executing:

reentry scan

Now you can reinstall any of the Quantum ESPRESSO, ASE, COD tools or NWChem plugins, either through pip for example:

pip install aiida-quantumespresso

or you can install them for development just like aiida_core by checking out the repository and using pip install -e, like so:

git clone https://github.com/aiidateam/aiida-quantumespresso
pip install -e aiida-quantumespresso

You can verify that the plugins were properly installed by running the following verdi command:

verdi calculation plugins

Now everything should be working properly and you can use the plugins as you are used to. You can use the class factories, such as CalculationFactory, exactly in the same way to load the plugin classes. For example, you can still call CalculationFactory('quantumespresso.pw') to load the PwCalculation class. The only thing that has changed is that you can no longer use any of the old direct import paths, as those files no longer exist.

Updating from older versions

To find the update instructions for older versions of AiiDA, follow the links to the documentation of the corresponding version:

Get started

In this section, we assume you have successfully installed AiiDA. If this is not the case, please refer to instructions in the Installation section. With AiiDA up and running, this section will explain how to get started and put AiiDA to good use. First we will launch the daemon, which is a process that runs in the background and takes care of a lot of tasks for you.

Starting the daemon

Starting the daemon is relatively straightforward: simply issue the command:

$ verdi daemon start

If you ran verdi quicksetup to set up AiiDA and entered your own personal email address, you will see the following error message:

You are not the daemon user! I will not start the daemon.
(The daemon user is 'aiida@localhost', you are 'richard.wagner@leipzig.de')

** FOR ADVANCED USERS ONLY: **
To change the current default user, use 'verdi install --only-config'
To change the daemon user, use 'verdi daemon configureuser'

This is a safeguard, because AiiDA detects that the person whose profile is active is not the same as the one configured for the daemon. If you are working in single-user mode, and you are sure that nobody else is going to run the daemon, you can configure your user as the (only) one who can run the daemon. To configure the daemon for your profile, first make sure the daemon is stopped:

$ verdi daemon stop

and then run the command:

$ verdi daemon configureuser

This will prompt you with a warning which you can accept and then fill in the email address of your profile. If all went well, it will confirm that the new email address was set for the daemon:

The new user that can run the daemon is now Richard Wagner.

Now that the daemon is properly configured, you can start it with:

verdi daemon start

If everything was done correctly, the daemon should start. You can inspect the status of the daemon by running:

verdi daemon status

and, if the daemon is running, you should see something like the following:

* aiida-daemon[0]        RUNNING    pid 12076, uptime 0:39:05
* aiida-daemon-beat[0]   RUNNING    pid 12075, uptime 0:39:05

To stop the daemon once again, use:

verdi daemon stop

A log of the warning/error messages of the daemon can be found in ~/.aiida/daemon/log/. The log can also be retrieved through verdi with the command:

$ verdi daemon logshow
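
If you prefer to inspect the log files directly, you can, for example, run (the exact file names may vary):

$ tail -n 50 ~/.aiida/daemon/log/*.log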

The daemon is a fundamental component of AiiDA: it is in charge of, for example, submitting new calculations, checking their status on the cluster, retrieving and parsing the results of finished calculations, and managing the workflow steps. But in order to actually launch calculations on a computer, we first have to register that computer with AiiDA. This will be shown in detail in the next section.

Setup of computers and codes

Before being able to run the first calculation, you need to set up at least one computer and one code, as described below.

Remote computer requirements

A computer in AiiDA denotes any computational resource (with a batch job scheduler) on which you will run your calculations. Computers typically are clusters or supercomputers.

Requirements for a computer are:

  • It must run a Unix-like operating system
  • The default shell must be bash
  • It should have a batch scheduler installed (see here for a list of supported batch schedulers)
  • It must be accessible from the machine that runs AiiDA using one of the available transports (see below).

The first step is to choose the transport to connect to the computer. Typically, you will want to use the SSH transport, apart from a few special cases where SSH connection is not possible (e.g., because you cannot setup a password-less connection to the computer). In this case, you can install AiiDA directly on the remote cluster, and use the local transport (in this way, commands to submit the jobs are simply executed on the AiiDA machine, and files are simply copied on the disk instead of opening an SFTP connection).

If you plan to use the local transport, you can skip to the next section.

If you plan to use the SSH transport, you have to configure a password-less login from your user to the cluster. To do so, first type (only if you do not already have keys in your local ~/.ssh directory, i.e. files like id_rsa.pub):

ssh-keygen -t rsa

Then copy your keys to the remote computer (in ~/.ssh/authorized_keys) with:

ssh-copy-id YOURUSERNAME@YOURCLUSTERADDRESS

replacing YOURUSERNAME and YOURCLUSTERADDRESS with your username and cluster address, respectively. Finally, add the following lines to ~/.ssh/config (leaving an empty line before and after):

Host YOURCLUSTERADDRESS
  User YOURUSERNAME
  HostKeyAlgorithms ssh-rsa
  IdentityFile YOURRSAKEY

replacing YOURRSAKEY by the path to the rsa private key you want to use (it should look like ~/.ssh/id_rsa).

Note

In principle you don’t have to put the IdentityFile line if you have only one rsa key in your ~/.ssh folder.

Before proceeding to setup the computer, be sure that you are able to connect to your cluster using:

ssh YOURCLUSTERADDRESS

without the need to type a password. Moreover, make also sure you can connect via sftp (needed to copy files). The following command:

sftp YOURCLUSTERADDRESS

should show you a prompt without errors (possibly with a message saying Connected to YOURCLUSTERADDRESS).

Warning

Due to a limitation of the current ssh transport module, we support only RSA and DSA keys, not ECDSA. In the present guide we’ve shown RSA only for simplicity. The first time you connect to the cluster, you should see something like this:

The authenticity of host 'YOURCLUSTERADDRESS (IP)' can't be established.
RSA key fingerprint is xx:xx:xx:xx:xx.
Are you sure you want to continue connecting (yes/no)?

Make sure you see RSA written. If you already installed the keys in the past, and you don’t know which keys you are using, you could remove the cluster YOURCLUSTERADDRESS from the file ~/.ssh/known_hosts (back it up first!) and try to ssh again. If you are not using an RSA or DSA key, you may later see a submitted calculation go into the SUBMISSIONFAILED state.

Note

If the ssh command works, but the sftp command does not (e.g. it just prints Connection closed), a possible reason is that there is a line in your ~/.bashrc that produces output or an error. Remove or comment it out until no output or error is produced: this should make sftp work again.

Finally, try also:

ssh YOURCLUSTERADDRESS QUEUE_VISUALIZATION_COMMAND

replacing QUEUE_VISUALIZATION_COMMAND with the scheduler command that prints the status of the queue on the cluster (i.e. qstat for the PBSpro scheduler, squeue for SLURM, etc.). It should print a snapshot of the queue status, without any errors.
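
For example, for a cluster running the SLURM scheduler:

ssh YOURCLUSTERADDRESS squeue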

Note

If there are errors with the previous command, then edit your ~/.bashrc file in the remote computer and add a line at the beginning that adds the path to the scheduler commands, typically (here for PBSpro):

export PATH=$PATH:/opt/pbs/default/bin

Or, alternatively, find the path to the executables (like using which qsub)

Note

If you need your remote .bashrc to be sourced before you execute the code (for instance to change the PATH), make sure the .bashrc file does not contain lines like:

[ -z "$PS1" ] && return

or:

case $- in
    *i*) ;;
    *) return;;
esac

at the beginning (these would prevent the .bashrc from being executed when you ssh to the remote computer). You can check that e.g. the PATH variable is correctly set upon ssh, by typing (on your local computer):

ssh YOURCLUSTERADDRESS 'echo $PATH'

Note

If you need to ssh to a computer A first, from which you can then connect to computer B you wanted to connect to, you can use the proxy_command feature of ssh, that we also support in AiiDA. For more information, see Using the proxy_command option with ssh.

Computer setup and configuration

The configuration of computers happens in two steps.

Note

The commands use some readline extensions to provide default answers, which require an advanced terminal. Therefore, run the commands from a standard terminal, and not from embedded terminals such as the ones included in text editors, unless you know what you are doing. For instance, the terminal embedded in emacs is known to give problems.

  1. Setup of the computer, using the:

    verdi computer setup
    

    command. This command allows you to create a new computer instance in the DB.

    Tip

    The code will ask you a few pieces of information. At every prompt, you can type the ? character and press <enter> to get a more detailed explanation of what is being asked.

    Tip

    You can press <CTRL>+C at any moment to abort the setup process. Nothing will be stored in the DB.

    Note

    For multiline inputs (like the prepend text and the append text, see below) you have to press <CTRL>+D to complete the input, even if you do not want any text.

    Here is a list of what is asked, together with an explanation.

    • Computer name: the (user-friendly) name of the new computer instance which is about to be created in the DB (the name is used, for instance, when you have to pick a computer to launch a calculation on). Names must be unique. This command should be thought of as an AiiDA-wide configuration of the computer, independent of the AiiDA user that will actually use it.

    • Fully-qualified hostname: the fully-qualified hostname of the computer to which you want to connect (i.e., with all the dots: bellatrix.epfl.ch, and not just bellatrix). Type localhost for the local transport.

    • Description: A human-readable description of this computer; this is useful if you have a lot of computers and you want to add some text to distinguish them (e.g.: “cluster of computers at EPFL, installed in 2012, 2 GB of RAM per CPU”)

    • Enabled: either True or False; if False, the computer is disabled and calculations associated with it will not be submitted. This allows you to temporarily disable a computer if it is giving problems or is down for maintenance, without the need to delete it from the DB.

    • Transport type: The name of the transport to be used. A list of valid transport types can be obtained typing ?

    • Scheduler type: The name of the plugin to be used to manage the job scheduler on the computer. A list of valid scheduler plugins can be obtained typing ?. See here for a documentation of scheduler plugins in AiiDA.

    • AiiDA work directory: The absolute path of the directory on the remote computer where AiiDA will run the calculations (often, it is the scratch of the computer). You can (should) use the {username} replacement, that will be replaced by your username on the remote computer automatically: this allows the same computer to be used by different users, without the need to setup a different computer for each one. Example:

      /scratch/{username}/aiida_work/
      
    • mpirun command: The mpirun command needed on the cluster to run parallel MPI programs. You can (should) use the {tot_num_mpiprocs} replacement, that will be replaced by the total number of cpus, or the other scheduler-dependent fields (see the scheduler docs for more information). Some examples:

      mpirun -np {tot_num_mpiprocs}
      aprun -n {tot_num_mpiprocs}
      poe
      
    • Text to prepend to each command execution: This is a multiline string, whose content will be prepended inside the submission script before the real execution of the job. It is your responsibility to write proper bash code! This is intended for computer-dependent code, like for instance loading a module that should always be loaded on that specific computer. Remember to end the input by pressing <CTRL>+D. A practical example:

      export NEWVAR=1
      source some/file
      

      A not-to-do example:

      #PBS -l nodes=4:ppn=12
      

      (it’s the plugin that will do this!)

    • Text to append to each command execution: This is a multiline string, whose content will be appended inside the submission script after the real execution of the job. It is your responsibility to write proper bash code! This is intended for computer-dependent code. Remember to end the input by pressing <CTRL>+D.

At the end, you will get a confirmation command, and also the ID in the database (pk, i.e. the principal key, and uuid).
  2. Configuration of the computer, using the:

    verdi computer configure COMPUTERNAME
    

    command. This allows you to access more detailed configuration options, which are often user-dependent and also depend on the specific transport (for instance, if the transport is SSH, it will ask for username, port, ...).

    The command will try to automatically provide default answers, mainly by reading the existing ssh configuration in ~/.ssh/config, and in most cases one simply needs to press enter a few times.

    Note

    At the moment, the in-line help (i.e., just typing ? to get some help) is not yet supported in verdi configure, but only in verdi setup.

    For the local transport, you still need to run the command, even though nothing will be asked of you. For the ssh transport, the following will be asked:

    • username: your username on the remote machine
    • port: the port to connect to (the default SSH port is 22)
    • look_for_keys: automatically look for the private key in ~/.ssh. Default: True.
    • key_filename: the absolute path to your private SSH key. You can leave it empty to use the default SSH key, if you set look_for_keys to True.
    • timeout: A timeout in seconds if there is no response (e.g., the machine is down). You can leave it empty to use the default value.
    • allow_agent: If True, it will try to use an SSH agent.
    • proxy_command: Leave empty if you do not need a proxy command (i.e., if you can directly connect to the machine). If you instead need to connect to an intermediate computer first, you need to provide here the command for the proxy: see documentation here for how to use this option, and in particular the notes here for the format of this field.
    • compress: True to compress the traffic (recommended)
    • gss_auth: yes when using Kerberos token to connect
    • gss_kex: yes when using Kerberos token to connect, in some cases (depending on your .ssh/config file)
    • gss_deleg_creds: yes when using Kerberos token to connect, in some cases (depending on your .ssh/config file)
    • gss_host: hostname when using Kerberos token to connect (default to the remote computer hostname)
    • load_system_host_keys: True to load the known hosts keys from the default SSH location (recommended)
    • key_policy: What is the policy in case the host is not known. It is a string among the following:
      • RejectPolicy (default, recommended): reject the connection if the host is not known.
      • WarningPolicy (not recommended): issue a warning if the host is not known.
      • AutoAddPolicy (not recommended): automatically add the host key at the first connection to the host.
After these two steps have been completed, your computer is ready to go!

Note

If the cluster you are using requires authentication through a Kerberos token (that you need to obtain before using ssh), you typically need to install libffi (sudo apt-get install libffi-dev under Ubuntu), and make sure you install the ssh_kerberos optional dependencies during the installation process of AiiDA. Then, if your .ssh/config file is configured properly (in particular, it includes all the necessary GSSAPI options), verdi computer configure will already contain the correct suggestions for all the gss options needed to support Kerberos.
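
For reference, a minimal sketch of the GSSAPI-related options in ~/.ssh/config might look as follows (the host name is only an example; adapt the options to your cluster’s Kerberos setup):

Host mycluster.epfl.ch
  User YOURUSERNAME
  GSSAPIAuthentication yes
  GSSAPIDelegateCredentials yes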

Note

To check if you set up the computer correctly, execute:

verdi computer test COMPUTERNAME

that will run a few tests (file copy, file retrieval, check of the jobs in the scheduler queue) to verify that everything works as expected.

Note

If you are not sure if your computer is already set up, use the command:

verdi computer list

to get a list of existing computers, and:

verdi computer show COMPUTERNAME

to get detailed information on the specific computer named COMPUTERNAME. You have also the:

verdi computer rename OLDCOMPUTERNAME NEWCOMPUTERNAME

and:

verdi computer delete COMPUTERNAME

commands, whose meaning should be self-explanatory.

Note

You can delete computers only if no entry in the database is using them (as for instance Calculations, or RemoteData objects). Otherwise, you will get an error message.

Note

It is possible to disable a computer.

Doing so will prevent AiiDA from connecting to the given computer to check the state of calculations or to submit new calculations. This is particularly useful if, for instance, the computer is under maintenance but you still want to use AiiDA with other computers, or submit the calculations in the AiiDA database anyway.

When the computer comes back online, you can re-enable it; at this point pending calculations in the TOSUBMIT state will be submitted, and calculations WITHSCHEDULER will be checked and possibly retrieved.

The relevant commands are:

verdi computer enable COMPUTERNAME
verdi computer disable COMPUTERNAME

Note that the above commands will disable the computer for all AiiDA users. If instead, for some reason, you want to disable the computer only for a given user, you can use the following command:

verdi computer disable COMPUTERNAME --only-for-user USER_EMAIL

(and the corresponding verdi computer enable command to re-enable it).

Code setup and configuration

Once you have at least one computer configured, you can configure the codes.

In AiiDA, for full reproducibility of each calculation, we store each code in the database and attach a given code to each calculation. This has the further advantage of making it very easy to query for all calculations that were run with a given code (for instance because you are looking for phonon calculations, or because you discovered that a specific version had a bug and want to rerun the calculations).

In AiiDA, we distinguish two types of codes: remote codes and local codes, as described below.

Remote codes

With remote codes we denote codes that are installed/compiled on the remote computer. This is very often the case for high-performance computing applications, where the code is typically installed and optimized directly on the supercomputer.

In AiiDA, a remote code is identified by two mandatory pieces of information:

  • The computer on which the code resides (which must be a previously configured computer);
  • The absolute path of the code executable on the remote computer.
Local codes

With local codes we denote codes that are not already present on the remote machine and must therefore be copied over for every submission. This is the case, for instance, for a small machine-independent Python script that you have not previously copied to all your clusters.

In AiiDA, a local code can be set up by specifying:

  • A folder, containing all files to be copied over at every submission
  • The name of the executable file among the files inside the folder specified above
Setting up a code

The:

verdi code

command allows you to manage codes in AiiDA.

To set up a new code, execute:

verdi code setup

and you will be guided through a process to set up your code.

Tip

The setup will ask you for a few pieces of information. At every prompt, you can type the ? character and press <enter> to get a more detailed explanation of what is being asked.

You will be asked for:

  • label: A label to refer to this code. Note: this label is not enforced to be unique. However, if you keep it unique, at least within the same computer, you can later use it to refer to your code; otherwise, you need to remember its ID or UUID.
  • description: A human-readable description of this code (for instance “Quantum Espresso v.5.0.2 with 5.0.3 patches, pw.x code, compiled with openmpi”)
  • default input plugin: A string that identifies the default input plugin to be used to generate new calculations to use with this code. This string has to be a valid string recognized by the CalculationFactory function. To get the list of all available Calculation plugin strings, use the verdi calculation plugins command. Note: if you do not want to specify a default input plugin, you can write the string “None”, but this is strongly discouraged, because then you will not be able to use the .new_calc method of the Code object.
  • local: either True (for local codes) or False (for remote codes). For the meaning of the distinction, see above. Depending on your choice, you will be asked for:
    • LOCAL CODES:
      • Folder with the code: The folder on your local computer in which there are the files to be stored in the AiiDA repository, and that will then be copied over to the remote computers for every submitted calculation. This must be an absolute path on your computer.
      • Relative path of the executable: The relative path of the executable file inside the folder entered in the previous step.
    • REMOTE CODES:
      • Remote computer name: The name of the computer on which the code resides, as configured and stored in the AiiDA database
      • Remote absolute path: The (full) absolute path of the code executable on the remote machine

For any type of code, you will also be asked for:

  • Text to prepend to each command execution: This is a multiline string, whose content will be prepended inside the submission script before the real execution of the job. It is your responsibility to write proper bash code! This is intended for code-specific setup, such as loading the modules that are required for that specific executable to run. Example:

    module load intelmpi

    Remember to end the input by pressing <CTRL>+D.

  • Text to append to each command execution: This is a multiline string, whose content will be appended inside the submission script after the real execution of the job. It is your responsibility to write proper bash code! This too is intended for code-specific setup. Remember to end the input by pressing <CTRL>+D.

At the end, you will get a confirmation message, together with the ID of the code in the database (the PK, i.e. the principal key, and the UUID).
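
If a default input plugin was specified, the stored code can later be used to create new calculations through the new_calc method mentioned above. A minimal sketch, to be run in the verdi shell (or in a script after load_dbenv()); the label pw-5.1 and the computer mycluster are hypothetical, so substitute a code shown by verdi code list:

from aiida.orm import Code

# hypothetical label@computer pair; pick one from 'verdi code list'
code = Code.get_from_string('pw-5.1@mycluster')

# new_calc() creates a calculation using the code's default input plugin
calc = code.new_calc()
calc.label = "test calculation"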

Note

Codes are a subclass of the Node class, and as such you can attach any set of attributes to the code. These can be extremely useful for querying: for instance, you can attach the version of the code as an attribute, or the code family (for instance: “pw.x code of Quantum Espresso”) to later query for all runs done with a pw.x code and version more recent than 5.0.0, for instance. However, in the present AiiDA version you cannot add attributes from the command line using verdi, but you have to do it using Python code.
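
For instance, a minimal sketch that attaches such metadata from Python using extras (which, unlike attributes, can also be set after the node has been stored and serve the same querying purpose); the PK and the values are hypothetical:

from aiida.orm import load_node

code = load_node(1234)              # hypothetical PK of a stored code
code.set_extra('version', '5.0.2')  # hypothetical metadata values
code.set_extra('family', 'pw.x')
print code.get_extras()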

Note

You can change the label of a code by using the following command:

verdi code rename "ID"

(Without the quotation marks!) “ID” can either be the numeric ID (PK) of the code (preferentially), or possibly its label (or label@computername), if this string uniquely identifies a code.

You can also list all available codes (and their relative IDs) with:

verdi code list

The verdi code list command accepts flags to filter, for example, only codes on a given computer or only codes using a specific plugin; use the -h command line option to see the documentation of all possible options.

You can then get the information of a specific code with:

verdi code show "ID"

Finally, to delete a code use:

verdi code delete "ID"

(only if it wasn’t used by any calculation, otherwise an exception is raised)

And now, you are ready to launch your calculations!

Working with AiiDA

The verdi command line interface

The main way of interacting with AiiDA is through a command line interface tool called verdi. You have already used verdi when installing AiiDA, either through verdi quicksetup or verdi setup. But verdi is very versatile and provides a wealth of other functionalities; here is a list:

  • calculation: query and interact with calculations
  • code: setup and manage codes to be used
  • comment: manage general properties of nodes in the database
  • completioncommand: return the bash completion function to put in ~/.bashrc
  • computer: setup and manage computers to be used
  • daemon: manage the AiiDA daemon
  • data: setup and manage data specific types
  • devel: AiiDA commands for developers
  • export: export nodes and groups of nodes
  • graph: create a graph from a given root node
  • group: setup and manage groups
  • import: import nodes and groups of nodes
  • node: manage operations on AiiDA nodes
  • profile: list and manage AiiDA profiles
  • run: execute an AiiDA script
  • runserver: run the AiiDA webserver on localhost
  • setup: setup aiida for the current user/create a new profile
  • shell: run the interactive shell with the Django environment
  • user: list and configure new AiiDA users.
  • workflow: manage the AiiDA workflow manager

Each command above can be preceded by the -p <profile> or --profile=<profile> option, as in:

verdi -p <profile> calculation list

This allows one to select a specific AiiDA profile, and therefore a specific database, on which the command is executed. Thus several databases can be handled and accessed simultaneously by AiiDA. To create a new profile, use the setup command (see the verdi setup section below).

Note

This profile selection has no effect on the verdi daemon commands.

Some ambiguity might arise when a certain verdi subcommand manages both positional arguments and at least one option which accepts an unspecified number of arguments. Make sure you insert the separator -- between the last optional argument and the first positional argument. As an example, instead of typing:

verdi export -g group1 group2 group3 export.aiida

rather type:

 verdi export -g group1 group2 group3 -- export.aiida

The latter command will export the nodes belonging to groups group1, group2 and group3 (specified by the option -g) into the file export.aiida, which is taken as a positional argument.

Below is a list with all the available subcommands.

The verdi commands


verdi calculation
  • kill: stop the execution on the cluster of a calculation.
  • logshow: shows the logs/errors produced by a calculation
  • plugins: lists the supported calculation plugins
  • inputcat: shows an input file of a calculation node.
  • inputls: shows the list of the input files of a calculation node.
  • list: list the AiiDA calculations. By default, lists only the running calculations.
  • outputcat: shows an output file of a calculation node.
  • outputls: shows the list of the output files of a calculation node.
  • show: shows the database information related to the calculation: used code, all the input nodes and all the output nodes.
  • gotocomputer: open a shell to the calc folder on the cluster
  • label: view / set the label of a calculation
  • description: view / set the description of a calculation
  • res: shows the calculation results (from calc.res).
  • cleanworkdir: cleans the work directory (remote folder) of AiiDA calculations

Note

When using gotocomputer, be careful not to change any file that AiiDA created, nor to modify the output files or resubmit the calculation, unless you really know what you are doing, otherwise AiiDA may get very confused!

verdi code
  • show: shows the information of the installed code.
  • list: lists the installed codes
  • hide: hide codes from verdi code list
  • reveal: un-hide codes for verdi code list
  • setup: setup a new code
  • rename: change the label (name) of a code. If you want to load codes based on their labels rather than their UUIDs or PKs, take care to use unique labels!
  • update: change (some of) the installation description of the code given at the moment of the setup.
  • delete: delete a code from the database. Only possible for disconnected codes (i.e. a code that has not been used yet)
verdi comment

There are various ways of attaching notes/comments to a node within AiiDA. In the first scripting examples, you might already have noticed the possibility of attaching a label or a description to any AiiDA Node. However, these properties are defined when the Node is created, and it is not possible to modify them after the Node has been stored.

The Node comment provides a simple way to have a more dynamic management of comments, in which any user can write a comment on the Node, or modify it or delete it.

The verdi comment provides a set of methods that are used to manipulate the comments:

  • add: add a new comment to a Node.
  • update: modify a comment.
  • show: show the existing comments attached to the Node.
  • remove: remove a comment.
verdi completioncommand

Prints the string to be copied and pasted to the bashrc in order to allow for autocompletion of the verdi commands.

verdi computer
  • setup: creates a new computer object
  • configure: set up some extra info that can be used in the connection with that computer.
  • test: tests if the current user (or a given user) can connect to the computer and if basic operations perform as expected (file copy, getting the list of jobs in the scheduler queue, ...)
  • show: shows the details of an installed computer
  • list: list all installed computers
  • enable: to enable a computer. If the computer is disabled, the daemon will not try to connect to it, so it will not retrieve or launch calculations. Useful if a computer is under maintenance.
  • disable: disable a computer (see enable for a larger description)
  • rename: changes the name of a computer.
  • update: change configuration of a computer. Works only if the computer node is a disconnected node in the database (has not been used yet).
  • delete: deletes a computer node. Works only if the computer node is a disconnected node in the database (has not been used yet)
verdi daemon

Manages the daemon, i.e. the process that runs in the background and manages the submission and retrieval of calculations.

  • status: see the status of the daemon. Typically, it will either show Daemon not running or you will see two processes with state RUNNING.
  • start: starts the daemon.
  • stop: stops the daemon
  • restart: restarts the daemon.
  • configureuser: sets the user which is running the daemon. See the installation guide for more details.
  • logshow: show the last lines of the daemon log (use for debugging)
verdi data

Manages database data objects.

  • upf: handles the Pseudopotential Datas
    • listfamilies: list presently stored families of pseudopotentials
    • uploadfamily: install a new family (group) of pseudopotentials
    • import: create or return (if already present) a database node, having the contents of a supplied file
    • exportfamily: export a family of pseudopotential files into a folder
  • structure: handles the StructureData
    • list: list currently saved nodes of StructureData kind
    • show: use a third-party visualizer (like vmd or xcrysden) to graphically show the StructureData
    • export: export the node as a string of a specified format
    • deposit: deposit the node to a remote database
  • parameter: handles the ParameterData objects
    • show: output the content of the python dictionary in different formats.
  • cif: handles the CifData objects
    • list: list currently saved nodes of CifData kind
    • show: use third-party visualizer (like jmol) to graphically show the CifData
    • import: create or return (if already present) a database node, having the contents of a supplied file
    • export: export the node as a string of a specified format
    • deposit: deposit the node to a remote database
  • trajectory: handles the TrajectoryData objects
    • list: list currently saved nodes of TrajectoryData kind
    • show: use third-party visualizer (like jmol) to graphically show the TrajectoryData
    • export: export the node as a string of a specified format
    • deposit: deposit the node to a remote database
  • label: view / set the label of a data
  • description: view / set the description of a data
  • array: handles aiida.orm.data.array.ArrayData objects
    • show: visualizes the data object
  • bands: handles aiida.orm.data.array.bands.BandsData objects (band structure object)
verdi devel

This command groups functions that are still at the development stage and that might eventually find their way out of this placeholder. As such, they may be buggy, difficult to use, not necessarily documented, and subject to backward-incompatible changes.

  • delproperty, describeproperties, getproperty, listproperties, setproperty: handle the properties, see here for more information.

Type verdi devel listproperties to get a list of all set properties, and verdi devel describeproperties to get a description of all properties that you can possibly set. The command verdi devel getproperty [propertyname] will give you the value set for that property, which can be changed with setproperty.

verdi export

Export data from the AiiDA database to a file. See also verdi import to import this data on another database.

verdi graph
  • generate: generates a graph from a given root node either in a graphical or a .dot format.
verdi group
  • list: list all the groups in the database.
  • description: show or change the description of a group
  • show: show the content of a group.
  • create: create a new empty group.
  • delete: delete an existing group (but not the nodes belonging to it).
  • addnodes: add nodes to a group.
  • removenodes: remove nodes from a group.
verdi import

Imports data (coming from other AiiDA databases) into the current database.

verdi install

This command is deprecated, please use the setup command instead

verdi node
  • repo: shows files and their contents in the local repository
  • show: shows basic node information (PK, UUID, class, inputs and outputs)
  • tree: shows a tree of the nodes
verdi profile
  • list: Show the list of currently available profiles, indicating which one is the default one, and showing the current one with a > symbol
  • setdefault: Set the default profile, i.e. the one to be used when no -p option is specified before the verdi command
verdi run

Run a python script for AiiDA. This is the command line equivalent of the verdi shell. It also provides autogrouping features: by default, every node created in a single call of verdi run will be grouped together.

verdi runserver

Starts a lightweight Web server for development and also serves static files. Currently in ongoing development.

verdi setup

Used in the installation to configure the database. If it finds an already installed database, it updates the tables migrating them to the new schema.

Note

One can also create a new profile with this command:

verdi -p <new_profile_name> setup

The setup procedure then works as usual, and one can select there a new database. See also the profile command.

verdi shell

Runs a Python interactive interpreter. Tries to use IPython or bpython, if one of them is available. Loads on start a good part of the AiiDA infrastructure (see here for information on how to customize it).

verdi user

Manages the AiiDA users. It has two valid subcommands:

  • list: list existing users configured for your AiiDA installation.
  • configure: configure a new AiiDA user.
verdi workflow

Manages the workflow. Valid subcommands:

  • report: display the information on how the workflow is evolving.
  • kill: kills a workflow.
  • list: lists the workflows present in the database. By default, shows only the running ones.
  • logshow: shows the log messages for the workflow.
Scripting

While many common functionalities are provided by either command-line tools (via verdi) or the web interface, for fine tuning (or automation) it is useful to directly access the python objects and call their methods.

This is possible in two ways, either via an interactive shell, or writing and running a script. Both methods are described below.

Verdi shell

By running verdi shell on the terminal, a new interactive IPython shell will be opened (this requires that IPython is installed on your computer).

Note that simply opening IPython and loading the AiiDA modules will not work (unless you perform the operations described in the following section) because the database settings are not loaded by default and AiiDA does not know how to access the database.

Moreover, by calling verdi shell, you have the additional advantage that some classes and modules are automatically loaded. In particular the following modules/classes are already loaded and available:

from aiida.orm import (Node, Calculation, JobCalculation, Code, Data,
    Computer, Group, DataFactory, CalculationFactory)
from aiida.backends.djsite.db import models

Note

It is possible to customize the shell by adding modules to be loaded automatically, thanks to the verdi devel setproperty verdishell.modules command. See here for more information.

A further advantage is that TAB completion is enabled, allowing you to press the TAB key to see the available submethods of a given object (see for instance the documentation of the ResultManager).

Python scripts

Alternatively, if you do not need an interactive shell but prefer to write a script and then launch it from the command line, you can just write a standard python .py file. The only modification you need to make is to add, at the beginning of the file and before importing any other AiiDA module, the following two lines:

from aiida import load_dbenv
load_dbenv()

that will load the database settings and allow AiiDA to reach your database. Then, you can import and use python and AiiDA modules and classes as usual. If you want to have the same environment as the verdi shell interactive shell, you can also add (below the load_dbenv call) the following lines:

from aiida.orm import Calculation, Code, Computer, Data, Node
from aiida.orm import CalculationFactory, DataFactory
from aiida.backends.djsite.db import models

or simply import the only modules that you will need in the script.

While this method works, we strongly suggest using the verdi run command instead, described below.

The verdi run command and the runaiida executable

In order to simplify the procedure described above, it is possible to execute a python file using verdi run: this command will accept as parameter the name of a file, and will execute it after having loaded the modules described above.

The command verdi run has the additional advantage of adding all stored nodes to suitable special groups, of type autogroup.run, for later usage. You can get the list of all these groups with the command:

verdi group list -t autogroup.run

Some further command line options of verdi run allow the user to fine-tune the autogrouping behavior; for more details, refer to the output of verdi run -h. Note also that further command line parameters to verdi run are passed to the script as sys.argv.

Note

It is not possible to call load_dbenv() more than once. Since verdi run automatically calls load_dbenv(), you cannot use it to run a script that contains this call (the call is instead needed if you want to run the script simply via python scriptname.py). If you want to allow for both options, use the following method to check whether the database environment has already been loaded:

from aiida import load_dbenv, is_dbenv_loaded

if not is_dbenv_loaded():
    load_dbenv()

Finally, we also define a runaiida command that simply passes all its parameters to verdi run. This way, you can write a new script, add #!/usr/bin/env runaiida as its first (shebang) line, give the file execution permissions, and the file becomes an executable that is run through AiiDA. A simple example could be:

#!/usr/bin/env runaiida
import sys

# load_node is pre-loaded by verdi run / runaiida, no explicit import is needed
pk = int(sys.argv[1])
node = load_node(pk)
print "Node {} is: {}".format(pk, repr(node))

import aiida
print "AiiDA version is: {}".format(aiida.get_version())
Plug-ins

AiiDA plug-ins are input generators and output parsers, enabling the integration of codes into AiiDA calculations and workflows.

The plug-ins are not meant to completely automate the calculation of physical properties. An underlying knowledge of how each code works, which flags it requires, etc. is still needed. Full automation, if desired, has to be implemented at the level of a workflow.

Plugins live in repositories separate from AiiDA itself. You can find a list of existing plugins on the AiiDA website or on the aiida-registry (available both as a JSON version and a human-readable version), the official location to register and list plugins.

Calculations

AiiDA calculations can be of two kinds:

  • JobCalculation: calculations that need to be run through a scheduler
  • InlineCalculation: rapid calculations that are executed directly on your local machine, without passing through a scheduler.

In the following, we will refer to the JobCalculations as a Calculation for the sake of simplicity, unless we explicitly say otherwise. In the same way, the command verdi calculation refers to JobCalculations.

The calculation state

Once a calculation has been submitted to AiiDA, everything else will be managed by AiiDA:

  1. The inputs are checked for consistency
  2. The input files are prepared, sent to the cluster, and a job is submitted
  3. The status of the job is monitored by the AiiDA daemon
  4. When the calculation is finished, the outputs are retrieved and parsed

During these phases, it is useful to be able to check and verify the state of a calculation. The possible states and the different ways to access them are described in the following.

Possible states

The calculation could be in several states. The most common ones are the following:

  1. NEW: the calculation node has been created, but has not been submitted yet.

  2. WITHSCHEDULER: the job has been handed to the scheduler and is in some queue on the remote computer. Note that this does not necessarily mean that the job is waiting in a queue: it may also be running or finishing, but it has not finished yet, so AiiDA has to wait.

  3. FINISHED: the job on the cluster was finished, AiiDA already retrieved it and stored the results in the database. In most cases, this also means that the parser managed to parse the output file.

  4. FAILED: something went wrong, and AiiDA raised an exception. There are different reasons why this might happen:

    • there were not enough inputs, or they were not correct
    • the execution on the cluster failed
    • the code ended without completing successfully or producing a valid output file

    Other, more specific “failed” states are possible, including SUBMISSIONFAILED, RETRIEVALFAILED and PARSINGFAILED.

  5. For short periods of time, while the job completes on the remote computer and AiiDA retrieves and parses it, you may also see a calculation in the COMPUTED, RETRIEVING and PARSING states.

Eventually, when the calculation has finished, you will find the computed quantities in the database, and you will be able to query the database for the results that were parsed.

Determining the state

There are multiple ways to determine the state a calculation is in.

The verdi calculation command

The simplest way to check the state of submitted calculations is to use the verdi calculation list command from the command line. To get help on its use and command line options, run it with the -h or --help option:

verdi calculation list --help
Directly within python

If you prefer, you can check the state of a calculation from within python. For this, you need to specify the ID of the calculation you are interested in:

from aiida import load_dbenv
load_dbenv()

from aiida.orm import JobCalculation, load_node

## pk must be a valid integer pk
calc = load_node(pk)
## Alternatively, with the UUID (uuid must be a valid UUID string)
# calc = load_node(uuid)
print "AiiDA state:", calc.get_state()
print "Last scheduler state seen by the AiiDA daemon:", calc.get_scheduler_state()

Note that you can also load the node by its UUID, as shown in the comments. The advantage of the UUID is that it will be preserved across different AiiDA databases, while the numeric ID typically changes when databases are merged.

Note

calc.get_scheduler_state() returns the state on the scheduler (queued, held, running, ...) as seen the last time that the daemon connected to the remote computer. The time at which the last check was performed is returned by the calc.get_scheduler_lastchecktime() method. If no such check has been performed yet, this returns None.

The verdi calculation gotocomputer command

Sometimes it is useful to go directly to the folder where the calculation is running, for example to check if the output file has been created.

In this case, it is possible to run:

verdi calculation gotocomputer CALCULATIONPK

where CALCULATIONPK is the PK of the calculation. This will open a new connection to the computer (either simply a bash shell or a ssh connection, depending on the transport) and directly change directory to the appropriate folder where the code is running.

Warning

Be careful not to change any file that AiiDA created, modify the output files, or resubmit the calculation unless you really know what you are doing. Otherwise AiiDA may get very confused!

Setting calculation properties

There are various methods to specify the calculation properties. Here follows a brief documentation of their action (a short combined example is given after the list). You can also find them in the AbstractJobCalculation API reference.

  • c.set_max_memory_kb: explicitly request the memory (in kB) to be allocated for the scheduler job.
  • c.set_append_text: write a set of bash commands to be executed after the call to the executable. These commands are executed only for this instance of calculations. Look also at the computer and code append_text to write bash commands for any job run on that computer or with that code.
  • c.set_max_wallclock_seconds: set (as integer) the scheduler-job wall-time in seconds.
  • c.set_computer: set the computer on which the calculation is run. Unnecessary if the calculation has been created from a code.
  • c.set_mpirun_extra_params: set as a list of strings the parameters to be passed to the mpirun command. Example: mpirun -np 8 extra_params[0] extra_params[1] ... exec.x Note: the process number is set by the resources.
  • c.set_custom_scheduler_commands: set a string (even multiline) which contains personalized job-scheduling commands. These commands are set at the beginning of the job-scheduling script, before any non-scheduler command. (prepend_texts instead are set after all job-scheduling commands).
  • c.set_parser_name: set the name of the parser to be used on the output. Typically, a calculation plugin already has a default parser set; use this method to change it.
  • c.set_environment_variables: set a dictionary, whose key and values will be used to set new environment variables in the job-scheduling script before the execution of the calculation. The dictionary is translated to: export 'keys'='values'.
  • c.set_prepend_text: set a string that contains bash commands, to be written in the job-scheduling script for this calculation, right before the call to the executable. (it is used for example to load modules). Note that there are also prepend text for the computer (that are used for any job-scheduling script on the given computer) and for the code (that are used for any scheduling script using the given code), the prepend_text here is used only for this instance of the calculation: be careful in avoiding duplication of bash commands.
  • c.set_extra: pass a key and a value, to be stored in the Extra attribute table in the database.
  • c.set_extras: like set extra, but you can pass a dictionary with multiple keys and values.
  • c.set_priority: set the job-scheduler priority of the calculation (AiiDA does not have internal priorities). The function accepts a value that depends on the scheduler plugin (but is typically an integer).
  • c.set_queue_name: pass in a string the name of the queue to use on the job-scheduler.
  • c.set_import_sys_environment: default=True. If True, the job-scheduling script will load the environment variables.
  • c.set_resources: set the resources to be used by the calculation like the number of nodes, wall-time, ..., by passing a dictionary to this method. The keys of this dictionary, i.e. the resources, depend on the specific scheduler plugin that has to run them. Look at the documentation of the scheduler (type is given by: calc.get_computer().get_scheduler_type()).
  • c.set_withmpi: True or False, if True (the default) it will call the executable as a parallel run.
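
As an illustration, here is a minimal sketch combining a few of the methods above; calc is assumed to be a not-yet-stored JobCalculation (e.g. obtained from code.new_calc()), and the queue name and the custom scheduler directive are hypothetical and cluster-dependent:

# 'calc' is assumed to be a (not yet stored) JobCalculation
calc.set_resources({"num_machines": 2, "num_mpiprocs_per_machine": 16})
calc.set_max_wallclock_seconds(3600)                 # 1 hour of wall time
calc.set_queue_name("debug")                         # hypothetical queue name
calc.set_custom_scheduler_commands("#SBATCH --account=my_project")  # hypothetical directive
calc.set_environment_variables({"OMP_NUM_THREADS": "1"})
calc.set_withmpi(True)
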
Retrieving results

In this section, we describe how to get the results of a calculation after it has been parsed by AiiDA, or the input and output of a generic Node. When a calculation is done on the remote computer, AiiDA will retrieve the results and try to parse the results with the default parser, if one is available for the given calculation. These results are stored in new nodes, and connected as output of the calculation. Of course, it is possible to directly check the output nodes for a given calculation and get their content. However, AiiDA provides a way to directly access the results, using the CalculationResultManager class, described in the next section.

The CalculationResultManager
Prerequisites

Before getting the calculation results, we need a correctly finished and parsed JobCalculation. For example this can be a Quantum ESPRESSO pw.x calculation. You can load such a calculation – we’ll call it calc – with the command

from aiida.orm import load_node
calc = load_node(YOURPK)

either in verdi shell, or in a python script (as described here). YOURPK should be substituted by a valid calculation PK in your database.

Using the CalculationResultManager instance

Each JobCalculation has a res attribute that is a CalculationResultManager instance and gives direct access to parsed data. You can access it as

calc.res

To get all the possible keys that were parsed, you can convert the instance into a list. For instance, if you type

print list(calc.res)

you will get something like this:

[u'rho_cutoff', u'energy', u'energy_units', ...]

(the list of keys has been cut for clarity: you will get many more keys).

Once you know which keys have been parsed, you can access the parsed value simply as an attribute of the res CalculationResultManager. For instance, to get the final total energy, you can use

print calc.res.energy

that will print the total energy in units of eV, as also stated in the energy_units key

print calc.res.energy_units

Similarly, you can get any other parsed value, for any code that provides a parser.

Hint

The CalculationResultManager is also integrated with the IPython/verdi shell completion mechanism: if calc is a valid JobCalculation, you can type

calc.res.

and then press the TAB key to see and complete the list of valid parsed properties for the calculation calc.

Node input and output

In the following, we will show the methods to access the input and output nodes of a given node.

Again, we start by loading a node from the database. Unlike before, this can be any type of node. For example, we can load the node with PK 17:

from aiida.orm import load_node
node = load_node(17)

Now, we want to find the nodes which have a direct link to this node. The node has several methods to extract this information: get_outputs(), get_outputs_dict(), get_inputs() and get_inputs_dict(). The most practical way to access this information, especially when working on the verdi shell, is by means of the inp and out attributes.

The inp attribute can be used to list and access the nodes with a direct link to node in input. The names of the input links can be printed by list(node.inp) or interactively by node.inp. + TAB. As an example, suppose that node has an input KpointsData object under the linkname kpoints. The command

node.inp.kpoints

returns the KpointsData object.

Similarly the out attribute can be used to display the names of links in output from node and access these nodes. Suppose that node has an output FolderData with linkname retrieved, then the command

node.out.retrieved

returns the FolderData object.

Note

For the input, there can be only one object for a given linkname. In contrast, there can be more than one output object with the same linkname. For example, a code object can be used by several calculations with the same linkname code. For this reason, we append the string _pk to the linkname, where pk is the PK of the output node. The linkname without _pk still exists and refers to the oldest link.

As an example, imagine that node is a code, which is used by calculation #18 and #19. The linknames shown by node.out are

node.out.  >>
  * code
  * code_18
  * code_19

The attributes node.out.code_18 and node.out.code_19 will return two different calculation objects, and node.out.code will return the older one of the two.

Data types
AiiDA data types

There are a number of data types distributed with AiiDA.

We summarize here the most common, and some useful features/functionalities to work with them.

Most common datatypes

Here follows a short summary of common datatypes already provided with AiiDA. This list is not complete, see also inside aiida.orm.data for the list of all available plugins.

We also mention, when relevant, what is stored in the database (as attributes, so that it can be easily queried e.g. with the QueryBuilder) and what is stored in the file repository (providing access to the file contents, but not efficiently queryable: this is useful e.g. for big data files that do not need to be queried).

For all data types, you can follow the link to the data class to read more about the methods provided, how to access them ...

If you need to work with some specific type of data, first check the list of data types/plugins below; if you don't find what you need, have a look at how to write a new data plugin.

Base types

In the aiida.orm.data.base module you find a number of useful classes that wrap base python datatypes (like Int, Float, Str, ...). These are particularly useful when you need to provide a single parameter to e.g. a workfunction.

Each of these classes can most often be used transparently (e.g. you can sum two Int objects, etc.). If you need to access the bare value and not the whole AiiDA class, use the .value property.

In the same module, there is also a List class to store a list of base data types.
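
A minimal sketch of using the base types in the verdi shell (or after load_dbenv()); the values are arbitrary:

from aiida.orm.data.base import Int, Float, Str

a = Int(2)
b = Int(3)
c = a + b                 # base types can be combined transparently
print c.value             # bare python value: 5

pi = Float(3.1416)
label = Str('test run')
print pi.value, label.value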

ParameterData
  • Class: ParameterData
  • String to pass to the DataFactory(): parameter
  • Aim: store a dictionary of python base types in the database. It can store any dictionary where elements can be a base python type (strings, floats, integers, booleans, None type, datetime objects) and lists or dictionaries of them, at any depth level (e.g. a dictionary where a value is a list of dictionaries of strings and floats).
  • What is stored in the database: all key/values pairs as attributes
  • What is stored in the file repository: —
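
A minimal sketch of creating and storing a ParameterData node (the dictionary content is hypothetical):

from aiida.orm import DataFactory

ParameterData = DataFactory('parameter')

params = ParameterData(dict={
    'energy': -123.45,
    'energy_units': 'eV',
    'convergence': {'threshold': 1.0e-6, 'reached': True},
})
params.store()
print params.get_dict()
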
StructureData
UpfData
  • Class: UpfData
  • String to pass to the DataFactory(): upf
  • Aim: store a pseudopotential in the .UPF format (e.g. used by Quantum ESPRESSO - see also the AiiDA Quantum ESPRESSO plugin)
  • What is stored in the database: the MD5 of the UPF; the element the pseudopotential is associated to
  • What is stored in the file repository: the pseudopotential file
ArrayData
  • Class: ArrayData
  • String to pass to the DataFactory(): array
  • Aim: store generic numeric arrays
  • What is stored in the database: the shape of the arrays and the name of the arrays
  • What is stored in the file repository: the array data in numpy format
TrajectoryData
  • Class: TrajectoryData
  • String to pass to the DataFactory(): array.trajectory
  • Aim: store molecular trajectories (i.e. sequences of structures, where the number of atomic kinds and sites does not change over time). Besides the coordinates, it can also optionally store velocities.
  • What is stored in the database: like ArrayData
  • What is stored in the file repository: the array data in numpy format: cells over time, integer indices over time, atomic positions over time, the list of kinds, ...
KpointsData
  • Class: KpointsData
  • String to pass to the DataFactory(): array.kpoints
  • Aim: store grids of k-points (in reciprocal space, for crystal structures), or explicit lists of k-points (optionally with a weight associated to each one). It can also associate labels with (some of) the points, which is very useful for later plotting band structures (and storing them in BandsData objects).
  • What is stored in the database: like ArrayData
  • What is stored in the file repository: the array data in numpy format
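
A minimal sketch of creating KpointsData nodes, both as a regular mesh and as an explicit list (the values are hypothetical):

from aiida.orm import DataFactory

KpointsData = DataFactory('array.kpoints')

# a regular 4x4x4 mesh
kpoints_mesh = KpointsData()
kpoints_mesh.set_kpoints_mesh([4, 4, 4])

# an explicit list of k-points, with optional weights
kpoints_list = KpointsData()
kpoints_list.set_kpoints([[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]],
                         weights=[0.25, 0.75])
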
BandsData
XyData
  • Class: XyData
  • String to pass to the DataFactory(): array.xy
  • Aim: store data for a 2D (xy) plot
  • What is stored in the database: like ArrayData
  • What is stored in the file repository: the array data in numpy format
FolderData
  • Class: FolderData
  • String to pass to the DataFactory(): folder
  • Aim: store a set of files/folders (with possibly a folder/subfolder structure)
  • What is stored in the database: —
  • What is stored in the file repository: all files and folders
SinglefileData
  • Class: SinglefileData
  • String to pass to the DataFactory(): singlefile
  • Aim: the same as FolderData, but it stores a single file only.
  • What is stored in the database: the filename
  • What is stored in the file repository: the file
RemoteData
  • Class: RemoteData
  • String to pass to the DataFactory(): remote
  • Aim: this basically represents a “symbolic link” to a specific folder on a remote computer. Its main use is to allow users to persist the provenance when e.g. a calculation produces data in a raw/scratch folder, and the whole folder needs to be provided to restart/continue.
  • What is stored in the database: the path of the folder (and the remote computer as a .computer property, not as an attribute)
  • What is stored in the file repository: —
Export data nodes to various formats

Each data node has an export() method that allows you to export the given data node to file in a variety of available formats, e.g. to pass it to visualization software.

The export() method asks for a filename and writes the result to file. More than one file may be written (for example, if you produce a gnuplot script, the data will typically be in a separate .dat file). The return value of the function is the list of files that have been created.

The list of export formats depends on the specific Data plugin. The export format is typically inferred from the file extension, but if this is not possible (or you want to specify a given format), you can pass an additional fileformat parameter to export(). The list of all valid export formats can be obtained by calling the Data.get_export_formats() method, which returns a list of strings with all valid formats.

If you don't want to export directly to a file, but simply want to get the content of the file as a string, you can call the _exportstring() method, passing also a fileformat parameter. The return value is a tuple of length 2: the first element is a string with the content of the “main” file, while the second is a dictionary (possibly empty) with the additional files that should be created/needed: the keys are filenames, and the values are the file contents.
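
A minimal sketch of the export methods described above, assuming a StructureData node with a hypothetical PK:

from aiida.orm import load_node

structure = load_node(1234)            # hypothetical PK of a StructureData node

print structure.get_export_formats()   # list of valid format strings

# write to file; the format is inferred from the extension
written_files = structure.export('structure.xsf')
print written_files

# get the content as a string instead, without writing to disk
content, extra_files = structure._exportstring('xyz')
print content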

Exporting from the command line

Most data types expose the export functionality on the command line.

For instance, if you want to export a StructureData object with given PK, you can run on the command line:

verdi data structure export PK --format=FORMAT_NAME

that will export the node with the given PK in the format FORMAT_NAME. This will print the file content on screen; a few command line options allow you to change this behaviour:

  • -o FILENAME asks to write directly to a file named FILENAME. This is compulsory in some cases, e.g. if more than one file needs to be created.
  • -y asks to overwrite the file(s), if present. If not specified, the call will fail if any of the files to create are present.

Additional options (often format-specific) exist, and can be discovered passing the -h option to the command line. For instance:

  • verdi data structure export accepts a number of formats including xsf, cif, xyz and tcod, and additional parameters like --no-reduce-symmetry (to be used in combination with the tcod format to tell AiiDA not to try to reduce the symmetry in the output CIF file), etc.
  • verdi data trajectory export accepts a number of formats including xsf, cif and tcod, and additional parameters like --step NUM (to choose to export only a given trajectory step).
  • verdi data bands export accepts a number of formats (see the list below) and additional parameters like --prettify-format FORMATNAME (see the valid prettify formats below), or --y-min-lim, --y-max-lim to specify the y-axis limits.
Export formats for specific Data types

As the formats are specific to the data types, here is a list of some of the export formats available for some of the AiiDA data classes.

StructureData

The following export formats are available:

  • xsf (format supported by e.g. XCrySDen and other visualization software; supports periodic cells)
  • xyz (classical xyz format; it does not typically support periodic cells, even if the cell is indicated in the comment line)
  • cif (export to CIF format, without symmetry reduction, i.e. always storing the structure as P1 symmetry)
  • tcod (extension to the CIF format; it supports symmetry reduction and typically adds to the CIF file a number of additional pieces of information, including the full provenance of the crystal structure node)
TrajectoryData

The following export formats are available:

  • xsf (format supported by e.g. XCrySDen and other visualization software; supports periodic cells)
  • cif (export to CIF format, without symmetry reduction, i.e. always storing the structures as P1 symmetry)
  • tcod (extension to the CIF format; it supports symmetry reduction and typically adds to the CIF file a number of additional pieces of information, including the full provenance of the trajectory node)
BandsData

The following export formats are available:

  • agr: export an Xmgrace .agr file with the band plot
  • agr_batch: export an Xmgrace batch file together with an independent .dat file
  • dat_blocks: export a .dat file, where each line has a data point (xy) and bands are separated in blocks with empty lines
  • dat_multicolumn: export a .dat file, where each line has all the values for a given x coordinate: x y1 y2 y3 y4 ... (x being a linear coordinate along the band path and yN being the band energies)
  • gnuplot: export a gnuplot file, together with a .dat file
  • json: export a json file with the bands divided into segments
  • mpl_singlefile: export a python file that when executed shows a plot using the matplotlib module. All data is included in the same python file as a multiline string containing the data in json format.
  • mpl_withjson: As above, but the json data is stored separately in a different file
  • mpl_pdf: As above, but after creating the .py file it runs it to export the band structure in a PDF file (vectorial). NOTE: it requires that you have the python matplotlib module installed. If use_latex is true, it requires that you have LaTeX installed on your system to typeset the labels, as well as the dvipng binary.
  • mpl_png: As above, but after creating the .py file it runs it to export the band structure in a PNG file (raster image). NOTE: this format has the same dependencies as the mpl_pdf format above.
Label prettifiers

AiiDA provides a number of functions to “prettify” the labels of band structures (if labels are present in the data node), i.e., replace GAMMA with \Gamma or K_1 with K_{1} for instance. This makes sense for some output formats (e.g. Xmgrace, Gnuplot, matplotlib).

The prettifier functions are defined as methods of the Prettifier class and can be obtained by calling Prettifier.get_prettifiers().

The prettifiers should be chosen depending on two aspects:

  1. how the raw labels are stored in the database. Two formats currently exist: seekpath, as used in the seekpath module, where Greek letters are written explicitly (e.g. GAMMA) and underscores indicate a subscript (K_1); and the “old” simple format, where \Gamma is indicated with G and there is no underscore symbol.
  2. depending on the output format: xmgrace has a specific syntax for Greek letters and subscripts, matplotlib uses LaTeX syntax, etc.

Most export formats already decide which prettifier is best to use, but if you need to change it, you can do so by passing the prettify_format parameter to the export() method (a short sketch is given after the list below). Valid prettifiers include:

  • agr_seekpath: format for Xmgrace, using seekpath raw label syntax
  • agr_simple: format for Xmgrace, using simple raw label syntax
  • latex_simple: format for LaTeX (including dollar signs), using simple raw label syntax
  • latex_seekpath: format for LaTeX (including dollar signs), using seekpath raw label syntax
  • gnuplot_simple: format for GNUPlot (Unicode for Greek letters, LaTeX syntax without dollar signs for underscores), using simple raw label syntax
  • gnuplot_seekpath: format for GNUPlot (Unicode for Greek letters, LaTeX syntax without dollar signs for underscores), using seekpath raw label syntax
  • pass: no-op prettifier: leaves all strings unchanged from their raw value
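
As an illustration, a minimal sketch of passing a prettifier to export(), assuming bands is a BandsData node with labelled high-symmetry points; the filenames are arbitrary, and not all formats make use of label prettification:

# export to Xmgrace, prettifying seekpath-style raw labels
bands.export('bands.agr', prettify_format='agr_seekpath')

# export a matplotlib script, using LaTeX syntax for the labels
bands.export('bands.py', fileformat='mpl_singlefile',
             prettify_format='latex_seekpath')
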
Schedulers

As described in the section about calculations, JobCalculation instances are submitted by the daemon to an external scheduler. For this functionality to work, AiiDA needs to be able to interact with these schedulers. Interfaces have been written for some of the most used schedulers.

Supported schedulers

The list below describes the supported schedulers, i.e. the batch job schedulers that manage the job queues and execution on any given computer.

PBSPro

The PBSPro scheduler is supported (and it has been tested with version 12.1).

All the main features are supported with this scheduler.

The JobResource class to be used when setting the job resources is the NodeNumberJobResource (PBS-like)

SLURM

The SLURM scheduler is supported (and it has been tested with version 2.5.4).

All the main features are supported with this scheduler.

The JobResource class to be used when setting the job resources is the NodeNumberJobResource (PBS-like)

SGE

The SGE scheduler (Sun Grid Engine, now called Oracle Grid Engine) is supported (and it has been tested with version GE 6.2u3), together with some of the main variants/forks.

All the main features are supported with this scheduler.

The JobResource class to be used when setting the job resources is the ParEnvJobResource (SGE-like)

LSF

The IBM LSF scheduler is supported and has been tested with version 9.1.3 on the CERN lxplus cluster.

PBS/Torque & Loadleveler

PBS/Torque and Loadleveler are not fully supported yet, even though their support is one of our top priorities. For the moment, you can try the PBSPro plugin for PBS/Torque as well; it may work, even though there will probably be some small issues.

Direct execution (bypassing schedulers)

The direct scheduler, to be used mainly for debugging, is an implementation of a scheduler plugin that does not require a real scheduler to be installed; instead, it directly executes a command, puts it in the background, and checks its process ID (PID) to discover whether the execution has completed.

Warning

The direct execution mode is very fragile. Currently, it spawns a separate Bash shell to execute a job and tracks each shell by its process ID (PID). This poses the following problems:

  • PID numbering is reset at every reboot;
  • PID numbering differs from machine to machine, so direct execution is not possible on multi-machine clusters that redirect each SSH login to a different node in round-robin fashion;
  • there is no real queueing; hence, all started calculations will run in parallel.

Warning

Direct execution bypasses schedulers, so it should be used with care in order not to disturb the functioning of machines.

All the main features are supported with this scheduler.

The JobResource class to be used when setting the job resources is the NodeNumberJobResource (PBS-like)

Job resources

When asking a scheduler to allocate some nodes/machines for a given job, we have to specify some job resources, such as the number of required nodes or the number of MPI processes per node.

Unfortunately, the way of specifying this information is different on different clusters. In AiiDA, this is implemented in different subclasses of the aiida.scheduler.datastructures.JobResource class. The subclass that should be used is given by the scheduler, as described in the previous section.

The interfaces of these subclasses are not all exactly the same. Instead, specifying the resources is similar to writing a scheduler script. All classes define at least one method, get_tot_num_mpiprocs, that returns the total number of MPI processes requested.

In the following, the different JobResource subclasses are described:

Note

you can manually load a specific JobResource subclass by directly importing it, e.g.

from aiida.scheduler.datastructures import NodeNumberJobResource

However, in general, you will pass the fields to set directly to the set_resources method of a JobCalculation object. For instance:

calc = JobCalculation(computer=...) # select here a given computer configured
                                    # in AiiDA

# This assumes that the computer is configured to use a scheduler with
# job resources of type NodeNumberJobResource
calc.set_resources({"num_machines": 4, "num_mpiprocs_per_machine": 16})
NodeNumberJobResource (PBS-like)

This is the way of specifying the job resources in PBS and SLURM. The class is aiida.scheduler.datastructures.NodeNumberJobResource.

Once an instance of the class is obtained, you have the following fields that you can set:

  • res.num_machines: specify the number of machines (also called nodes) on which the code should run
  • res.num_mpiprocs_per_machine: number of MPI processes to use on each machine
  • res.tot_num_mpiprocs: the total number of MPI processes that this job is requesting
  • res.num_cores_per_machine: specify the number of cores to use on each machine
  • res.num_cores_per_mpiproc: specify the number of cores to run each MPI process

Note that you need to specify only two among the first three fields above, for instance:

res = NodeNumberJobResource()
res.num_machines = 4
res.num_mpiprocs_per_machine = 16

asks the scheduler to allocate 4 machines with 16 MPI processes on each machine, i.e. a total of 4*16 = 64 MPI processes.

The same can be achieved passing the fields directly to the constructor:

res = NodeNumberJobResource(num_machines=4, num_mpiprocs_per_machine=16)

or, even better, directly calling the set_resources method of the JobCalculation class (assuming here that calc is your calculation object):

calc.set_resources({"num_machines": 4, "num_mpiprocs_per_machine": 16})

Note

If you specify res.num_machines, res.num_mpiprocs_per_machine, and res.tot_num_mpiprocs fields (not recommended), make sure that they satisfy:

res.num_machines * res.num_mpiprocs_per_machine = res.tot_num_mpiprocs

Moreover, if you specify res.tot_num_mpiprocs, make sure that this is a multiple of res.num_machines and/or res.num_mpiprocs_per_machine.

Note

When creating a new computer, you will be asked for a default_mpiprocs_per_machine. If you specify it, you can avoid specifying num_mpiprocs_per_machine when creating the resources for that computer, and the default value will be used.

Of course, all the requirements between num_machines, num_mpiprocs_per_machine and tot_num_mpiprocs still apply.

Moreover, you can explicitly specify num_mpiprocs_per_machine if you want to use a value different from the default one.

The num_cores_per_machine and num_cores_per_mpiproc fields are optional. If you specify num_mpiprocs_per_machine and num_cores_per_machine fields, make sure that:

res.num_cores_per_mpiproc * res.num_mpiprocs_per_machine = res.num_cores_per_machine

If you want to specify a single value among num_mpiprocs_per_machine and num_cores_per_machine, please make sure that res.num_cores_per_machine is a multiple of res.num_cores_per_mpiproc and/or res.num_mpiprocs_per_machine.

Note

In PBSPro, the num_mpiprocs_per_machine and num_cores_per_machine fields are used for mpiprocs and ppn respectively.

Note

In Torque, the num_mpiprocs_per_machine field is used for ppn unless num_cores_per_machine is specified.

ParEnvJobResource (SGE-like)

In SGE and similar schedulers, one has to specify a parallel environment and the total number of CPUs requested. The class is aiida.scheduler.datastructures.ParEnvJobResource.

Once an instance of the class is obtained, you have the following fields that you can set:

  • res.parallel_env: specify the parallel environment in which you want to run your job (a string)
  • res.tot_num_mpiprocs: the total number of MPI processes that this job is requesting

Remember to always specify both fields. No checks are done on the consistency between the specified parallel environment and the total number of MPI processes requested (for instance, some parallel environments may have been configured by your cluster administrator to run on a single machine). It is your responsibility to make sure that the information is valid, otherwise the submission will fail.

Some examples:

  • setting the fields one by one:

    res = ParEnvJobResource()
    res.parallel_env = 'mpi'
    res.tot_num_mpiprocs = 64
    
  • setting the fields directly in the class constructor:

    res = ParEnvJobResource(parallel_env='mpi', tot_num_mpiprocs=64)
    
  • even better, directly calling the set_resources method of the JobCalculation class (assuming here that calc is your calculation object):

    calc.set_resources({"parallel_env": 'mpi', "tot_num_mpiprocs": 64})
    
Querying data

The advantage of storing information in a database is that questions can be asked on the data, and an answer can be rapidly provided.

Here we describe different approaches to query the data in AiiDA.

Note

This section is still only a stub and will be significantly improved in the next versions.

Directly querying in Django

If you know how AiiDA stores the data internally in the database, you can use Django directly to query the database (or even use SQL commands directly, if you really feel the urge to do so). Documentation on how queries work in Django can be found in the official Django documentation. The models can be found in aiida.backends.djsite.db.models and are directly accessible as models in the verdi shell and in scripts executed with verdi run.
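As a minimal sketch (to be run in the verdi shell, where models is preloaded; the type string is only an example), you could count all stored structures directly with the Django ORM:

# Count all DbNode rows whose type column identifies them as StructureData
n_structures = models.DbNode.objects.filter(
    type__startswith='data.structure.StructureData.').count()
print n_structures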

Directly querying in SQLAlchemy

Check out the documentation at http://www.sqlalchemy.org/. The models are in aiida.backends.sqlalchemy.models.
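A comparable sketch with SQLAlchemy, assuming you already have an open SQLAlchemy session (here called session) on the AiiDA database:

from aiida.backends.sqlalchemy.models.node import DbNode

# Count all DbNode rows whose type column starts with 'data.structure.'
n_structures = session.query(DbNode).filter(
    DbNode.type.like('data.structure.%')).count()
print n_structures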

Using the QueryBuilder
Introduction

This section describes the use of the QueryBuilder, which is meant to help you query your AiiDA database independent of the employed backend. Before jumping into the specifics, let’s discuss what you should be clear about before writing a query:

  • You should know what you want to query for. In database-speak, you need to tell the backend what to project. For example, you might be interested in the label of a calculation and the pks of all its outputs.
  • In many use-cases, you will query for relationships between entities that are connected in a graph-like fashion, with links as edges and nodes as vertices. You have to know the relationships between these entities. A Node can be either input or output of another Node, but also an ancestor or a descendant.
  • In almost all cases, you will be interested in a subset of all possible entities that could be returned based on the joins between the entities of your graph. In other words, you need to have an idea of how to filter the results.

If you are clear about what you want and how you can get it, you will have to provide this information to the QueryBuilder, which will build an SQL query for you. There is more than one possible API that you can use:

  1. The appender-method
  2. Using the queryhelp

What you will use depends on the specific use case. The functionalities are the same, so it’s up to you what to use.

The appender method
Introduction

Let’s first discuss the appender-method using some concrete examples. The first thing to know is how to choose the entities that you want to query:

from aiida.orm.calculation.job import JobCalculation  # already preloaded in the verdi shell
from aiida.orm.querybuilder import QueryBuilder

qb = QueryBuilder()       # Instantiating an instance. One instance -> one query
qb.append(JobCalculation) # Setting the first vertex of the path
Retrieving results

So, let’s suppose that’s what we want to query for (all job calculations in the database). The question is how to get the results from the query:

from aiida.orm.querybuilder import QueryBuilder
qb = QueryBuilder()                 # Instantiating instance
qb.append(JobCalculation)           # Setting first vertex of path

first_row = qb.first()              # Returns a list (!)
                                    # of the results of the first row

all_results_d = qb.dict()           # Returns all results as
                                    # a list of dictionaries

all_results_l = qb.all()            # Returns a list of lists


# Also you can use generators:
all_res_d_gen = qb.iterdict()       # Return a generator of dictionaries
                                    # of all results
all_res_l_gen = qb.iterall()        # Returns a generator of lists

Note

Generators are useful if you have to retrieve a very large (>10000) number of results. This will retrieve the data in batches, and you can start working with the data before the query has completely finished. Be aware that if using generators, you should never commit (store) anything while iterating. The query is still going on, and might be compromised by new data in the database.
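For example, a sketch of processing a large result set in a streaming fashion:

qb = QueryBuilder()
qb.append(JobCalculation, project=['id', 'ctime'])

for pk, ctime in qb.iterall():      # rows are fetched in batches
    # analyze each row here, but do not store new nodes while iterating
    print pk, ctime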

Filtering

Since we now know how to set an entity, we can start to filter by properties of that entity. Suppose we do not want all JobCalculations, but only the ones in state ‘FINISHED’:

qb = QueryBuilder()                 # An empty QueryBuilder instance
qb.append(
    JobCalculation,                 # I am appending a JobCalculation
    filters={                       # Specifying the filters:
        'state':{'==':'FINISHED'},  # the calculation has to have finished
    },
)

How can we combine multiple filters? Suppose you are interested in all calculations in your database that are in state ‘FINISHED’ and were created in the last n days:

from datetime import timedelta
from aiida.utils import timezone

n = 7                               # illustrative value: look at the last 7 days
now = timezone.now()
time_n_days_ago = now - timedelta(days=n)

qb = QueryBuilder()                 # An empty QueryBuilder instance
qb.append(
    JobCalculation,                 # I am appending a JobCalculation
    filters={                       # Specifying the filters:
        'state':{'==':'FINISHED'},  # the calculation has to have finished AND
        'ctime':{'>':time_n_days_ago}     # created in the last n days
    },
)
result = qb.dict()                  # all results as a list of dictionaries

Let’s go through the above example. We have instantiated a QueryBuilder instance, appended a JobCalculation (a remote calculation) to its path, and specified that we are only interested in calculations that have finished and that were created in the last n days.

What if we want calculations that have finished or were created in the last n days:

qb = QueryBuilder()
qb.append(
    JobCalculation,
    filters={
        'or':[
            {'state':{'==':'FINISHED'}},
            {'ctime':{'>': now - timedelta(days=n)}}
        ]
    },
)
res = qb.dict()

If we had written and instead of or, we would have created the exact same query as before, because and is the default behavior if you attach several filters. What if you want calculations in state ‘FINISHED’ or ‘RETRIEVING’? This is shown in the next example:

qb = QueryBuilder()
qb.append(
    JobCalculation,
    filters={
        'state':{'in':['FINISHED', 'RETRIEVING']}
    },
)
res = qb.all()

In order to negate a filter, that is to apply the not operator, precede the filter keyword with an exclamation mark. So, to ask for all calculations that are not in ‘FINISHED’ or ‘RETRIEVING’:

qb = QueryBuilder()
qb.append(
    JobCalculation,
    filters={
        'state':{'!in':['FINISHED', 'RETRIEVING']}
    },
)
res = qb.all()

Note

The above rule applies strictly! You check non-equality with !==, since this is the equality operator (==) with a negation prepended.
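For instance, to select all job calculations except the one with pk 123 (the pk is only an example):

qb = QueryBuilder()
qb.append(
    JobCalculation,
    filters={'id': {'!==': 123}},   # everything but pk 123
)
res = qb.all()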

This is a list of all implemented operators:

Operator      Datatype                 Example                                  Explanation
==            all                      'id': {'==': 123}                        checks equality
in            all                      'name': {'in': ['foo', 'bar']}           equal to any element of the list
>, <, <=, >=  floats, integers, dates  'ctime': {'<': datetime(2016, 3, 3)}     lower/greater (or equal)
like          strings                  'name': {'like': 'lovely_calc%'}         substring match (% is the wildcard)
ilike         strings                  'name': {'ilike': 'loVely_Calc%'}        case-insensitive 'like'
or            list of expressions      'id': {'or': [{'<': 12}, {'==': 199}]}   true if any expression matches
and           list of expressions      'id': {'and': [{'<': 12}, {'>': 1}]}     true if all expressions match
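Several of these operators can be combined in a single filter dictionary; as an illustrative sketch (the label pattern and pk bounds are only examples):

qb = QueryBuilder()
qb.append(
    JobCalculation,
    filters={
        'label': {'like': 'relax_%'},              # label starting with 'relax_'
        'id': {'and': [{'>': 100}, {'<': 2000}]},  # pk between 100 and 2000
    },
)
res = qb.all()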

There are also some advanced operators:

has_key (dicts)
    'attributes.mykey': {'has_key': 'foo'}
    'extras': {'has_key': 'my_extra'}
    Check that a dictionary (typically stored in the attributes or in the extras) has a given key. This can also be used to check whether a given attribute or extra exists.

of_type (any)
    'attributes.mykey': {'of_type': 'bool'}
    Check that an attribute or an extra is of a given type. Valid types are: object (meaning a dictionary), array (meaning a list), string, number (both for integers and floats), boolean or null (currently implemented only in the SQLA backend).

of_length (lists)
    'attributes.mylist': {'of_length': 4}
    Check that a list (typically stored in the attributes or in the extras) has a given length (currently implemented only in the SQLA backend).

shorter (lists)
    'attributes.mylist': {'shorter': 4}
    Check that a list (typically stored in the attributes or in the extras) has a length shorter than the specified value (currently implemented only in the SQLA backend).

longer (lists)
    'attributes.mylist': {'longer': 4}
    Check that a list (typically stored in the attributes or in the extras) has a length longer than the specified value (currently implemented only in the SQLA backend).

contains (lists)
    'attributes.mykey': {'contains': 'a'}
    Check that a list (typically stored in the attributes or in the extras) contains a specific element or value (currently implemented only in the SQLA backend).
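For example, to select ParameterData nodes whose attributes contain an ‘energy’ key and to project its value (a sketch; as noted above, these operators are currently implemented only in the SQLA backend):

qb = QueryBuilder()
qb.append(
    ParameterData,
    filters={'attributes': {'has_key': 'energy'}},
    project=['attributes.energy'],
)
res = qb.all()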

This showed you how to ‘filter’ by properties of a node. So far we can do that only for a single node in the database.

Joining entities

But sometimes we need to query relationships in a graph-like database. Let’s join a node to its output, e.g. a StructureData node and a JobCalculation (as its output):

qb = QueryBuilder()
qb.append(StructureData, tag='structure')
qb.append(JobCalculation, output_of='structure')

In the above example we are querying structures and calculations, with the predicate that the calculation is an output of the structure (which is the same as saying that the structure is an input to the calculation). We first appended StructureData to the path. So that we can refer to that vertex later, we tag it with a unique keyword of our choice, which can be used only once. When we append another vertex to the path, we specify its relationship to a previous entity by using one of the keywords in the table below and, as a value, the tag of the vertex it has a relationship with. These are the relationships that entities in AiiDA can have:

Entity from Entity to Relationship Explanation
Node Node input_of One node as input of another node
Node Node output_of One node as output of another node
Node Node ancestor_of One node as the ancestor of another node (Path)
Node Node descendant_of One node as descendant of another node (Path)
Node Group group_of The group of a node
Group Node member_of The node is a member of a group
Node Computer computer_of The computer of a node
Computer Node has_computer The node of a computer
Node User creator_of The creator of a node is a user
User Node created_by The node was created by a user
User Group belongs_to The group belongs to a user
Group User owner_of The user is the owner of a group

Some more examples:

# StructureData as an input of a job calculation
qb = QueryBuilder()
qb.append(JobCalculation, tag='calc')
qb.append(StructureData, input_of='calc')

# StructureData and ParameterData as inputs to a calculation
qb = QueryBuilder()
qb.append(JobCalculation, tag='calc')
qb.append(StructureData, input_of='calc')
qb.append(ParameterData, input_of='calc')

# Filtering the remote data instance by the computer it ran on (name)
qb = QueryBuilder()
qb.append(RemoteData, tag='remote')
qb.append(Computer, computer_of='remote', filters={'name':{'==':'mycomputer'}})

# Find all descendants of a structure with a certain uuid
qb = QueryBuilder()
qb.append(StructureData, tag='structure', filters={'uuid':{'==':myuuid}})
qb.append(Node, descendant_of='structure')

The above QueryBuilder will join a structure to all its descendants via the transitive closure table.

Defining the projections

But what will the query return exactly? If you try any of the examples, you will find that the instances of the last appended vertex are returned! That is the default behavior if nothing else is specified. We usually do not want everything returned, because it might lead to a big overhead. You need to specify what you want to return using the keyword project.

Let’s stick to the previous example:

# Find all descendants of a structure with a certain uuid
qb = QueryBuilder()
qb.append(
    StructureData,
    tag='structure',
    filters={'uuid':{'==':myuuid}},
)
qb.append(
    Node,
    descendant_of='structure',
    project=['type', 'uuid'],  # returns type (string) and uuid (string)
)

Executing a query like the one above, but projecting the type and the id, returns those values for all Nodes that are descendants of the structure:

qb = QueryBuilder()
qb.append(
    StructureData,
    tag='structure',
    filters={'uuid':{'==':myuuid}},
)
qb.append(
    Node,
    descendant_of='structure',
    project=['type', 'id'],  # returns type (string) and id (integer)
    tag='descendant'
)

# Return the dictionaries:
print "\n\nqb.iterdict()"
for d in qb.iterdict():
    print '>>>', d

# Return the lists:
print "\n\nqb.iterall()"
for l in qb.iterall():
    print '>>>', l

# Return the first result:
print "\n\nqb.first()"
print '>>>', qb.first()

results in the following output:

qb.iterdict()
>>> {'descendant': {'type': u'calculation.job.quantumespresso.pw.PwCalculation.', 'id': 7716}}
>>> {'descendant': {'type': u'data.remote.RemoteData.', 'id': 8510}}
>>> {'descendant': {'type': u'data.folder.FolderData.', 'id': 9090}}
>>> {'descendant': {'type': u'data.array.ArrayData.', 'id': 9091}}
>>> {'descendant': {'type': u'data.array.trajectory.TrajectoryData.', 'id': 9092}}
>>> {'descendant': {'type': u'data.parameter.ParameterData.', 'id': 9093}}


qb.iterall()
>>> [u'calculation.job.quantumespresso.pw.PwCalculation.', 7716]
>>> [u'data.remote.RemoteData.', 8510]
>>> [u'data.folder.FolderData.', 9090]
>>> [u'data.array.ArrayData.', 9091]
>>> [u'data.array.trajectory.TrajectoryData.', 9092]
>>> [u'data.parameter.ParameterData.', 9093]


qb.first()
>>> [u'calculation.job.quantumespresso.pw.PwCalculation.', 7716]

Asking only for the properties that you are interested in can result in much faster queries. If you want the AiiDA ORM instance instead, add ‘*’ to your list of projections:

qb = QueryBuilder()
qb.append(
    StructureData,
    tag='structure',
    filters={'uuid':{'==':myuuid}},
)
qb.append(
    Node,
    descendant_of='structure',
    project=['*'],      # returns the Aiida ORM instance
    tag='desc'
)

# Return the dictionaries:
print "\n\nqb.iterdict()"
for d in qb.iterdict():
    print '>>>', d

# Return the lists:
print "\n\nqb.iterall()"
for l in qb.iterall():
    print '>>>', l

# Return the first result:
print "\n\nqb.first()"
print '>>>', qb.first()

Output:

qb.iterdict()
>>> {'desc': {'*': <PwCalculation: uuid: da720712-3ca3-490b-abf4-b0fb3174322e (pk: 7716)>}}
>>> {'desc': {'*': <RemoteData: uuid: 13a378f8-91fa-42c7-8d7a-e469bbf02e2d (pk: 8510)>}}
>>> {'desc': {'*': <FolderData: uuid: 91d5a5e8-6b88-4e43-9652-9efda4adb4ce (pk: 9090)>}}
>>> {'desc': {'*': <ArrayData: uuid: 7c34c219-f400-42aa-8bf2-ee36c7c1dd40 (pk: 9091)>}}
>>> {'desc': {'*': <TrajectoryData: uuid: 09288a5f-dba5-4558-b115-1209013b6b32 (pk: 9092)>}}
>>> {'desc': {'*': <ParameterData: uuid: 371677e1-d7d4-4f2e-8a41-594aace02759 (pk: 9093)>}}


qb.iterall()
>>> [<PwCalculation: uuid: da720712-3ca3-490b-abf4-b0fb3174322e (pk: 7716)>]
>>> [<RemoteData: uuid: 13a378f8-91fa-42c7-8d7a-e469bbf02e2d (pk: 8510)>]
>>> [<FolderData: uuid: 91d5a5e8-6b88-4e43-9652-9efda4adb4ce (pk: 9090)>]
>>> [<ArrayData: uuid: 7c34c219-f400-42aa-8bf2-ee36c7c1dd40 (pk: 9091)>]
>>> [<TrajectoryData: uuid: 09288a5f-dba5-4558-b115-1209013b6b32 (pk: 9092)>]
>>> [<ParameterData: uuid: 371677e1-d7d4-4f2e-8a41-594aace02759 (pk: 9093)>]


qb.first()
>>> [<PwCalculation: uuid: da720712-3ca3-490b-abf4-b0fb3174322e (pk: 7716)>]

Note

Be aware that, for consistency, QueryBuilder.all / iterall always returns a list of lists, and first always returns a list, even if you project only one entity!

If you are not sure which keys to ask for, you can project with ‘**’, and the QueryBuilder instance will return all column properties:

qb = QueryBuilder()
qb.append(
    StructureData,
    project=['**']
)

Output:

qb.limit(1).dict()
>>> {'StructureData': {
        u'user_id': 2,
        u'description': u'',
        u'ctime': datetime.datetime(2016, 2, 3, 18, 20, 17, 88239),
        u'label': u'',
        u'mtime': datetime.datetime(2016, 2, 3, 18, 20, 17, 116627),
        u'id': 3028,
        u'dbcomputer_id': None,
        u'nodeversion': 1,
        u'type': u'data.structure.StructureData.',
        u'public': False,
        u'uuid': u'93c0db51-8a39-4a0d-b14d-5a50e40a2cc4'
    }}
Attributes and extras

You should know by now that you can define additional properties of a node in its attributes and extras. There will be many cases where you will want to either filter or project on those values. The following example gives us the PwCalculations for which the wavefunction cutoff has a value above 30.0 Ry:

qb = QueryBuilder()
qb.append(PwCalculation, project=['*'], tag='calc')
qb.append(
    ParameterData,
    input_of='calc',
    filters={'attributes.SYSTEM.ecutwfc':{'>':30.0}},
    project=[
        'attributes.SYSTEM.ecutwfc',
        'attributes.SYSTEM.ecutrho',
    ]
)

The above example filters by a certain attribute. Notice how you expand into the dictionary using the dot (.). The same works for the extras.

Note

Comparisons in the attributes (extras) are also implicitly done by type.

Let’s do a last example. Are you familiar with the Quantum ESPRESSO PWscf tutorial? Great, because this will be our use case here (if not, you can find it in the documentation of the aiida-quantumespresso package). We will query for calculations that were run on a certain structure (mystructure) and that fulfill certain requirements, such as a wavefunction cutoff above 30.0 Ry. In our case, we have a structure (an instance of StructureData) and an instance of ParameterData that are both inputs to a PwCalculation. You need to tell the QueryBuilder that:

qb = QueryBuilder()
qb.append(
    StructureData,
    filters={'uuid':{'==':mystructure.uuid}},
    tag='structure'
)
qb.append(
    PwCalculation,
    output_of='structure',
    project=['*'],
    tag='calc'
)
qb.append(
    ParameterData,
    filters={'attributes.SYSTEM.ecutwfc':{'>':30.0}},
    input_of='calc',
    tag='params'
)
Cheats

A few cheats to save some typing:

  • The default edge specification, if no keyword is provided, is always output_of the previous vertex.
  • Equality filters (‘==’) can be shortened, as will be shown below.
  • Tags are not necessary; you can simply use the class to refer to a previous vertex. This works as long as the same AiiDA class is not used twice in the path.

A shorter version of the previous example:

qb = QueryBuilder()
qb.append(
    StructureData,
    filters={'uuid':mystructure.uuid},
)
qb.append(
    PwCalculation,
    project='*',
)
qb.append(
    ParameterData,
    filters={'attributes.SYSTEM.ecutwfc':{'>':30.0}},
    input_of=PwCalculation
)
Advanced usage

Let’s proceed to some more advanced stuff. If you’ve understood everything so far you’re in good shape to query the database, so you can skip the rest if you want.

Working with edges

Another feature that had to be added is the ability to apply projections, filters and tags to the edges of the graph, that is to say the links or paths between nodes. It works the same way as for vertices, except that the keyword is prefixed by ‘edge_’. Let’s take the above example, but put a filter on the label of the link and also project that label:

qb = QueryBuilder()
qb.append(
        JobCalculation,
        filters={'ctime':{'>': now - timedelta(days=3)}},
        project={'id':{'func':'count'}}
    )
qb.append(
        ParameterData,
        filters={'attributes.energy':{'>':-5.0}},
        edge_filters={'label':{'like':'output_%'}},
        edge_project='label'
     )
Ordering results

You can also order by properties of the node, although ordering by attributes or extras is not implemented yet. Assuming you want to order the above example by the time of the calculations:

qb = QueryBuilder()
qb.append(
        JobCalculation,
        project=['*']
    )
qb.append(
        ParameterData,
        filters={'attributes.energy':{'>':-5.0}},
     )

qb.order_by({JobCalculation:{'ctime':'asc'}}) # 'asc' or 'desc' (ascending/descending)
Limiting the number of results

You can also limit the number of rows returned with the method limit:

qb = QueryBuilder()
qb.append(
    JobCalculation,
    filters={'ctime':{'>': now - timedelta(days=3)}},
    project=['*']
)
qb.append(
    ParameterData,
    filters={'attributes.energy':{'>':-5.0}},
 )

# order by time descending
qb.order_by({JobCalculation:{'ctime':'desc'}})

# Limit to results to the first 10 results:
qb.limit(10)

The above query returns the latest 10 calculations that produced a final energy above -5.0.

The queryhelp

As mentioned above, there are two possibilities to tell the QueryBuilder what to do. The second uses one big dictionary that we will call the queryhelp in the following. It has the same functionality as the appender method, but you can save this dictionary as JSON or in the database and use it over and over. Using the queryhelp, you have to specify the path, the filters and the projections beforehand and instantiate the QueryBuilder with that dictionary:

qb = QueryBuilder(**queryhelp)
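Since the queryhelp is a plain dictionary, it can be stored and reused later; a minimal sketch, assuming the queryhelp contains only JSON-serializable entries (e.g. 'type' strings rather than classes):

import json

queryhelp = {
    'path': [{'type': 'data.'}],
    'limit': 10,
}

# Save the query specification ...
with open('my_query.json', 'w') as f:
    json.dump(queryhelp, f)

# ... and rebuild the same query later
with open('my_query.json') as f:
    qb = QueryBuilder(**json.load(f))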

What do you have to specify?

  • Specifying the path: here, the user specifies the path along which to join tables as a list, each list item being a vertex of the path. You can define a vertex in two ways: the first is to give the AiiDA class:

    queryhelp = {
        'path':[Data]
    }
    
    # or  (better)
    
    queryhelp = {
        'path':[
            {'cls': Data}
        ]
    }
    

    Another way is to give the polymorphic identity of this class, in our case stored in type:

    queryhelp = {
        'path':[
            {'type':"data."}
        ]
    }
    

    Note

    In AiiDA, polymorphism is not strictly enforced, but is handled through the type specification. Type discrimination when querying is achieved by attaching a filter on the type every time a subclass of Node is given.

    Each vertex has to have a unique tag. If not given, the tag is chosen to be equal to the name of the class. This will not work if the user chooses the same class twice; in this case the tag has to be provided explicitly:

    queryhelp = {
        'path':[
            {
                'cls':Node,
                'tag':'node_1'
            },
            {
                'cls':Node,
                'tag':'node_2'
            }
        ]
    }
    

    There also has to be some information on the edges, in order to join correctly. There are several redundant ways this can be done:

    • You can specify that this node is an input or output of another node preceding the current one in the list. That other node can be specified by an integer or the class or type. The following examples are all valid joining instructions, assuming there is a structure defined at index 2 of the path with tag “struc1”:

      edge_specification = queryhelp['path'][3]
      edge_specification['output_of'] = 2
      edge_specification['output_of'] = StructureData
      edge_specification['output_of'] = 'struc1'
      edge_specification['input_of']  = 2
      edge_specification['input_of']  = StructureData
      edge_specification['input_of']  = 'struc1'
      
    • queryhelp_item[‘direction’] = integer

      If none of the above specs ("input_of", "output_of") is given, the key "direction" is looked for. Directions are defined as distances in the tree: 1 is defined as one step down the tree along a link. This means that a direction of 1 joins the node specified in this dictionary, as an output, to the node specified in the previous list item. Direction defaults to 1, which is why, if nothing is specified, each node is joined to the previous one as an output by default. A minus sign reverses the direction of the link. The absolute value of the direction defines which list item to join to with respect to your own position in the list: an absolute value of 1 joins to the item one position above, a value of 2 to the item defined two positions above. The two following queryhelps yield the same query:

      qh1 = {
          'path':[
              {
                  'cls':PwCalculation
              },
              {
                  'cls':Trajectory
              },
              {
                  'cls':ParameterData,
                  'direction':-2
              }
          ]
      }
      
      # returns same query as:
      
      qh2 = {
          'path':[
              {
                  'cls':PwCalculation
              },
              {
                  'cls':Trajectory
              },
              {
                  'cls':ParameterData,
                  'input_of':PwCalculation
              }
          ]
      }
      
      # Shorter version:
      
      qh3 = {
          'path':[
              ParameterData,
              PwCalculation,
              Trajectory,
          ]
      }
      
  • Project: determining which columns the query will return:

    queryhelp = {
        'path':[Relax],
        'project':{
            Relax:['state', 'id'],
        }
    }
    

    If you are using JSONB columns, you can also project a value stored inside the json:

    queryhelp = {
        'path':[
            Relax,
            StructureData,
        ],
        'project':{
            Relax:['state', 'id'],
        }
    }
    

    This returns the state and the id of all instances of Relax where a structure is linked as an output of the relax calculation. The strings that you pass have to be names of the columns. If you pass a star (‘*’), the query will return the instance of the AiiDA class.

  • Filters: What if you want not every structure, but only the ones that were added after a certain time t and have an id higher than 50:

    queryhelp = {
        'path':[
            {'cls':Relax}, # Relaxation with structure as output
            {'cls':StructureData}
        ],
        'filters':{
            StructureData:{
                'time':{'>': t},
                'id':{'>': 50}
            }
        }
    }
    

If you want to include filters and projections on links between nodes, you will have to add these to filters and projections in the queryhelp. Let’s take an example that we had and add a few filters on the link:

queryhelp = {
    'path':[
        {'cls':Relax, 'tag':'relax'}, # Relaxation with structure as output
        {'cls':StructureData, 'tag':'structure'}
    ],
    'filters':{
        'structure':{
            'time':{'>': t},
            'id':{'>': 50}
        },
        'relax--structure':{
            'time':{'>': t},
            'label':{'like':'output_%'},
        }
    },
    'project':{
        'relax--structure':['label'],
        'structure':['label'],
        'relax':['label', 'state'],
    }
}

Notice that the label for the link, by default, is given by the labels of the two connecting nodes delimited by two dashes ('--'). The order does not matter; the following queryhelp would result in the same query:

queryhelp = {
    'path':[
        {'cls':Relax, 'label':'relax'},         # Relaxation with structure as output
        {'cls':StructureData, 'label':'structure'}
    ],
    'filters':{
        'structure':{
            'time':{'>': t},
            'id':{'>': 50}
        },
        'relax--structure':{
            'time':{'>': t},
            'label':{'like':'output_%'},
        }
    },
    'project':{
        'relax--structure':['label'],
        'structure':['label'],
        'relax':['label', 'state'],
    }
}

If you dislike that way of labeling the link, you can choose the link label yourself in the path when defining the entity to join:

queryhelp = {
    'path':[
        {'cls':Relax, 'label':'relax'},         # Relaxation with structure as output
        {
            'cls':StructureData,
            'label':'structure',
            'edge_tag':'ThisIsMyLinkLabel'     # Defining the link label
        }
    ],
    'filters':{
        'structure':{
            'time':{'>': t},
            'id':{'>': 50}
        },
        'ThisIsMyLinkLabel':{                  # Using this linklabel
            'time':{'>': t},
            'label':{'like':'output_%'},
        }
    },
    'project':{
        'ThisIsMyLinkLabel':['label'],
        'structure':['label'],
        'relax':['label', 'state'],
    }
}

You can set a limit and an offset in the queryhelp:

queryhelp = {
    'path':[Node],
    'limit':10,
    'offset':20
}

That queryhelp would tell the QueryBuilder to return 10 rows after the first 20 have been skipped.

Workflows
AiiDA workflows

The AiiDA workflow system tries to give the user as much power as possible to automate tasks, while adding features essential for carrying out high-throughput computations, such as error recovery, the ability to suspend and resume, the ability to run on remote resources, logging, etc.

Overview

The workflow system allows the user to define one or more Processes that (optionally) take some inputs and (optionally) produce some outputs. By now, this concept should be familiar and, in fact, each time you execute a Process it generates an AbstractCalculation node along with the corresponding inputs and outputs to keep the provenance of what happened.

At the moment there are two ways that you can define a Process: workfunctions or WorkChains. Let’s start with the former as it’s the easier of the two.

Workfunctions

A workfunction is simply a python function with a decorator and a couple of constraints on its inputs and return value. Let’s dive in.

>>> from aiida.orm.data.base import Int
>>> from aiida.work.workfunction import workfunction as wf
>>>
>>> @wf
>>> def sum(a, b):
>>>    return a + b
>>>
>>> r = sum(Int(4), Int(5))
>>> print(r)
9
>>> r.get_inputs_dict() 
{u'_return': <WorkCalculation: uuid: ce0c63b3-1c84-4bb8-ba64-7b70a36adf34 (pk: 3567)>}
>>> r.get_inputs_dict()['_return'].get_inputs()
[4, 5]

In the above example you can see a workfunction being declared, called, and then its provenance being explored.

This is a good point to highlight the constraints that workfunctions must conform to:

  • All of the input parameters must be of type Data

  • The return value can be either

    1. a single value of type Data in which case there will be a single output link with label _return, or,
    2. a dictionary with string keys and Data values, where the strings are used as the labels for the output links from the calculation (see the sketch below).
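
A minimal sketch of the second case, a workfunction returning a dictionary of Data values (the labels sum and product are only examples and become the labels of the output links):

>>> @wf
>>> def sum_and_product(a, b):
>>>    return {'sum': a + b, 'product': a * b}
>>>
>>> r = sum_and_product(Int(5), Int(3))
>>> print r['sum'], r['product']
8 15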

Now, let’s try making a slightly more complex workflow by composing workfunctions:

>>> @wf
>>> def prod(a, b):
>>>    return a * b
>>>
>>> @wf
>>> def prod_sum(a, b, c):
>>>   return prod(sum(a, b), c)
>>>
>>> r = prod_sum(Int(2), Int(3), Int(4))
>>>
>>> from aiida.utils.ascii_vis import draw_parents
>>> draw_parents(r, dist=4) 
                       /-4 [3582]
-- /20 [3588]prod [3587]
                      |                  /-2 [3581]
                       \5 [3586]sum [3585]
                                         \-3 [3583]

Above we see the workflow that was executed with the outputs and the PKs of all the nodes along the way.

Let’s look at a slightly more complex example, that of performing an Equation of State calculation.

Note

The following example workflows use the Quantum ESPRESSO plugins that are hosted in the aiida-quantumespresso plugin repository.

Here is the code:

from aiida.orm.utils import DataFactory
import ase

@wf
def rescale(structure, scale):
    the_ase = structure.get_ase()
    new_ase = the_ase.copy()
    new_ase.set_cell(the_ase.get_cell() * float(scale), scale_atoms=True)
    new_structure = DataFactory('structure')(ase=new_ase)
    return new_structure

from aiida_quantumespresso.calculations.pw import PwCalculation
from aiida.orm.data.base import Float, Str
from aiida.work.run import run

@wf
def eos(structure, codename, pseudo_family):
    Proc = PwCalculation.process()
    results = {}
    for s in (0.98, 0.99, 1.0, 1.02, 1.04):
        rescaled = rescale(structure, Float(s))
        inputs = generate_scf_input_params(rescaled, codename, pseudo_family)
        outputs = run(Proc, **inputs)
        res = outputs['output_parameters'].dict
        results[str(s)] = res

    return results

eos(my_structure, Str('pw-5.1@localhost'), Str('GBRV_lda')) # doctest: +SKIP

In the above we define a workfunction to rescale a structure by some scale factor. The main work is then carried out by the eos workfunction. On line 18 we get a Process class for the Quantum ESPRESSO calculation. This is only necessary because the Quantum ESPRESSO plugin was written before the new plugin system; hence we get a class compatible with the new system using the PwCalculation.process() call.

On lines 22-23 we first use a standard python function (not shown) to get a set of Quantum ESPRESSO input parameters for our structure, and then we use the run() method to launch the calculation. This is a blocking call and will wait until the calculation has completed.

Upon completion on lines 24-25 we get the outputs dictionary from the calculation and store it for returning when our workfunction completes.

This way of writing the workflow is fairly straightforward and easy to read, but it does have some drawbacks, namely:

  • If, say, the 4th calculation crashes, we cannot restart and continue from that point
  • We do not get any output until the workfunction has completed
  • Any checking of input/return values being of a specific type (beyond being Data) has to be done manually by the user.

To overcome these problems and add additional functionality we introduced the concept of Workchains.

Workchains

A workchain represents a series of instructions used to carry out a process with checkpoints being taken between each instruction such that the process can be paused/stopped/resumed, even if the computer crashes. The most obvious practical difference between workchains and workfunctions is that workchains are classes as opposed to functions.

Let’s start by creating a workchain for the product sum workflow from before:

from aiida.work.workchain import WorkChain

class ProdSum(WorkChain):
    @classmethod
    def define(cls, spec):
        super(ProdSum, cls).define(spec)
        spec.outline(cls.sum, cls.prod)

    def sum(self):
        self.ctx.sum = self.inputs.a + self.inputs.b

    def prod(self):
        self.out("result", self.ctx.sum * self.inputs.c)

On lines 4-7 we see the use of the define method, which is used to describe the workchain. Other than calling the superclass method, which is obligatory, we define the outline of our workchain by calling the corresponding method on the spec object. Here we have just two simple steps, and between them the workchain will checkpoint.

Next, on lines 9-13, we define what the steps actually do as familiar python functions. Note that on line 10 we use the inputs, which will be passed to us by the user. We haven’t explicitly stated what inputs we expect in this workflow, so the user is free to pass in anything they want (so long as it’s an aiida.orm.data.Data).

The other new concept we have used can be seen on line 10, namely self.ctx. This is known as the context, and is used to store any data that should be persisted between steps. The reason for this is that each time a step finishes a checkpoint is created; this can be used to continue in the case of a crash or suspension. However, the checkpoint only stores data in the context, and therefore any local variables are liable to disappear between steps if the workchain is resumed.

Note

context
A data store used for variables that are used between steps.

To run the workflow locally we call

>>> res = ProdSum.run(a=Int(2), b=Int(3), c=Int(4))
>>> print res
{'result': 20}
>>> draw_parents(res['result']) 
                          /-2 [3594]
                         |
-- /20 [3598]ProdSum [3597]-3 [3596]
                         |
                          \-4 [3595]

We see that there is one output node with value 20 and the input nodes that we supplied to the calculation. Of course the names of the inputs we supplied have to match up with those used in the workchain but we can make this connection explicit, as well as specifying what type they should be:

class ProdSumEx(ProdSum):
    @classmethod
    def define(cls, spec):
        super(ProdSumEx, cls).define(spec)
        spec.input('a', valid_type=Int, required=True)
        spec.input('b', valid_type=Int, required=True)
        spec.input('c', valid_type=Int, required=True)

Now the input types and their names are enforced.

>>> ProdSumEx.run(a=Int(2), b=Int(3))
TypeError: Cannot run process 'ProdSumEx' because required value was not provided for 'c'
>>> ProdSumEx.run(a=Float(2), b=Int(3), c=Int(4))
TypeError: Cannot run process 'ProdSumEx' because value 'a' is not of the right type. Got '<class 'aiida.orm.data.base.Float'>', expected '<class 'aiida.orm.data.base.Int'>'

This is an example of the additional power of workchains.

Now, let’s go back to the equation of state example and see what else is possible with workchains. Let’s start, as usual, with the outline:

from aiida.orm.data.structure import StructureData
from aiida.work.workchain import while_

class EquationOfState(WorkChain):
    @classmethod
    def define(cls, spec):
        super(EquationOfState, cls).define(spec)
        spec.input("structure", valid_type=StructureData)
        spec.input("code", valid_type=Str)
        spec.input("pseudo_family", valid_type=Str)
        spec.outline(
            cls.init,
            while_(cls.not_finished)(
                cls.run_pw
            )
        )

Here we’re using a while loop instruction; by doing this we can make sure that a checkpoint is automatically created after each iteration. Now all that remains is to define the contents of the steps themselves:

def init(self):
    self.ctx.scales = (0.96, 0.98, 1., 1.02, 1.04)
    self.ctx.i = 0

def not_finished(self):
    return self.ctx.i < len(self.ctx.scales)

def run_pw(self):
    scale = self.ctx.scales[self.ctx.i]
    scaled = rescale(self.inputs.structure, Float(scale))

    inputs = generate_scf_input_params(
        scaled, self.inputs.code, self.inputs.pseudo_family)
    outputs = run(Proc, **inputs)
    res = outputs['output_parameters']
    self.out(str(scale), res)

    self.ctx.i += 1

This new implementation is already safer than the workfunction approach because it is checkpointed; however, we can do even better. On line 14 we effectively call Quantum ESPRESSO to carry out the calculation, which could take some time. During this period the code waits and we cannot shut down our computer without losing the progress of that calculation. To overcome this we allow the user to return special objects from a step to indicate that the workchain is waiting for something to complete. In the meantime the workchain can be suspended and resumed later:

class WaitingEquationOfState(EquationOfState):
    @classmethod
    def define(cls, spec):
        super(WaitingEquationOfState, cls).define(spec)
        spec.outline(
            cls.launch_calculations,
            cls.process_results
        )

    def launch_calculations(self):
        l = []
        for s in (0.96, 0.98, 1., 1.02, 1.04):
            scaled = rescale(self.inputs.structure, Float(s))
            inputs = generate_scf_input_params(
                scaled, self.inputs.code, self.inputs.pseudo_family)
            pid = submit(Proc, **inputs)
            l.append(pid)

        return ToContext(s_0_96=l[0], s_0_98=l[1], s_1=l[2], s_1_02=l[3], s_1_04=l[4])

    def process_results(self):
        for key, outputs in self.ctx.iteritems():
            if key.startswith("s_"):
                scale = key[2:].replace("_", ".")
                self.out(Float(scale), outputs['output_parameters'].dict)

Here, on line 19, we use a so-called interstep command. These are objects you return from a step that can perform actions at the end of the step and just before the beginning of the next. In this case we use ToContext: the constructor takes keyword arguments of the form [name]=[pid]; it will then insert barriers into the workchain to make sure it does not continue until all of the specified processes have finished. Then, before the next step, it will place the corresponding AbstractCalculation nodes in the specified [name] variables in the context.

On lines 22-25, we iterate over the context looking for those entries that start with s_ and emit the results from these calculations.

Converting from old workflows

This section details some of the changes that need to be made to convert old workflows to the new system.

We begin with changes to the nomenclature where the rough correspondence in terms is as follows:

workflow -> workchain
inline function -> workfunction

Old AiiDA workflows

Workflows are one of the most important components for real high-throughput calculations, allowing the user to scale well defined chains of calculations on any number of input structures, both generated or acquired from an external source.

Instead of offering a limited number of automation schemes crafted for some specific tasks (equations of state, phonons, etc.), AiiDA provides a complete workflow engine, where the user can script in principle any possible interaction with all the AiiDA components, from the submission engine to the materials database connections. In AiiDA a workflow is a python script executed by a daemon, containing several user defined functions called steps. In each step all the AiiDA functions are available, and calculations can be launched and retrieved, as well as other sub-workflows.

In this document we’ll introduce the main workflow infrastructure from the user perspective, discussing and presenting some examples that will cover all the features implemented in the code. A more detailed description of each function can be found in the developer documentation.

How it works

The rationale of the entire workflow infrastructure is to make anything a user can do in the AiiDA shell efficient, reproducible and scriptable. A workflow in this sense is nothing more than a list of AiiDA commands, split into different steps that depend on each other and are executed in a specific order. A workflow step is written in the same python language, using the same commands and libraries you use in the shell, stored in a file as a python class and managed by a daemon process.

Before starting to analyze our first workflow, we should briefly summarize the main working logic of a typical workflow execution, starting with the role of the management daemon. The AiiDA daemon handles all the operations of a workflow: script loading, error handling and reporting, state monitoring and user interaction with the execution queue.

The daemon works essentially as an infinite loop, iterating several simple operations:

  1. It checks the running step in all the active workflows; if there are new calculations attached to a step, it submits them.
  2. It retrieves all the finished calculations. If there is a workflow step in which all the calculations have correctly finished, it reloads the workflow and executes the next step as indicated in the script.
  3. If a workflow’s next step is the exit one, the workflow is terminated and the report is closed.

This simplified process is the very heart of the workflow engine; while the daemon loops, a user can submit a new workflow to be managed from the verdi shell (or through a script loading the necessary verdi environment). In the next chapter we’ll initialize the daemon and analyze a simple workflow, submitting it and retrieving the results.

Note

The workflow engine of AiiDA is now fully operational but will undergo major improvements in the near future. Therefore, some of the methods or functionalities described in the following might change.

The AiiDA daemon

As explained, the daemon must be running to allow the execution of workflows, so the first thing to do is to launch it. We can use the verdi command facility from your computer’s shell:

>> verdi daemon start

This command will launch a background job (a daemon, in fact) that will continuously check for new or running workflows to manage. Thanks to the asynchronous structure of AiiDA, if the daemon gets interrupted (or the computer running the daemon is restarted, for example), once it is restarted all the workflows will proceed automatically without any problem. The only thing you need to do to restart the workflows is exactly the same command as above. To stop the daemon we instead use the same command with the stop directive, and to have a very fast check of the execution status we can use the state directive to obtain more information.
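For example, to stop the daemon and to check its state:

>> verdi daemon stop
>> verdi daemon state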

A workflow demo

Now that the daemon is running we can focus on how to write our first workflow. As explained a workflow is essentially a python class, stored in a file accessible by AiiDA (in the same AiiDA path). By convention workflows are stored in .py files inside the aiida/workflows directory; in the distribution you’ll find some examples (some of them analyzed here) and a user directory where user defined workflows can be stored. Since the daemon is aware only of the classes present at the time of its launch, remember to restart the daemon (verdi daemon restart) every time you add a new workflow to let AiiDA see it.

We can now study a very first example workflow, contained in the wf_demo.py file inside the distribution’s workflows directory. Even if this is just a toy model, it helps us to introduce all the features and details on how a workflow works, helping us to understand the more sophisticated examples reported later.

    import aiida.common
    from aiida.common import aiidalogger
    from aiida.orm.workflow import Workflow
    from aiida.orm import Code, Computer

    logger = aiidalogger.getChild('WorkflowDemo')

    class WorkflowDemo(Workflow):

      def __init__(self,**kwargs):

          super(WorkflowDemo, self).__init__(**kwargs)

      def generate_calc(self):

          from aiida.orm import Code, Computer, CalculationFactory
          from aiida.common.datastructures import calc_states

          CustomCalc = CalculationFactory('simpleplugins.templatereplacer')

          computer = Computer.get("localhost")

          calc = CustomCalc(computer=computer,withmpi=True)
          calc.set_resources(num_machines=1, num_mpiprocs_per_machine=1)
          calc._set_state(calc_states.FINISHED)
          calc.store()

          return calc

      @Workflow.step
      def start(self):

          from aiida.orm.node import Node

          # Testing parameters
          p = self.get_parameters()

          # Testing calculations
          self.attach_calculation(self.generate_calc())
          self.attach_calculation(self.generate_calc())

          # Testing report
          self.append_to_report("Starting workflow with params: {0}".format(p))

          # Testing attachments
          n = Node()
          attrs = {"a": [1,2,3], "n": n}
          self.add_attributes(attrs)

          # Test process
          self.next(self.second_step)

      @Workflow.step
      def second_step(self):

          # Test retrieval
          calcs = self.get_step_calculations(self.start)
          self.append_to_report("Retrieved calculation 0 (uuid): {0}".format(calcs[0].uuid))

          # Testing report
          a = self.get_attributes()
          self.append_to_report("Execution second_step with attachments: {0}".format(a))

          # Test results
          self.add_result("scf_converged", calcs[0])

          self.next(self.exit)

As discussed before, this is native python code, meaning that a user can load any library or script accessible from their PYTHONPATH and interact with any database or service of preference inside the workflow. We’ll now go through all the details of this first workflow, line by line, discussing the most important methods and discovering along the way all the available features.

lines 1-7 Module imports. Some are necessary for the Workflow objects but many more can be added for user defined functions and libraries.

lines 8-12 Class definition: a workflow MUST extend the Workflow class from aiida.orm.workflow. This is a fundamental requirement, since subclassing is the way AiiDA understands whether a class inside the file is an AiiDA workflow or a simple utility class. Note that, for compatibility with python 2.7, the explicit initialization of the superclass on line 12 is also necessary to make things work correctly.

lines 14-28 Once the class is defined, a user can add as many methods as desired, to generate calculations, to download structures, or to compute new ones starting from a query on previous AiiDA calculations present in the DB. In the script above, the method generate_calc simply prepares a dummy calculation, setting its state to finished and returning the object after having stored it in the repository. This utility function allows the dummy workflow to run without needing any code or machine other than a configured localhost. In real cases, as we’ll see, a calculation will be set up with parameters and structures defined in more sophisticated ways, but the underlying logic is identical as far as the workflow’s inner workings are concerned.

lines 30-51 This is the first step, one of the main components of the workflow logic. As you can see, the start method is decorated as a Workflow.step, making it a special kind of method, automatically stored in the database as a container of calculations and sub-workflows. Several functions are available to the user when coding a workflow step, and in this method we can see most of the basic ones:

  • line 36 self.get_parameters(). With this method we can retrieve the parameters passed to the workflow when it was initialized. Parameters cannot be modified during an execution, while attributes can be added and removed.

  • lines 39-40 self.attach_calculation(JobCalculation). This is a key point of the workflow, and something possible only inside a step method. JobCalculations, generated in the methods or retrieved from other utility methods, are attached to the workflow’s step, launched and executed completely by the daemon, without the need of user interaction. Failures, re-launching and queue management are all handled by the daemon, and thousands of calculations can be attached. The daemon will poll the servers until all the step’s calculations are finished, and only after that will it pass to the next step.

  • line 43 self.append_to_report(string). Once the workflow is launched, user interaction is limited to a few events (stop, relaunch, listing the calculations), and most of the time it is very useful to have custom messages during the execution. For this, each workflow is equipped with a reporting facility that the user can fill with any text and retrieve both live and at the end of the execution.

  • lines 45-48 self.add_attributes(dict). Since the workflow is instantiated from scratch at every step, if a user wants to pass arguments between steps they must use the attributes facility, where a dictionary of values (accepted values are basic types and AiiDA nodes) can be saved and retrieved from other steps during future executions.

  • line 51 self.next(Workflow.step). This is the final part of a step, where the user tells the engine what to do after all the calculations in the step (and possible sub-workflows, as we’ll see later) are terminated. The argument of this function has to be a Workflow.step decorated method of the same workflow class, or, in case this is the last step to be executed, you can use the common method self.exit, always present in each Workflow subclass.

    Note

    make sure to store() all input nodes for the attached calculations, as unstored nodes will be lost during the transition from one step to another.

lines 53-67 When the workflow is launched through the start method, the AiiDA daemon loads the workflow, executes the step, launches all the calculations and monitors their state. Once all the calculations in start are finished, the daemon will then load and execute the next step, in this case the one called second_step. In this step some new features are shown:

  • line 57 self.get_step_calculations(Workflow.step). Anywhere after the first step we may need to retrieve and analyze calculations executed in a previous step. With this method we have access to the list of calculations of a specific workflow step, passed as an argument.
  • line 61 self.get_attributes(). With this call we can retrieve the attributes stored in previous steps. Remember that this is the only way to pass arguments between different steps, adding them as we did in line 48.
  • line 65 self.add_result(). When all the calculations are done it’s useful to tag some of them as results, using custom strings that can later be searched and retrieved. Similarly to get_step_calculations, this method works on the entire workflow and not on a single step.
  • line 67 self.next(self.exit). This is the final part of each workflow, setting the exit. Every workflow inherits a fictitious step called exit that can be set as the next step of any other step. As the name suggests, this makes the workflow execution finish correctly.
Running a workflow

After saving the workflow inside a python file located in the aiida/workflows directory, we can launch it simply by invoking the specific workflow class and executing the start() method inside the verdi shell. It’s important to remember that the whole AiiDA framework needs to be accessible for the workflow to be launched, and this can be achieved either with the verdi shell or with any other python environment that has previously loaded the AiiDA framework (see the developer manual for this).

To launch the verdi shell, execute verdi shell from the command line; once inside the shell we have to import the workflow class we want to launch (this command depends on the file location and the class name we chose). In this case we’ll launch the WorkflowDemo presented before, located in the wf_demo.py file in the clean AiiDA distribution. In the shell we execute:

>> from aiida.workflows.wf_demo import WorkflowDemo
>> params = {"a":[1,2,3]}
>> wf = WorkflowDemo(params=params)
>> wf.start()

Note

If you want to write the above script in a file, remember to run it with verdi run and not simply with python, or otherwise to use the other techniques described here.

In these four lines we loaded the class, created some fictitious parameters and initialized the workflow. Finally we launched it with the start() method, a lazy command that in the background adds the workflow to the execution queue monitored by the verdi daemon. In the background the daemon will handle all the workflow processes, stepping through each method, launching and retrieving calculations and monitoring possible errors and problems.

Since the workflow is now managed by the daemon, to interact with it we need special methods. There are basically two ways to see how the workflows are running: by printing the workflow list or its report.

  • Workflow list

    From the command line we run:

    >> verdi workflow list
    

    This will list all the running workflows, showing the state of each step and each calculation (and, when present, each sub-workflow - see below). It is the fastest way to have a snapshot of what your AiiDA workflow daemon is working on. An example output right after the WorkflowDemo submission should be

    + Workflow WorkflowDemo (pk: 1) is RUNNING [0h:05m:04s]
    |-* Step: start [->second_step] is RUNNING
    | | Calculation (pk: 1) is FINISHED
    | | Calculation (pk: 2) is FINISHED
    

    For each workflow, the pk number is reported: a unique id identifying that specific execution of the workflow, which is necessary to retrieve it at any other time in the future (as explained in the next point).

    Note

    You can also print the list of any individual workflow from the verdi shell (here in the shell where you defined your workflow as wf, see above):

    >> import aiida.orm.workflow as wfs
    >> print "\n".join(wfs.get_workflow_info(wf._dbworkflowinstance))
    
  • Workflow report

    As explained, each workflow is equipped with a reporting facility that the user can use to log any important intermediate information, useful to debug the state or show some details. Moreover the report is also used by AiiDA as an error reporting tool: in case of errors encountered during the execution, the AiiDA daemon will copy the entire stack trace into the workflow report before halting its execution. To access the report we need the specific pk of the workflow. From the command line we would run:

    >> verdi workflow report PK_NUMBER
    

    while from the verdi shell the same operation requires using the get_report() method:

    >> load_workflow(PK_NUMBER).get_report()
    

    In both variants, PK_NUMBER is the pk number of the workflow we want the report of. The load_workflow function loads a Workflow instance from its pk number, or from its uuid (given as a string).

    Note

    It is always recommended to get the workflow instance freshly from load_workflow (or from the Workflow.get_subclass_from_pk method) without saving this object in a variable. The information generated in the report may change over time, and a user calling the get_report method of an instance loaded in the past will probably miss the most recent additions to the report.

Once launched, the workflows will be handled by the daemon until the final step or until some error occurs. In the latter case, the workflow gets halted and the report can be checked to understand what happened.

  • Killing a workflow

A user can also kill a workflow while it’s running. This can be done with the following verdi command:

>> verdi workflow kill PK_NUMBER_1 PK_NUMBER_2 PK_NUMBER_N

where several pk numbers can be given. A prompt will ask for a confirmation; this can be avoided by using the -f option.
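
For instance, to kill a single workflow without being prompted for confirmation, you can run:

>> verdi workflow kill -f PK_NUMBER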

An alternative way to kill an individual workflow is to use the kill method. In the verdi shell type:

>> load_workflow(PK_NUMBER).kill()

or, equivalently:

>> Workflow.get_subclass_from_pk(PK_NUMBER).kill()

Note

Sometimes the kill operation might fail because one calculation cannot be killed (e.g. if it’s running but not in the WITHSCHEDULER, TOSUBMIT or NEW state), or because one workflow step is in the CREATED state. In that case the workflow is put in the SLEEP state, so that no more workflow steps will be launched by the daemon. One can then simply wait until the calculation or step changes state, and try to kill it again.

A more sophisticated workflow

Note

This workflow uses the Quantum ESPRESSO plugins that are hosted in the aiida-quantumespresso plugin repository.

In the previous chapter we’ve seen almost all the workflow features, and we’re now ready to work on some more sophisticated examples, where real calculations are performed and common real-life issues are solved. As a real case example we’ll compute the equation of state of a simple class of materials, XTiO3; the workflow will accept the X material as an input, build several structures with different cell parameters, run and retrieve all the simulations, fit the resulting curve, and finally run a calculation on the optimized structure, saving it as the workflow result together with the optimal cell parameter value.

      ## ===============================================
      ##    WorkflowXTiO3_EOS
      ## ===============================================

      class WorkflowXTiO3_EOS(Workflow):

          def __init__(self,**kwargs):

              super(WorkflowXTiO3_EOS, self).__init__(**kwargs)

          ## ===============================================
          ##    Object generators
          ## ===============================================

          def get_structure(self, alat = 4, x_material = 'Ba'):

              cell = [[alat, 0., 0.,],
                      [0., alat, 0.,],
                      [0., 0., alat,],
                     ]

              # BaTiO3 cubic structure
              s = StructureData(cell=cell)
              s.append_atom(position=(0.,0.,0.),symbols=x_material)
              s.append_atom(position=(alat/2.,alat/2.,alat/2.),symbols=['Ti'])
              s.append_atom(position=(alat/2.,alat/2.,0.),symbols=['O'])
              s.append_atom(position=(alat/2.,0.,alat/2.),symbols=['O'])
              s.append_atom(position=(0.,alat/2.,alat/2.),symbols=['O'])
              s.store()

              return s

          def get_pw_parameters(self):

              parameters = ParameterData(dict={
                          'CONTROL': {
                              'calculation': 'scf',
                              'restart_mode': 'from_scratch',
                              'wf_collect': True,
                              },
                          'SYSTEM': {
                              'ecutwfc': 30.,
                              'ecutrho': 240.,
                              },
                          'ELECTRONS': {
                              'conv_thr': 1.e-6,
                              }}).store()

              return parameters

          def get_kpoints(self):

              kpoints = KpointsData()
              kpoints.set_kpoints_mesh([4,4,4])
              kpoints.store()

              return kpoints

          def get_pw_calculation(self, pw_structure, pw_parameters, pw_kpoint):

              params = self.get_parameters()

              pw_codename            = params['pw_codename']
              num_machines           = params['num_machines']
              num_mpiprocs_per_machine   = params['num_mpiprocs_per_machine']
              max_wallclock_seconds  = params['max_wallclock_seconds']
              pseudo_family          = params['pseudo_family']

              code = Code.get_from_string(pw_codename)
              computer = code.get_remote_computer()

              QECalc = CalculationFactory('quantumespresso.pw')

              calc = QECalc(computer=computer)
              calc.set_max_wallclock_seconds(max_wallclock_seconds)
              calc.set_resources({"num_machines": num_machines, "num_mpiprocs_per_machine": num_mpiprocs_per_machine})
              calc.store()

              calc.use_code(code)

              calc.use_structure(pw_structure)
              calc.use_pseudos_from_family(pseudo_family)
              calc.use_parameters(pw_parameters)
              calc.use_kpoints(pw_kpoint)

              return calc


          ## ===============================================
          ##    Workflow steps
          ## ===============================================

          @Workflow.step
          def start(self):

              params = self.get_parameters()
              x_material             = params['x_material']

              self.append_to_report(x_material+"TiO3 EOS started")
              self.next(self.eos)

          @Workflow.step
          def eos(self):

              from aiida.orm import Code, Computer, CalculationFactory
              import numpy as np

              params = self.get_parameters()

              x_material             = params['x_material']
              starting_alat          = params['starting_alat']
              alat_steps             = params['alat_steps']


              a_sweep = np.linspace(starting_alat*0.85,starting_alat*1.15,alat_steps).tolist()

          aiidalogger.info("Storing a_sweep as "+str(a_sweep))
          self.add_attribute('a_sweep',a_sweep)

          for a in a_sweep:

              self.append_to_report("Preparing structure {0} with alat {1}".format(x_material+"TiO3",a))

              calc = self.get_pw_calculation(self.get_structure(alat=a, x_material=x_material),
                                        self.get_pw_parameters(),
                                        self.get_kpoints())

              self.attach_calculation(calc)


          self.next(self.optimize)

          @Workflow.step
          def optimize(self):

              from aiida.orm.data.parameter import ParameterData

              x_material   = self.get_parameter("x_material")
              a_sweep      = self.get_attribute("a_sweep")

              aiidalogger.info("Retrieving a_sweep as {0}".format(a_sweep))

              # Get calculations
              start_calcs = self.get_step_calculations(self.eos) #.get_calculations()

              #  Calculate results
              #-----------------------------------------

              e_calcs = [c.res.energy for c in start_calcs]
              v_calcs = [c.res.volume for c in start_calcs]

              e_calcs = zip(*sorted(zip(a_sweep, e_calcs)))[1]
              v_calcs = zip(*sorted(zip(a_sweep, v_calcs)))[1]

              #  Add to report
              #-----------------------------------------
              for i in range (len(a_sweep)):
                  self.append_to_report(x_material+"TiO3 simulated with a="+str(a_sweep[i])+", e="+str(e_calcs[i]))

              #  Find optimal alat
              #-----------------------------------------

              murnpars, ier = Murnaghan_fit(e_calcs, v_calcs)

              # New optimal alat
              optimal_alat  = murnpars[3]** (1 / 3.0)
              self.add_attribute('optimal_alat',optimal_alat)

              #  Build last calculation
              #-----------------------------------------

              calc = self.get_pw_calculation(self.get_structure(alat=optimal_alat, x_material=x_material),
                                            self.get_pw_parameters(),
                                            self.get_kpoints())
              self.attach_calculation(calc)


              self.next(self.final_step)

          @Workflow.step
          def final_step(self):

              from aiida.orm.data.parameter import ParameterData

              x_material   = self.get_parameter("x_material")
              optimal_alat = self.get_attribute("optimal_alat")

              opt_calc = self.get_step_calculations(self.optimize)[0] #.get_calculations()[0]
              opt_e = opt_calc.get_outputs(type=ParameterData)[0].get_dict()['energy']

              self.append_to_report(x_material+"TiO3 optimal with a="+str(optimal_alat)+", e="+str(opt_e))

              self.add_result("scf_converged", opt_calc)

              self.next(self.exit)

Before getting into the details, you’ll notice that this workflow is divided into sections by comments in the script. This is not necessary, but helps the user differentiate the main parts of the code. In general it’s useful to be able to recognize immediately which functions are steps and which are instead utility or support functions that generate structures, modify them, add special parameters for the calculations, etc. In this case the support functions are reported first, under the Object generators part, while the workflow steps are reported later in the aptly named Workflow steps section. Let’s now look at each function in more detail.

  • __init__ Usual initialization function; notice again the necessary superclass initialization for backward compatibility.

  • start The workflow tries to get the X material from the parameters, called in this case x_material. If the entry is not present in the dictionary an error will be thrown and the workflow will hang, storing the error in the report. After that a simple line is added to the report to signal the correct start, and the eos step is chained for execution.

  • eos This step is the heart of this workflow. At the beginning the parameters needed to investigate the equation of state are retrieved. In this case we chose a very simple structure with only one interesting cell parameter, called starting_alat. The code will take this alat as the central point of a linear mesh going from 0.85 alat to 1.15 alat, where a total of alat_steps points will be generated. This decision is very much problem dependent, and your workflows will certainly need more parameters or more sophisticated meshes to run a satisfactory equation-of-state analysis, but again this is only a tutorial and the scope is to learn the basic concepts.

    After retrieving the parameters, a linear mesh is generated between the values of interest and for each of these values a calculation is generated by the support function (see later). Each calculation is then attached to the step and finally optimize is chained as the next step. As mentioned, the daemon will handle job execution and retrieval for all the step’s calculations before calling the next step, and this ensures that no optimization will be done before all the alat values are computed successfully.

  • optimize In the first lines the step retrieves the initial parameters, the a_sweep attribute computed in the previous step and all the calculations launched and successfully retrieved. The energy and volume of each calculation are retrieved thanks to the output parser functions mentioned in the other chapters, and a simple message is added to the report for each calculation.

    Having the volume and the energy for each simulation, we can run a Murnaghan fit to obtain the optimal cell parameter and expected energy. To do this we use a simple fitting function Murnaghan_fit defined at the bottom of the workflow file wf_XTiO3.py (a possible sketch of such a function is shown after this list). The optimal alat is then saved in the attributes and a new calculation is generated for it. The calculation is attached to the step and the final_step is chained for execution.

  • final_step In this step the main result is collected and stored. Parameters and attributes are retrieved, a new entry in the report is stored pointing to the optimal alat and to the final energy of the structure. Finally the calculation is added to the workflow results and the exit step is chained for execution.

  • get_pw_calculation (get_kpoints, get_pw_parameters, get_structure) As you noticed, to keep the code clean all the functions needed to generate AiiDA Calculation objects have been factored out into utility functions. These functions are highly specific to the task at hand, and unrelated to the workflow machinery. Nevertheless they’re a good example of best practice on how to write clean and reusable workflows, and we’ll comment on the most important features.

    get_pw_calculation is called in the workflow’s steps, and it handles the entire Calculation object creation. First it extracts from the workflow parameters everything necessary for the execution (the machine, the code, the number of cores, the pseudopotential family, etc.) and then it generates and stores the JobCalculation object, returning it for later use.

    get_kpoints generates a k-point mesh suitable for the calculation, in this case a fixed 4x4x4 Monkhorst-Pack mesh. In a real case scenario this would need a more careful convergence study, which is not necessary for the tutorial.

    get_pw_parameters builds the minimum set of parameters necessary to run the Quantum ESPRESSO simulations. In this case as well, the parameters are not production quality.

    get_structure generates the actual atomic arrangement for the specific calculation. In this case the configuration is extremely simple, but in principle this can be substituted with an external function, implementing even very sophisticated approaches such as genetic-algorithm evolution or semi-random modifications, or any other structure evolution function the user wants to test.
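
As a reference, a possible implementation of the Murnaghan_fit helper used in the optimize step could look like the sketch below. This is only a minimal sketch, assuming numpy and scipy are available; the actual function shipped at the bottom of wf_XTiO3.py may differ in its details, but it similarly returns the fitted parameters (with the equilibrium volume as the fourth entry) together with an integer flag, as used in the step above.

import numpy as np
from scipy.optimize import leastsq

def Murnaghan_fit(energies, volumes):
    """Fit E(V) to the Murnaghan equation of state.

    Returns (params, ier), where params = [E0, B0, Bp, V0] so that
    params[3] is the equilibrium volume used in the optimize step.
    """
    energies = np.asarray(energies, dtype=float)
    volumes = np.asarray(volumes, dtype=float)

    def murnaghan(p, v):
        e0, b0, bp, v0 = p
        return e0 + b0 * v / bp * ((v0 / v)**bp / (bp - 1.) + 1.) - b0 * v0 / (bp - 1.)

    def residuals(p, e, v):
        # Difference between the computed energies and the model
        return e - murnaghan(p, v)

    # Rough initial guess: minimum-energy point, B' ~ 4
    v0_guess = volumes[np.argmin(energies)]
    p0 = [energies.min(), 0.5, 4., v0_guess]

    return leastsq(residuals, p0, args=(energies, volumes))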

As you noticed, this workflow needs several parameters to be correctly executed, something natural for real case scenarios. Nevertheless the launching procedure is identical to the one of the simple example before, with just a slightly longer dictionary of parameters:

>> from aiida.workflows.wf_XTiO3 import WorkflowXTiO3_EOS
>> params = {'pw_codename':'PWcode', 'num_machines':1, 'num_mpiprocs_per_machine':8, 'max_wallclock_seconds':30*60, 'pseudo_family':'PBE', 'alat_steps':5, 'x_material':'Ba','starting_alat':4.0}
>> wf = WorkflowXTiO3_EOS(params=params)
>> wf.start()

To run this workflow remember to update the params dictionary with the correct values for your AiiDA installation (namely pw_codename and pseudo_family).

Chaining workflows

After the previous chapter we’re now able to write a real case workflow that runs an EOS analysis for simple structures in a fully automatic way. This covers almost all the workflow engine’s features implemented in AiiDA, except for workflow chaining.

Thanks to their modular structure a user can write task-specific workflows very easily. An example is the EOS workflow above, or an energy convergence procedure to find optimal cutoffs, or any other need the user can code. These self-contained workflows can easily become a library of result-oriented scripts that a user would be happy to reuse in several ways. This is exactly where sub-workflows come in handy.

Workflows, in an abstract sense, are in fact calculations that accept some parameters as input and produce results as output. The way these calculations are handled is completely transparent for the user and the engine, and if a workflow could launch other workflows it would just be a natural extension of the step’s calculation concept. This is in fact how workflow chaining has been implemented in AiiDA. Just as with calculations, in each step a workflow can attach another workflow for execution, and the AiiDA daemon will handle its execution waiting for its successful end (in case of errors in any sub-workflow, such errors will be reported and the entire workflow tree will be halted, exactly as when a calculation fails).

To introduce this feature we analyze our last example, where the WorkflowXTiO3_EOS is used as a sub-workflow. The general idea of this new workflow is simple: if we’re now able to compute the EOS of any XTiO3 structure, we can build a workflow that loops over several X materials, obtains the relaxed structure for each material and runs some more sophisticated calculation on it. In this case we’ll compute the phonon vibrational frequencies for some XTiO3 materials, namely with X = Ba, Sr and Pb.

      ## ===============================================
      ##    WorkflowXTiO3
      ## ===============================================

      class WorkflowXTiO3(Workflow):

          def __init__(self,**kwargs):

              super(WorkflowXTiO3, self).__init__(**kwargs)

          ## ===============================================
          ##    Calculations generators
          ## ===============================================

          def get_ph_parameters(self):

              parameters = ParameterData(dict={
                  'INPUTPH': {
                      'tr2_ph' : 1.0e-8,
                      'epsil' : True,
                      'ldisp' : True,
                      'nq1' : 1,
                      'nq2' : 1,
                      'nq3' : 1,
                      }}).store()

              return parameters

          def get_ph_calculation(self, pw_calc, ph_parameters):

              params = self.get_parameters()

              ph_codename            = params['ph_codename']
              num_machines           = params['num_machines']
              num_mpiprocs_per_machine   = params['num_mpiprocs_per_machine']
              max_wallclock_seconds  = params['max_wallclock_seconds']

              code = Code.get_from_string(ph_codename)
              computer = code.get_remote_computer()

              QEPhCalc = CalculationFactory('quantumespresso.ph')
              calc = QEPhCalc(computer=computer)

              calc.set_max_wallclock_seconds(max_wallclock_seconds) # 30 min
              calc.set_resources({"num_machines": num_machines, "num_mpiprocs_per_machine": num_mpiprocs_per_machine})
              calc.store()

              calc.use_parameters(ph_parameters)
              calc.use_code(code)
              calc.use_parent_calculation(pw_calc)

              return calc

          ## ===============================================
          ##    Workflow steps
          ## ===============================================

          @Workflow.step
          def start(self):

              params = self.get_parameters()
              elements_alat = [('Ba',4.0),('Sr', 3.89), ('Pb', 3.9)]

              for x in elements_alat:

                  params.update({'x_material':x[0]})
                  params.update({'starting_alat':x[1]})

                  aiidalogger.info("Launching workflow WorkflowXTiO3_EOS for {0} with alat {1}".format(x[0],x[1]))

                  w = WorkflowXTiO3_EOS(params=params)
                  w.start()
                  self.attach_workflow(w)

              self.next(self.run_ph)

          @Workflow.step
          def run_ph(self):

              # Get calculations
              sub_wfs = self.get_step(self.start).get_sub_workflows()

              for sub_wf in sub_wfs:

                  # Retrieve the pw optimized calculation
                  pw_calc = sub_wf.get_step("optimize").get_calculations()[0]

                  aiidalogger.info("Launching PH for PW {0}".format(pw_calc.get_job_id()))
                  ph_calc = self.get_ph_calculation(pw_calc, self.get_ph_parameters())
                  self.attach_calculation(ph_calc)

              self.next(self.final_step)

          @Workflow.step
          def final_step(self):

              #self.append_to_report(x_material+"Ti03 EOS started")
              from aiida.orm.data.parameter import ParameterData
              import aiida.tools.physics as ps

              params = self.get_parameters()

              # Get calculations
              run_ph_calcs = self.get_step_calculations(self.run_ph) #.get_calculations()

              for c in run_ph_calcs:
                  dm = c.get_outputs(type=ParameterData)[0].get_dict()['dynamical_matrix_1']
                  self.append_to_report("Point q: {0} Frequencies: {1}".format(dm['q_point'],dm['frequencies']))

              self.next(self.exit)


Most of the code is now a simple adaptation of the previous examples, so we’re going to comment only on the most relevant differences, where workflow chaining plays an important role.

  • start This workflow accepts the same input as the WorkflowXTiO3_EOS, but right at the beginning a list of X materials is defined, with their respective initial alat. This list is iterated over and for each material a new WorkflowXTiO3_EOS is generated, started and attached to the step. At the end run_ph is chained as the following step.
  • run_ph Only after all the sub-workflows attached in start are successfully completed will this step be executed. It immediately retrieves all the sub-workflows and from each of them it gets the calculation of the optimized structure. As you noticed, results can be stored with any user-defined key, and this is necessary when someone wants to retrieve them from a completed workflow. For each of these calculations a phonon calculation is launched, and then the final_step step is chained.

To launch this new workflow we only have to add a simple entry to the previous parameter dictionary, specifying the phonon code, as reported here:

>> from aiida.workflows.wf_XTiO3 import WorkflowXTiO3
>> params = {'pw_codename':'PWcode', 'ph_codename':'PHcode', 'num_machines':1, 'num_mpiprocs_per_machine':8, 'max_wallclock_seconds':30*60, 'pseudo_family':'PBE', 'alat_steps':5 }
>> wf = WorkflowXTiO3(params=params)
>> wf.start()
Compatibility with new workflows

As part of the deprecation process of the old workflows, to ease the transition we support the ability to launch old workflows from WorkChains. The ToContext object can be used in conjunction with legacy_workflow, which takes a legacy workflow pk and builds an object that tells ToContext how to wait for it to be done and store it in the context on completion. An example:

from aiida.work.run import legacy_workflow
from aiida.work.workchain import WorkChain, ToContext, Outputs

class MyWf(WorkChain):
    @classmethod
    def define(cls, spec):
        super(MyWf, cls).define(spec)
        spec.outline(cls.step1, cls.step2)

    def step1(self):
        wf = OldEquationOfState()
        wf.start()
        return ToContext(eos=legacy_workflow(wf.pk))

    def step2(self):
        # Now self.ctx.eos contains the terminated workflow
        pass

Similarly, if you just want the outputs of the old workflow rather than the workflow object itself, replace the return statement in step1 with:

return ToContext(eos=Outputs(legacy_workflow(wf.pk)))
Backups

In this page you will find useful information on how to back up your database, how to move it to a different location and how to back up your repository.

Create database backup

It is strongly advised to back up the content of your database daily. Below are instructions to set this up for the PostgreSQL database, under Ubuntu (tested with version 12.04).

Note

Perform the following operation after having set up AiiDA. Only then will the ~/.aiida folder (and the files within it) be created.

The database files are not put in the .aiida folder but in the system directories which typically are not backed up. Moreover, the database is spread over lots of files that, if backed up as they are at a given time, cannot be re-used to restore the database.

So you need to periodically (typically once a day) dump the database contents in a file that will be backed up. This can be done by the following bash script backup_postgresql.sh:

#!/bin/bash
AIIDAUSER=aiida
AIIDADB=aiidadb
AIIDAPORT=5432
## STORE THE PASSWORD, IN THE PROPER FORMAT, IN THE ~/.pgpass file
## see http://www.postgresql.org/docs/current/static/libpq-pgpass.html
AIIDALOCALTMPDUMPFILE=~/.aiida/${AIIDADB}-backup.psql.gz


if [ -e ${AIIDALOCALTMPDUMPFILE} ]
then
    mv ${AIIDALOCALTMPDUMPFILE} ${AIIDALOCALTMPDUMPFILE}~
fi

# NOTE: password stored in ~/.pgpass, where pg_dump will read it automatically
pg_dump -h localhost -p $AIIDAPORT -U $AIIDAUSER $AIIDADB | gzip > $AIIDALOCALTMPDUMPFILE || rm $AIIDALOCALTMPDUMPFILE

Before launching the script you need to create the file ~/.pgpass to avoid having to enter your database password each time you use the script. It should look like (.pgpass):

localhost:5432:aiidadb:aiida:YOUR_DATABASE_PASSWORD

where YOUR_DATABASE_PASSWORD is the password you set up for the database.

Note

Do not forget to put this file in ~/ and to name it .pgpass. Remember also to give it the right permissions (read and write): chmod u=rw .pgpass.

To dump the database to a file automatically every day, you can add the following script backup-aiidadb-USERNAME in /etc/cron.daily/, which will launch the previous script once per day:

#!/bin/bash
su USERNAME -c "/home/USERNAME/.aiida/backup_postgresql.sh"

where all instances of USERNAME are replaced by your actual user name. The su USERNAME makes the dumped file owned by you rather than by root. Remember to give the script the right permissions:

sudo chmod +x /etc/cron.daily/backup-aiidadb-USERNAME

Finally, make sure that your .aiida folder (/home/USERNAME/.aiida/), containing this dump file and the repository directory, is properly backed up by your backup software (under Ubuntu, Backup -> check the “Folders” tab).

Note

If your database is very large (more than a few hundred thousand nodes and workflows), a standard backup of your repository folder will be very slow (up to days), thus slowing down your computer dramatically. To fix this problem you can set up an incremental backup of your repository by following the instructions here.

Restore database backup

In order to restore the database from a backup, you first have to create an empty database following the instructions described above in “Setup instructions: PostgreSQL”, except for the verdi install phase. Once you have created your empty database with the same name as the backed-up one, type the following command:

psql -h localhost -U aiida -d aiidadb -f aiidadb-backup.psql
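
Note that the backup script shown above produces a gzipped dump (e.g. ~/.aiida/aiidadb-backup.psql.gz). Assuming that is the file you want to restore from, you would first decompress it (on recent versions of gzip, the -k option keeps the compressed copy):

gunzip -k ~/.aiida/aiidadb-backup.psql.gz

and then run the psql command above on the resulting .psql file.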
Move database

It might happen that you need to move the physical location of the database files on your hard-drive (for instance, due to the lack of space in the partition where it is located). Below we explain how to do it.

First, make sure you have a backup of the full database (see instructions here), and that the AiiDA daemon is not running. Then, become the UNIX postgres user, typing as root:

su - postgres

(or, equivalently, type sudo su - postgres, depending on your distribution).

Then enter the postgres shell:

psql

and look for the current location of the data directory:

SHOW data_directory;

Typically you should get something like /var/lib/postgresql/9.1/main.

Note

In the above, 9.1 is replaced by the actual version number of your postgres distribution (the same applies to the remainder of the section).

Note

If you are experiencing memory problems and cannot enter the postgres shell, you can look directly into the file /etc/postgresql/9.1/main/postgresql.conf and check out the line defining the variable data_directory.

Then exit the shell with \q, and stop the postgres database daemon:

service postgresql stop

Copy all the files and folders from the postgres data directory into the new directory:

cp -a SOURCE_DIRECTORY DESTINATION_DIRECTORY

where SOURCE_DIRECTORY is the directory you got from the SHOW data_directory; command, and DESTINATION_DIRECTORY is the new directory for the database files.

Note

The behaviour of the cp -a command is to create a directory into DESTINATION_DIRECTORY, e.g.:

cp -a OLD_DIR/main/ NEW_DIR/

will create the directory main into NEW_DIR.

Make sure the permissions, owner and group are the same in the old and new directory (including all levels above the DESTINATION_DIRECTORY). The owner and group should both be postgres, with the notable exception of some symbolic links in server.crt and server.key (these files might be absent, depending on your postgresql version number).

Note

If the permissions of these links need to be changed, use the -h option of chown to avoid changing the permissions of the destination of the links. In case you have changed the permission of the links destination by mistake, they should typically be (beware that this might depend on your actual distribution!):

-rw-r--r-- 1 root root 989 Mar  1  2012 /etc/ssl/certs/ssl-cert-snakeoil.pem
-rw-r----- 1 root ssl-cert 1704 Mar  1  2012 /etc/ssl/private/ssl-cert-snakeoil.key

Then you can change the postgres configuration file, that should typically be located here:

/etc/postgresql/9.1/main/postgresql.conf

Make a backup version of this file, then look for the line defining data_directory and replace it with the new data directory path:

data_directory = 'NEW_DATA_DIRECTORY'

Then start again the database daemon:

service postgresql start

You can check that the data directory has indeed changed:

psql
SHOW data_directory;
\q

Before definitively removing the previous location of the database files, first rename it and test AiiDA with the new database location (e.g. run simple queries like verdi code list or create a node and store it). If everything went fine, you can delete the old database location.

Setup repository backup

Apart from the database backup, you should also back up the AiiDA repository. For small repositories, this can easily be done with a simple directory copy or, even better, with the rsync command, which copies only the differences. However, neither of these approaches is efficient for big repositories, where even a partial recursive directory listing may take significant time, especially on filesystems where accessing a directory has a constant (and significant) latency. Therefore, we provide scripts for making efficient backups of the AiiDA repository.

Before running the backup script, you will have to configure it. To do so, execute the backup_setup.py script located under MY_AIIDA_FOLDER/aiida/common/additions/backup_script. For example:

verdi -p PROFILENAME run MY_AIIDA_FOLDER/aiida/common/additions/backup_script/backup_setup.py

where PROFILENAME is the name of the profile you want to use (if you don’t specify the -p option, the default profile will be used). This will ask a set of questions. More precisely, it will initially ask for:

  • The backup folder. This is the destination of the backup configuration file. By default a folder named backup in your .aiida directory is proposed to be created.
  • The destination folder of the backup. This is the destination folder of the files to be backed up. By default it is a folder inside the aforementioned backup folder (e.g. /home/aiida_user/.aiida/backup/backup_dest).

Note

You should back up the repository on a different disk than the one containing the AiiDA repository! If you just use the same disk, you don’t have any protection against the most common cause of data loss: disk failure. The best option is to use a destination folder mounted over ssh. For this you need to install sshfs (under Ubuntu: sudo apt-get install sshfs).

E.g. Imagine that you run your calculations on server_1 and you would like to take regular repository backups to server_2. Then, you could mount a server_2 directory via sshfs on server_1 using the following command on server_1:

sshfs -o idmap=user -o rw backup_user@server_2:/home/backup_user/backup_destination_dir/ /home/aiida_user/remote_backup_dir/

You should put this line into the actions performed at start-up (under gnome you can access them by typing gnome-session-properties in a terminal), so that the remote directory is mounted automatically after a reboot (but do not put it in your .bashrc file otherwise each time you open a new terminal, your computer will complain that the mount point is not empty...).

A template backup configuration file (backup_info.json.tmpl) will be copied into the backup folder. You can set the backup variables yourself after renaming the template file to backup_info.json, or you can answer the questions asked by the script, and then backup_info.json will be created based on your answers.

The main script backs up the AiiDA repository that is referenced by the current AiiDA database. The script will start from the oldest_object_backedup date or the date of the oldest node/workflow object found, and it will periodically back up (in periods of periodicity days) until the ending date of the backup specified by end_date_of_backup or days_to_backup.

The backup parameters to be set in the backup_info.json are:

  • periodicity (in days): The backup runs periodically for a number of days defined in the periodicity variable. The purpose of this variable is to limit the backup to run only on a limited number of days, and therefore to limit the number of files that are backed up at every round. E.g. "periodicity": 2. Example: if you have files in the AiiDA repositories created in the past 30 days, and periodicity is 15, the first run will back up the files of the first 15 days; a second run of the script will back up the next 15 days, completing the backup (if it is run within the same day). Further runs will only back up newer files, if they are created.
  • oldest_object_backedup (timestamp or null): This is the timestamp of the oldest object that was backed up. If you are not aware of this value, or if it is the first time that you start a backup for this repository, then set this value to null. The script will then search the creation date of the oldest workflow or node object in the database and start the backup from that date. E.g. "oldest_object_backedup": "2015-07-20 11:13:08.145804+02:00"
  • end_date_of_backup: If set, the backup script will backup files that have a modification date until the value specified by this variable. If not set, the ending of the backup will be set by the following variable (days_to_backup) which specifies how many days to backup from the start of the backup. If none of these variables are set (end_date_of_backup and days_to_backup), then the end date of backup is set to the current date. E.g. "end_date_of_backup": null or "end_date_of_backup": "2015-07-20 11:13:08.145804+02:00"
  • days_to_backup: If set, you specify how many days you will back up from the starting date of your backup. If it is set to null and also end_date_of_backup is set to null, then the end date of the backup is set to the current date. You cannot set days_to_backup and end_date_of_backup at the same time (it will lead to an error). E.g. "days_to_backup": null or "days_to_backup": 5
  • backup_length_threshold (in hours): The backup script runs in rounds and on every round it backs up a number of days that are controlled primarily by periodicity and also by end_date_of_backup / days_to_backup, for the last backup round. The backup_length_threshold specifies the lowest acceptable round length. This is important for the end of the backup.
  • backup_dir: The destination directory of the backup. e.g. "backup_dir": "/home/aiida_user/.aiida/backup/backup_dest"
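
Putting these keys together, a hypothetical backup_info.json for a first backup of a repository could look like the following (all values are only illustrative):

{
    "periodicity": 2,
    "oldest_object_backedup": null,
    "end_date_of_backup": null,
    "days_to_backup": null,
    "backup_length_threshold": 1,
    "backup_dir": "/home/aiida_user/.aiida/backup/backup_dest"
}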

To start the backup, run the start_backup.py script. Run it as often as needed to complete a full backup, and then run it periodically (e.g. calling it from a cron script, for instance every day) to back up new changes.

Note

You can set up a cron job using the following command:

sudo crontab -u aiida_user -e

It will open an editor where you can add a line of the form:

00 03 * * * /home/aiida_user/.aiida/backup/start_backup.py 2>&1 | mail -s "Incremental backup of the repository" aiida_user_email@domain.net

or (if you need to backup a different profile than the default one):

00 03 * * * verdi -p PROFILENAME run /home/aiida_user/.aiida/backup/start_backup.py 2>&1 | mail -s "Incremental backup of the repository" aiida_user_email@domain.net

This will launch the backup of the database every day at 3 AM, and send the output (or any error message) to the email address of the user (provided the mail command – from mailutils – is configured appropriately).

Finally, do not forget to exclude the repository folder from the normal backup of your home directory!

Tutorials

Here you will find all sorts of tutorials that show how to work with AiiDA and the various plug-ins that are available for it.

StructureData
General comments

This section contains an example of how you can use the StructureData object to create complex crystals.

With the StructureData class we did not try to have a full set of features to manipulate crystal structures. Indeed, other libraries such as ASE exist, and we simply provide easy ways to convert between the ASE and the AiiDA formats. On the other hand, we tried to define a “standard” format for structures in AiiDA, that can be used across different codes.

Tutorial

Take a look at the following example:

alat = 4. # angstrom
cell = [[alat, 0., 0.,],
        [0., alat, 0.,],
        [0., 0., alat,],
       ]
s = StructureData(cell=cell)
s.append_atom(position=(0.,0.,0.), symbols='Fe')
s.append_atom(position=(alat/2.,alat/2.,alat/2.), symbols='O')

With the commands above, we have created a crystal structure s with a cubic unit cell and lattice parameter of 4 angstrom, and two atoms in the cell: one iron (Fe) atom in the origin, and one oxygen (O) at the center of the cube (this cell has been just chosen as an example and most probably does not exist).

Note

As you can see in the example above, both the cell coordinates and the atom coordinates are expressed in angstrom, and the positions of the atoms are given in a global absolute reference frame.

In this way, any periodic structure can be defined. If you want to import from ASE in order to specify the coordinates, e.g., in terms of the crystal lattice vectors, see the guide on the conversion to/from ASE below.

When using the append_atom() method, further parameters can be passed. In particular, one can specify the mass of the atom, particularly important if you want e.g. to run a phonon calculation. If no mass is specified, the mass provided by NIST (retrieved in October 2014) is going to be used. The list of masses is stored in the module aiida.common.constants, in the elements dictionary.

Moreover, in the StructureData class of AiiDA we also support the storage of crystal structures with alloys, vacancies or partial occupancies. In this case, the argument of the parameter symbols should be a list of symbols, if you want to consider an alloy; moreover, you must pass a weights list, with the same length as symbols, and with values between 0. (no occupancy) and 1. (full occupancy), to specify the fractional occupancy of that site for each of the symbols specified in the symbols list. The sum of all occupancies must be lower or equal to one; if the sum is lower than one, it means that there is a given probability of having a vacancy at that specific site position.

As an example, you could use:

s.append_atom(position=(0.,0.,0.),symbols=['Ba','Ca'],weights=[0.9,0.1])

to add a site at the origin of a structure s consisting of an alloy of 90% of Barium and 10% of Calcium (again, just an example).

The following line instead:

s.append_atom(position=(0.,0.,0.),symbols='Ca',weights=0.9)

would create a site with 90% probability of being occupied by Calcium, and 10% of being a vacancy.

Utility methods s.is_alloy() and s.has_vacancies() can be used to verify, respectively, whether more than one element is given in the symbols list, and whether the sum of all weights is smaller than one.

Note

if you pass more than one symbol, the method s.is_alloy() will always return True, even if only one symbol has occupancy 1. and all others have occupancy zero:

>>> s = StructureData(cell=[[4,0,0],[0,4,0],[0,0,4]])
>>> s.append_atom(position=(0.,0.,0.), symbols=['Fe', 'O'], weights=[1.,0.])
>>> s.is_alloy()
True
Internals: Kinds and Sites

Internally, the append_atom() method works by manipulating the kinds and sites of the current structure. Kinds are instances of the Kind class and represent a chemical species, with given properties (composing element or elements, occupancies, mass, ...) and identified by a label (normally, simply the element chemical symbol).

Sites are instances of the Site class and represent instead each single site. Each site refers to a Kind to identify its properties (which element it is, the mass, ...) and to its three spatial coordinates.

The append_atom() method works in the following way:

  • It creates a new Kind class with the properties passed as parameters (i.e., all parameters except position).

  • It tries to identify if an identical Kind already exists in the list of kinds of the structure (e.g., if the same atom with the same mass was already added previously). Comparison of kinds is performed using aiida.orm.data.structure.Kind.compare_with(), and in particular it returns True if the mass and the lists of symbols and of weights are identical (within a threshold). If an identical kind k is found, it simply adds a new site referencing kind k and with the provided position. Otherwise, it appends k to the list of kinds of the current structure and then creates the site referencing k. The name of the kind is chosen, by default, equal to the name of the chemical symbol (e.g., “Fe” for iron).

  • If you pass more than one species for the same chemical symbol, but e.g. with different masses, a new kind is created and its name is obtained by appending an integer to the chemical symbol. For instance, the following lines:

    s.append_atom(position = [0,0,0], symbols='Fe', mass = 55.8)
    s.append_atom(position = [1,1,1], symbols='Fe', mass = 57)
    s.append_atom(position = [1,1,1], symbols='Fe', mass = 59)
    

    will automatically create three kinds, all for iron, with names Fe, Fe1 and Fe2, and masses 55.8, 57. and 59. respectively.

  • In case of alloys, the kind name is obtained by concatenating all the chemical symbol names (plus an X if the sum of weights is less than one). The same rules as above are used to append a digit to the kind name, if needed.

  • Finally, you can simply specify the kind_name to automatically generate a new kind with a specific name. This is the case if you want a name different from the automatically generated one, or for instance if you want to create two different species with the same properties (same mass, symbols, ...). This is needed for instance in Quantum ESPRESSO in order to describe an antiferromagnetic crystal, with different magnetizations on the different atoms in the unit cell.

    In this case, you can for instance use:

    s.append_atom(position = [0,0,0], symbols='Fe', mass = 55.845, name='Fe1')
    s.append_atom(position = [2,2,2], symbols='Fe', mass = 55.845, name='Fe2')
    

    to create two species Fe1 and Fe2 for iron, with the same mass.

    Note

    You do not need to specify explicitly the mass if the default one is ok for you. However, when you pass explicitly a name and it coincides with the name of an existing species, all properties that you specify must be identical to the ones of the existing species, or the method will raise an exception.

    Note

    If you prefer to work with the internal Kind and Site classes, you can obtain the same result as the two lines above with:

    from aiida.orm.data.structure import Kind, Site
    s.append_kind(Kind(symbols='Fe', mass=55.845, name='Fe1'))
    s.append_kind(Kind(symbols='Fe', mass=55.845, name='Fe2'))
    s.append_site(Site(kind_name='Fe1', position=[0.,0.,0.]))
    s.append_site(Site(kind_name='Fe2', position=[2.,2.,2.]))
    
Conversion to/from ASE

If you have an AiiDA structure, you can get an ase.Atoms object by just calling the get_ase method:

ase_atoms = aiida_structure.get_ase()

Note

As we support alloys and vacancies in AiiDA, while ase.Atoms does not, it is not possible to export to ASE a structure with vacancies or alloys.

If instead you have an ASE Atoms object and you want to load the structure from it, just pass it when initializing the class:

StructureData = DataFactory('structure')
# or:
# from aiida.orm.data.structure import StructureData
aiida_structure = StructureData(ase = ase_atoms)
Creating multiple species

We implemented the possibility of specifying different Kinds (species) in the ase.Atoms object and then importing them.

In particular, if you specify atoms with different mass in ASE, during the import phase different kinds will be created:

>>> import ase
>>> StructureData = DataFactory("structure")
>>> asecell = ase.Atoms('Fe2')
>>> asecell[0].mass = 55.
>>> asecell[1].mass = 56.
>>> s = StructureData(ase=asecell)
>>> for kind in s.kinds:
>>>     print kind.name, kind.mass
Fe 55.0
Fe1 56.0

Moreover, even if the mass is the same, but you want to get different species, you can use the ASE tags to specify the number to append to the element symbol in order to get the species name:

>>> import ase
>>> StructureData = DataFactory("structure")
>>> asecell = ase.Atoms('Fe2')
>>> asecell[0].tag = 1
>>> asecell[1].tag = 2
>>> s = StructureData(ase=asecell)
>>> for kind in s.kinds:
>>>     print kind.name
Fe1
Fe2

Note

in complicated cases (multiple tags, masses, ...), it is possible that exporting an AiiDA structure to ASE and then importing it again will not perfectly preserve the kinds and kind names.

Conversion to/from pymatgen

AiiDA structures can be converted to pymatgen’s Molecule and Structure objects by using, respectively, the get_pymatgen_molecule and get_pymatgen_structure methods:

pymatgen_molecule  = aiida_structure.get_pymatgen_molecule()
pymatgen_structure = aiida_structure.get_pymatgen_structure()

A single method get_pymatgen can be used for both tasks: converting periodic structures (periodic boundary conditions are met in all three directions) to pymatgen’s Structure and other structures to pymatgen’s Molecule:

pymatgen_object = aiida_structure.get_pymatgen()

It is also possible to convert pymatgen’s Molecule and Structure objects to AiiDA structures:

StructureData = DataFactory("structure")
from_mol      = StructureData(pymatgen_molecule=mol)
from_struct   = StructureData(pymatgen_structure=struct)

Also in this case, a generic converter is provided:

StructureData = DataFactory("structure")
from_mol      = StructureData(pymatgen=mol)
from_struct   = StructureData(pymatgen=struct)

Note

Converters work with version 3.0.13 or later of pymatgen. Earlier versions may cause errors.

Pseudopotentials
Introduction: Pseudopotential families

As you might have seen in the previous PWscf tutorial, the procedure of attaching a pseudopotential file to each atomic species can be a bit tedious. In many situations, you will not produce a different pseudopotential file for every calculation you do. More likely, when you start a project you will stick to a pseudopotential file for as long as possible. Moreover, in a high-throughput study, you will likely want to run calculations over several elements while keeping the same functional. That’s also part of the reason why there are several projects (like PSLibrary or GBRV, to name a few) that intend to develop a set of pseudopotentials covering most of the periodic table for different functionals.

That’s why we added the pseudopotential families. Each family is a set of pseudopotentials that are grouped together in a special type of AiiDA Group of nodes. Within each family, at most one pseudopotential can be present for a given chemical element.

Of course, a pseudopotential family does not have to completely cover the periodic table (also because such pseudopotential sets do not exist). This means that you can create a pseudopotential family containing only the pseudopotentials for a few elements that you are interested in.

Note

In principle, you can group different kinds of pseudopotentials into the same family. It is your responsibility to group only those with the same type, or obtained using the same functionals, approximations and / or levels of theory.

Creating a pseudopotential family

Note

The following commands are specific to the Quantum ESPRESSO interface. For interfaces to other codes, please refer to the respective plugin documentation.

In the following, we will go through creating a pseudopotential family. First, you need to collect the pseudopotential files which should go into the family in a single folder – we’ll call it path/to/folder. You can then add the family to the AiiDA database with verdi:

verdi data upf uploadfamily path/to/folder name_of_the_family "some description for your convenience"

where name_of_the_family should be a unique name for the family, and the final parameter is a string that is set in the description field of the group.

If a pseudopotential family with the same name_of_the_family already exists, the pseudopotentials in the folder will be added to the existing group. The code will raise an error if you try to add two (different) pseudopotentials for the same element.

After the upload (which may take some seconds, so please be patient) the UPF family will be ready to be used.
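
Once uploaded, the whole family can be attached to a calculation in one go; for instance, as was done in the workflow example earlier in this guide:

calc.use_pseudos_from_family('name_of_the_family')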

Hint

If you upload pseudopotentials which are already present in your database, AiiDA will use the existing UPFData node instead of creating a duplicate one. You can use the optional flag --stop-if-existing to instead abort (without changing anything in the database) if an existing pseudopotential is found.

Getting the list of existing families

To see which pseudopotential families already exist in the database, type

verdi data upf listfamilies

Add a -d (or --with-description) flag if you also want to read the description of each family.

You can also filter the groups to get only a list of those containing a given set of elements using the -e option. For instance, if you want to get only the families containing the elements Ba, Ti and O, use

verdi data upf listfamilies -e Ba Ti O

For more help on the command line options, type

verdi data upf listfamilies -h
Manually adding pseudopotentials

If you do not want to use pseudopotentials from a family, it is also possible to manually add them to the database (even though we discourage this in general).

A possible way of doing it is the following: we start by creating a list of pseudopotential filenames that we need to use:

raw_pseudos = [
   "Ba.pbesol-spn-rrkjus_psl.0.2.3-tot-pslib030.UPF",
   "Ti.pbesol-spn-rrkjus_psl.0.2.3-tot-pslib030.UPF",
   "O.pbesol-n-rrkjus_psl.0.1-tested-pslib030.UPF"]

In this simple example, we expect the pseudopotentials to be in the same folder as the script. Then, we loop over the filenames and add them to the AiiDA database. The get_or_create method checks if the pseudopotential is already in the database and either stores it, or just returns the node already present in the database. The second value returned is a boolean and tells us if the pseudopotential was already present or not. We also store the returned nodes in a list (pseudos_to_use).

UpfData = DataFactory('upf')
pseudos_to_use = []

for filename in raw_pseudos:
    absname = os.path.abspath(filename)
    pseudo, created = UpfData.get_or_create(absname,use_first=True)
    pseudos_to_use.append(pseudo)

As the last step, we loop over the pseudopotentials and attach each one to the calculation:

for pseudo in pseudos_to_use:
    calc.use_pseudo(pseudo, kind=pseudo.element)

Note

When the pseudopotential is created, it is parsed and the element to which it refers is stored in the database; it can be accessed using the pseudo.element property, as shown above.

Plugin-specific tutorials

There are a number of additional tutorials you might want to follow. These are typically included in the documentation of each plugin. Below we mention a few notable examples, but feel free to check the documentation of your favourite plugin (and don’t hesitate to ask the developer to write a tutorial, if there isn’t any!)

Quantum ESPRESSO tutorials

The AiiDA plugins for Quantum ESPRESSO provide a set of tutorials for some of the main Quantum ESPRESSO codes (PW, PH, CP, and one of the post-processing tools PROJWFC). You can find them on the plugin documentation page.

Import and Export

AiiDA data

AiiDA allows its users to exchange parts of their graphs containing already executed calculations, as well as related nodes like their inputs and outputs. Exchanging such information among AiiDA instances, or even among users of the same instance, is not a simple task. Two tools are provided that facilitate the exchange of AiiDA information.

Export

The export tool can take as input various parameters allowing the user to export specific nodes based on their identifier or nodes belonging to a specific group. Given a set of nodes, the export function automatically selects all the parents and the direct outputs of the selected calculations (this can be overridden by the user).

The idea behind this automatic selection is that when a node is exported, very likely, we would like to know how we arrived at the generation of this node. The same holds for calculation nodes. When a calculation is exported, it doesn’t make much sense to export it without also providing the results of that calculation. The exported data (database information but also files) is stored in a single file, which can also be compressed if the user provides the corresponding option during the export.

Import

Import is less parameterizable than export. The user just has to provide a path to the file to be imported (a file-system path or a URL) and AiiDA will import the needed information, checking for and avoiding possible identifier collisions and/or node duplication.
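
The same can be done programmatically; a minimal sketch, again assuming the aiida.orm.importexport module of AiiDA 0.x (the command line equivalent is verdi import):

# Sketch only: import an export file into the current profile.
from aiida.orm.importexport import import_data

# Nodes whose UUID is already present in the database are not duplicated.
import_data('my_export.aiida')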

File format

The result of the export is a single file which contains all the needed information for a successful import. Let’s see more closely what is inside this file. If you extract it, you will find the following files and directories:

-rw-rw-r--  1 aiida aiida 165336 Nov 29 16:39 data.json
-rw-rw-r--  1 aiida aiida   1848 Nov 29 16:39 metadata.json
drwxrwx--- 70 aiida aiida   4096 Nov 29 16:39 nodes/
  • metadata.json - information about the schema of the exported database information.
  • data.json - information about the exported database nodes, following the format described in metadata.json. The links between the nodes are stored in this file too.
  • nodes directory - the repository files that correspond to the exported nodes.
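
If you only want to peek into an export file without importing it, a short Python sketch such as the following can be used. It assumes the archive is a plain zip or tar file (covering the default format as well as the compressed option mentioned above); the file name is a placeholder:

# Sketch: read metadata.json from an export archive and print the versions.
import json
import tarfile
import zipfile

archive = 'my_export.aiida'   # placeholder file name

if zipfile.is_zipfile(archive):
    with zipfile.ZipFile(archive) as handle:
        metadata = json.loads(handle.read('metadata.json'))
else:
    with tarfile.open(archive) as handle:
        metadata = json.load(handle.extractfile('metadata.json'))

print 'export_version:', metadata['export_version']
print 'aiida_version: ', metadata['aiida_version']
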
metadata.json

This file contains important information that is necessary for the correct interpretation of data.json. Apart from the data schema, the AiiDA version and the export file version are also mentioned; this is used to avoid incompatibilities among different versions of AiiDA. It should be noted that the schema described in metadata.json refers to the data itself - an abstracted schema focused on the extracted information - and not to how the data is stored in the database (the database schema). This makes the import/export mechanism transparent to the database system used, the backend selected and the way the data is organised in the database.

Let’s have a look at the contents of the metadata.json:

{
    "export_version": "0.2",
    "aiida_version": "0.6.0",
    "unique_identifiers": {
        "aiida.backends.djsite.db.models.DbComputer": "uuid",
        "aiida.backends.djsite.db.models.DbGroup": "uuid",
        "aiida.backends.djsite.db.models.DbUser": "email",
        "aiida.backends.djsite.db.models.DbNode": "uuid",
        "aiida.backends.djsite.db.models.DbAttribute": null,
        "aiida.backends.djsite.db.models.DbLink": null
    },
    "all_fields_info": {
        "aiida.backends.djsite.db.models.DbComputer": {
            "description": {},
            "transport_params": {},
            "hostname": {},
            "enabled": {},
            "name": {},
            "transport_type": {},
            "metadata": {},
            "scheduler_type": {},
            "uuid": {}
        },
        "aiida.backends.djsite.db.models.DbLink": {
            "input": {
                "related_name": "output_links",
                "requires": "aiida.backends.djsite.db.models.DbNode"},
            "label": {},
            "output": {
                "related_name": "input_links",
                "requires": "aiida.backends.djsite.db.models.DbNode"}
        },
        "aiida.backends.djsite.db.models.DbUser": {
            "first_name": {},
            "last_name": {},
            "email": {},
            "institution": {}
        },
        "aiida.backends.djsite.db.models.DbNode": {
            "nodeversion": {},
            "description": {},
            "dbcomputer": {
                "related_name": "dbnodes",
                "requires": "aiida.backends.djsite.db.models.DbComputer"},
            "ctime": {
                "convert_type": "date"},
            "user": {
                "related_name": "dbnodes",
                "requires": "aiida.backends.djsite.db.models.DbUser"},
            "mtime": {
                "convert_type": "date"},
            "label": {},
            "type": {},
            "public": {},
            "uuid": {}
        },
        "aiida.backends.djsite.db.models.DbAttribute": {
            "dbnode": {
                "related_name": "dbattributes",
                "requires": "aiida.backends.djsite.db.models.DbNode"
            },
            "dval": {
                "convert_type": "date"},
            "datatype": {},
            "fval": {},
            "tval": {},
            "key": {},
            "ival": {},
            "bval": {}
        },
        "aiida.backends.djsite.db.models.DbGroup": {
            "description": {},
            "name": {},
            "user": {
                "related_name": "dbgroups",
                "requires": "aiida.backends.djsite.db.models.DbUser"},
            "time": {
                "convert_type": "date"},
            "type": {},
            "uuid": {}
        }
    }
}

At the beginning of the file we see the version of the export file and the version of the AiiDA code.

The exported entities are then listed together with their unique identifiers. Knowing the unique identifiers is useful to avoid duplication (i.e. to avoid inserting the same node multiple times).

Then, in all_fields_info, the properties of each entity are listed, together with the correlations to other entities. For example, the entity aiida.backends.djsite.db.models.DbNode is related to a computer and a user; the corresponding entity names appear nested next to the properties to express this correlation.

data.json

A sample of the data.json file follows:

{
    "links_uuid": [
        {
            "output": "c208c9da-23b4-4c32-8f99-f9141ab28363",
            "label": "parent_calc_folder",
            "input": "eaaa114d-3d5b-42eb-a269-cf0e7a3a935d"
        },
        ...
    ],
    "export_data": {
        "aiida.backends.djsite.db.models.DbUser": {
            "2": {
                "first_name": "AiiDA",
                "last_name": "theossrv2",
                "institution": "EPFL, Lausanne",
                "email": "aiida@theossrv2.epfl.ch"
            },
            ...
        },
        "aiida.backends.djsite.db.models.DbComputer": {
            "1": {
                "name": "theospc14-direct_",
                "transport_params": "{}",
                "description": "theospc14 (N. Mounet's PC) with direct scheduler",
                "hostname": "theospc14.epfl.ch",
                "enabled": true,
                "transport_type": "ssh",
                "metadata": "{\"default_mpiprocs_per_machine\": 8, \"workdir\": \"/scratch/{username}/aiida_run/\", \"append_text\": \"\", \"prepend_text\": \"\", \"mpirun_command\": [\"mpirun\", \"-np\", \"{tot_num_mpiprocs}\"]}",
                "scheduler_type": "direct",
                "uuid": "fb7729ff-8254-4bc0-bbec-acbdb573cfe2"
            },
            ...
        },
        "aiida.backends.djsite.db.models.DbNode": {
            "5921143": {
                "uuid": "628ba258-ccc1-47bf-bab7-8aee64b563ea",
                "description": "",
                "dbcomputer": null,
                "label": "",
                "user": 2,
                "mtime": "2016-08-21T11:55:53.132925",
                "nodeversion": 1,
                "type": "data.parameter.ParameterData.",
                "public": false,
                "ctime": "2016-08-21T11:55:53.118306"
            },
            "20063": {
                "uuid": "1024e35e-166b-4104-95f6-c1706df4ce15",
                "description": "",
                "dbcomputer": 1,
                "label": "",
                "user": 2,
                "mtime": "2016-02-16T10:33:54.095973",
                "nodeversion": 16,
                "type": "calculation.job.codtools.ciffilter.CiffilterCalculation.",
                "public": false,
                "ctime": "2015-10-02T20:08:06.628472"
            },
            ...
        }
    },
    "groups_uuid": {

    },
    "node_attributes_conversion": {
        "5921143": {
            "CONTROL": {
                "calculation": null,
                "restart_mode": null,
                "max_seconds": null
            },
            "ELECTRONS": {
                "electron_maxstep": null,
                "conv_thr": null
            },
            "SYSTEM": {
                "ecutwfc": null,
                "input_dft": null,
                "occupations": null,
                "degauss": null,
                "smearing": null,
                "ecutrho": null
            }
        },
        "20063": {
            "retrieve_list": [
                null,
                null,
                null,
                null
            ],
            "last_jobinfo": null,
            "scheduler_state": null,
            "parser": null,
            "linkname_retrieved": null,
            "jobresource_params": {
                "num_machines": null,
                "num_mpiprocs_per_machine": null,
                "default_mpiprocs_per_machine": null
            },
            "remote_workdir": null,
            "state": null,
            "max_wallclock_seconds": null,
            "retrieve_singlefile_list": [

            ],
            "scheduler_lastchecktime": "date",
            "job_id": null
        },
        ...
    },
    "node_attributes": {
        "5921143": {
            "CONTROL": {
                "calculation": "vc-relax",
                "restart_mode": "from_scratch",
                "max_seconds": 83808
            },
            "ELECTRONS": {
                "electron_maxstep": 100,
                "conv_thr": 3.6e-10
            },
            "SYSTEM": {
                "ecutwfc": 90.0,
                "input_dft": "vdw-df2-c09",
                "occupations": "smearing",
                "degauss": 0.02,
                "smearing": "cold",
                "ecutrho": 1080.0
            }
        },
        "20063": {
            "retrieve_list": [
                "aiida.out",
                "aiida.err",
                "_scheduler-stdout.txt",
                "_scheduler-stderr.txt"
            ],
            "last_jobinfo": "{\"job_state\": \"DONE\", \"detailedJobinfo\": \"AiiDA MESSAGE: This scheduler does not implement the routine get_detailed_jobinfo to retrieve the information on a job after it has finished.\", \"job_id\": \"13489\"}",
            "scheduler_state": "DONE",
            "parser": "codtools.ciffilter",
            "linkname_retrieved": "retrieved",
            "jobresource_params": {
                "num_machines": 1,
                "num_mpiprocs_per_machine": 1,
                "default_mpiprocs_per_machine": 8
            },
            "remote_workdir": "/scratch/aiida/aiida_run/10/24/e35e-166b-4104-95f6-c1706df4ce15",
            "state": "FINISHED",
            "max_wallclock_seconds": 900,
            "retrieve_singlefile_list": [

            ],
            "scheduler_lastchecktime": "2015-10-02T20:30:36.481951",
            "job_id": "13489"
        "6480111": {
        },
        ...
    }
}

At the start of the JSON file, the links among the various AiiDA nodes are listed (links_uuid field). For every link, the UUIDs (universally unique identifiers) of the connected nodes are given, together with the label of the link.

Then the export_data section follows, where the data of every entity appear. It is worth observing the references between instances of the various entities: for example, the DbNode with identifier 5921143 belongs to the user with identifier 2, while the DbNode with identifier 20063 is additionally associated with the computer with identifier 1.

The entity names are, for the moment, references to the model classes of the Django backend. This holds for both backends (Django and SQLAlchemy), ensuring that export files are cross-backend compatible. These names will change to more abstract ones in the future.

If any groups are exported, they are mentioned in the corresponding field (groups_uuid).

The attributes of the exported nodes are described in the final part of the JSON file, using the identifier of the corresponding node as key. The field node_attributes_conversion contains information on the type of each attribute: for example, dates are not natively supported by JSON, so it is stated explicitly whenever the value of an attribute is of that type. After node_attributes_conversion, the node_attributes section follows with the actual values.
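
To make this concrete, the following sketch reads data.json from an extracted archive and converts the date attributes of one node back into datetime objects, using the node_attributes_conversion information described above (the node identifier is the one from the sample and is just a placeholder):

# Sketch: rebuild the attributes of one exported node, converting the values
# marked as "date" in node_attributes_conversion back to datetime objects.
import json
from datetime import datetime

with open('data.json') as handle:
    data = json.load(handle)

node_id = '20063'   # placeholder identifier taken from the sample above
attributes = data['node_attributes'][node_id]
conversions = data['node_attributes_conversion'][node_id]

for key, conversion in conversions.iteritems():
    if conversion == 'date':
        # dates are serialized as ISO-8601 strings, e.g. "2015-10-02T20:30:36.481951"
        attributes[key] = datetime.strptime(attributes[key], '%Y-%m-%dT%H:%M:%S.%f')

print attributes['scheduler_lastchecktime']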

Import structures from external databases

We support the import of structures from external databases. The base class that defines the API for the importers can be found here: DbImporter. Below, you can find a list of existing plugins that have already been implemented.

Available plugins
ICSD database importer

In this section we explain how to import CIF files from the ICSD database using the IcsdDbImporter class.

Before being able to query ICSD, provided by FIZ Karlsruhe, you should have the intranet database installed on a server (http://www.fiz-karlsruhe.de/icsd_intranet.html). Follow the installation as described in the manual.

It is necessary to know the URL of the ICSD web interface and to have access to the full database from the local machine.

You can either query the MySQL database or the web page; the latter is restricted to a maximum of 1000 search results, which makes it unsuitable for data mining, so it is better to set up the MySQL connection.

Setup

An instance of the IcsdDbImporter can be created as follows:

importer = aiida.tools.dbimporters.plugins.icsd.IcsdDbImporter(server="http://ICSDSERVER.com/", host="127.0.0.1")

Here is a list of the most important input parameters with an explanation.

For both connection types (web and SQL):

  • server: address of web interface of the icsd database; it should contain both the protocol and the domain name and end with a slash; example:

    server = "http://ICSDSERVER.com/"
    

The following parameters are required only for the mysql query:

  • host: database host name address.

    Tip

    If the database is not hosted on your local machine, it can be useful to create an ssh tunnel to the 3306 port of the database host:

    ssh -L 3306:localhost:3306 username@icsddbhostname.com
    

    If you get a URLError with Errno 111 (Connection refused) when you query the database, use the following instead:

    ssh -L 3306:localhost:3306 -L 8010:localhost:80 username@icsddbhostname.com
    

    The database can then be accessed using “127.0.0.1” as host:

    host = "127.0.0.1"
    
  • user / pass_wd / db / port: Login username, password, name of database and port of your mysql database.

    If the standard installation of ICSD intranet version has been followed, the default values should work. Otherwise contact your system administrator to get the required information:

    user = "dba", pass_wd = "sql", db = "icsd", port = 3306
    

Other settings:

  • querydb: If True (default) the mysql database is queried, otherwise the web page is queried.

A more detailed documentation and additional settings are found under IcsdDbImporter.

How to do a query

If the setup worked, you can do your first query:

cif_nr_list = ["50542","617290","35538"]

queryresults = importer.query(id= cif_nr_list)

All supported keywords can be obtained using:

importer.get_supported_keywords()

More information on the keywords can be found at http://www.fiz-karlsruhe.de/fileadmin/be_user/ICSD/PDF/sci_man_ICSD_v1.pdf

A query returns an instance of IcsdSearchResults.

The IcsdEntry at position i can be accessed using:

queryresults.at(i)

You can also iterate through all query results:

for entry in queryresults:
    # do something with the entry, e.g. entry.get_cif_node()
    pass

Instances of IcsdEntry have the following methods:

  • get_cif_node(): Return an instance of CifData, which can be used in an AiiDA workflow.
  • get_aiida_structure(): Return an AiiDA structure
  • get_ase_structure(): Return an ASE structure

The most convenient format can be chosen for further processing.

Full example

Here is a full example of how the ICSD importer can be used:

import aiida.tools.dbimporters.plugins.icsd

cif_nr_list = [
    "50542",
    "617290",
    "35538",
    "165226",
    "158366"
]

importer = aiida.tools.dbimporters.plugins.icsd.IcsdDbImporter(server="http://ICSDSERVER.com/",
    host="127.0.0.1")
query_results = importer.query(id=cif_nr_list)
for result in query_results:
    print result.source['db_id']
    aiida_structure = result.get_aiida_structure()
    #do something with the structure
Troubleshooting: Testing the mysql connection

To test your mysql connection, first make sure that you can connect to the 3306 port of the machine hosting the database. If the database is not hosted by your local machine, use the local port tunneling provided by ssh, as follows:

ssh -L 3306:localhost:3306 username@icsddbhostname.com

Note

If you get a URLError with Errno 111 (Connection refused) when you query the database, use the following instead:

ssh -L 3306:localhost:3306 -L 8010:localhost:80 username@icsddbhostname.com

Note

You need an account on the host machine.

Note

There are plenty of explanations online on how to set up a tunnel over an SSH connection using the -L option; just google for it in case you need more information.

Then open a new verdi shell and type:

import MySQLdb

db = MySQLdb.connect(host="127.0.0.1", user="dba", passwd="sql", db="icsd", port=3306)

If you do not get an error and it does not hang, you have successfully established your connection to the mysql database.

COD database importer

COD database importer is used to import crystal structures from the Crystallography Open Database (COD) to AiiDA.

Setup

An instance of CodDbImporter is created as follows:

from aiida.tools.dbimporters.plugins.cod import CodDbImporter
importer = CodDbImporter()

No additional parameters are required for standard queries on the main COD server.

How to do a query

A search is initiated by supplying query statements using keyword=value syntax:

results = importer.query(chemical_name="caffeine")

The list of supported keywords can be obtained using:

importer.get_supported_keywords()

Values for most of the keywords can be lists. In that case, the query will return entries that match any of the values from the list (logical OR). Moreover, in the case of multiple keywords, only entries that match all the conditions imposed by the keywords will be returned (logical AND).

Example:

results = importer.query(chemical_name=["caffeine","serotonin"],
                         year=[2000,2001])

is equivalent to the following SQL statement:

SELECT * FROM data WHERE
    ( chemical_name = "caffeine" OR chemical_name = "serotonin" ) AND
    ( year = 2000 OR year = 2001 )

A query returns an instance of CodSearchResults, which can be used in the same way as a list of CodEntry instances:

print len(results)

for entry in results:
    print entry
Using data from CodEntry

CodEntry has a few methods to access the contents of its instances:

CodEntry.get_aiida_structure()
CodEntry.get_ase_structure()
CodEntry.get_cif_node()
CodEntry.get_parsed_cif()
CodEntry.get_raw_cif()
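
For instance, a short sketch combining these methods with the query shown above could look as follows (the query keywords, and the choice of storing the CIF nodes, are just examples):

from aiida.tools.dbimporters.plugins.cod import CodDbImporter

importer = CodDbImporter()
results = importer.query(chemical_name="caffeine", year=[2000, 2001])

for entry in results:
    cif = entry.get_cif_node()   # CifData node built from the COD entry
    cif.store()                  # store it in the AiiDA database
    print 'Stored CIF node with pk', cif.pk
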
Export data to external databases

We support the export of data to external databases. In the most general terms, the export to an external database can be viewed as a subworkflow, taking data as input and resulting in its deposition to the external database(s). Below is a list of supported databases together with a description of their deposition routines.

Supported databases
TCOD database exporter

TCOD database exporter is used to export computation results of StructureData, CifData and TrajectoryData (or any other data type, which can be converted to them) to the Theoretical Crystallography Open Database (TCOD).

Setup

To be able to export data to TCOD, one has to install the atomic_tools dependencies to enable CIF manipulation as well as the aiida-codtools plugin, and set up an AiiDA Code for cif_cod_deposit script from cod-tools.

How to deposit a structure

The best way to deposit data is to use the command line interface:

verdi data DATATYPE deposit --database tcod
                                 [--type {published,prepublication,personal}]
                                 [--username USERNAME] [--password]
                                 [--user-email USER_EMAIL] [--title TITLE]
                                 [--author-name AUTHOR_NAME]
                                 [--author-email AUTHOR_EMAIL] [--url URL]
                                 [--code CODE_LABEL]
                                 [--computer COMPUTER_NAME]
                                 [--replace REPLACE] [-m MESSAGE]
                                 [--reduce-symmetry] [--no-reduce-symmetry]
                                 [--parameter-data PARAMETER_DATA]
                                 [--dump-aiida-database]
                                 [--no-dump-aiida-database]
                                 [--exclude-external-contents]
                                 [--no-exclude-external-contents] [--gzip]
                                 [--no-gzip]
                                 [--gzip-threshold GZIP_THRESHOLD]
                                 PK

Where:

  • DATATYPE – one of AiiDA structural data types (at the moment of writing, these are structure for StructureData, cif for CifData and trajectory for TrajectoryData);

  • TITLE – the title of the publication, where the exported data is/will be published; in case of personal communication, the title should be chosen so as to reflect the exported dataset the best;

  • CODE_LABEL – label of AiiDA Code, associated with cif_cod_deposit;

  • COMPUTER_NAME – name of AiiDA Computer, where cif_cod_deposit script is to be launched;

  • REPLACE – TCOD ID of the entry to be replaced in the event of redeposition;

  • MESSAGE – string to describe changes for redeposited structures;

  • --reduce-symmetry, --no-reduce-symmetry – turn on/off symmetry reduction of the exported structure (on by default);

  • --parameter-data – specify the PK of ParameterData object, describing the result of the final (or single) calculation step of the workflow;

  • --dump-aiida-database, --no-dump-aiida-database – turn on/off addition of relevant AiiDA database dump (on by default).

    Warning

    Be aware that TCOD is an open database, thus no copyright-protected data should be deposited unless permission is given by the owner of the rights.

    Note

    Data, which is deposited as pre-publication material, will be kept private on TCOD server and will not be disclosed to anyone without depositor’s permission.

  • --exclude-external-contents, --no-exclude-external-contents – exclude contents of initial input files, that contain source property with definitions on how to obtain the contents from external resources (on by default);

  • --gzip, --no-gzip – turn on/off gzip compression for large files (off by default); --gzip-threshold sets the minimum file size to be compressed.

Other command line options correspond to the options of the cif_cod_deposit deposition routine of the aiida-codtools plugin. To ease the use of the TCOD exporter, one can define persistent parameters in the AiiDA properties. The corresponding command line parameters and AiiDA properties are presented in the following table:

Command line parameter  AiiDA property
--author-email          tcod.depositor_author_email
--author-name           tcod.depositor_author_name
--user-email            tcod.depositor_email
--username              tcod.depositor_username
--password              tcod.depositor_password

Note

--password does not accept any value; instead, the option will prompt the user to enter one’s password in the terminal.

Note

Command line parameters can be used to override AiiDA properties even if properties are set.

Return values

The deposition process, which is of JobCalculation type, returns the output of cif_cod_deposit, wrapped in ParameterData.

Citing

If you use the TCOD database exporter, please cite the following work:

Andrius Merkys, Nicolas Mounet, Andrea Cepellotti, Nicola Marzari, Saulius Gražulis and Giovanni Pizzi, A posteriori metadata from automated provenance tracking: Integration of AiiDA and TCOD, Journal of Cheminformatics 9, 56 (2017); http://doi.org/10.1186/s13321-017-0242-y.

REST API for AiiDA

AiiDA provides a REST API to access the information of the AiiDA objects stored in the database. There are four types of AiiDA objects: Computer, Node, User, and Group. The Node type has three subtypes: Calculation, Data, and Code. Different REST urls are provided to get the list of objects, the details of a specific object as well as its inputs/outputs/attributes/extras.

The AiiDA REST API is implemented using the Flask-RESTful framework. For the time being, it only supports GET methods. The response data is always returned in JSON format.

In this document, file-system paths are given relative to the AiiDA installation folder. The source files of the API are contained in the folder aiida/restapi. To start the REST server, open a terminal and type

$ verdi restapi

This command will hook up the REST API with the default parameters, namely on port 5000 of localhost, connecting to the default AiiDA profile and assuming the default folder for the REST configuration files, namely common. For an overview of the options accepted by verdi restapi you can type

$ verdi restapi --help

As with all verdi commands, the AiiDA profile can be changed by putting the option -p PROFILE right after verdi.

The base URL for your REST API is:

http://localhost:5000/api/v2

where the last field identifies the version of the API. This field makes it possible to run multiple versions of the API simultaneously, so that clients are not obliged to immediately update the format of their requests when a new version of the API is deployed. The latest version is currently v2.

An alternative way to hook up the API is to run the script run_api.py from the folder aiida/restapi. Move there and type

$ python run_api.py

This script has the same options as the verdi command (they actually invoke the same function), with the addition of --aiida-profile=AIIDA_PROFILE to set the AiiDA profile to which the API should connect.

The default configuration file is config.py, which by default is looked for in the folder aiida/restapi. The path of config.py can be overridden with the option --config-dir=CONFIG_DIR. All the available configuration options of the REST API are documented therein.

In order to send requests to the REST API you can simply type the url of the request in the address bar of your browser or you can use command line tools such as curl or wget.
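
For instance, a request can also be issued directly from Python using only the standard library (a minimal sketch, assuming the API is running locally on the default port and that at least one user exists):

# Sketch: send a GET request to the REST API and decode the JSON response.
import json
import urllib2

url = 'http://localhost:5000/api/v2/users/'
response = urllib2.urlopen(url)
payload = json.loads(response.read())

# The list of results is returned under the "data" key of the JSON object.
print payload['data']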

Let us now introduce the urls supported by the API.

General form of the urls

A generic url to send requests to the REST API is formed by:

  1. the base url. It specifies the host and the version of the API. Example:

    http://localhost:5000/api/v2
    
  2. the path. It defines the kind of resource requested by the client and the type of query.

  3. the query string (not mandatory). It can be used for any further specification of the request, e.g. to introduce query filters, to give instructions for ordering, to set how results have to be paginated, etc.

The query string is introduced by the question mark character ?. Here are some examples:

http://localhost:5000/api/v2/users/
http://localhost:5000/api/v2/computers?scheduler_type="slurm"
http://localhost:5000/api/v2/nodes/?id>45&type=like="%data%"

The trailing slash at the end of the path is not mandatory.

How to set the number of results

Before exploring the functionalities of the API in detail, it is important to know that the AiiDA REST API provides two different ways to limit the number of results returned by the server: using pagination, or specifying limit and offset explicitly.

Setting limit and offset

You can specify two special fields in the query string:

  • limit=(LIMIT): field that specifies the largest number of results that will be returned, ex: “limit=20”. The default and highest allowed LIMIT is 400.
  • offset=(OFFSET): field that specifies how many entries are skipped before returning results, ex: offset=20. By default no offset applies.

Example:

http://localhost:5000/api/v2/computers/?limit=3&offset=2
How to build the path

There are two types of paths: those that request the list of objects of a given resource (namely, the AiiDA object type you are requesting), and those that address a specific object of a certain resource. In both cases the path has to start with the name of the resource. The complete list of resources is:

  • nodes
  • computers
  • codes
  • calculations
  • data, structures, kpoints, bands
  • users, groups

If no specific endpoint is appended to the name of the resource, the API will return the full list of objects of that resource (the API default limit nevertheless applies to the number of results). Appending the endpoint schema to a resource gives the list of fields that are normally returned by the API for an object of that resource, whereas the endpoint statistics returns a list of statistical facts concerning a resource. Here are a few examples of valid URIs:

http://localhost:5000/api/v2/nodes/statistics
http://localhost:5000/api/v2/users/
http://localhost:5000/api/v2/groups/schema
If you request information on a specific object, in general you have to append its entire uuid, or a starting pattern of its uuid, to the path.

Here are two examples that should return the same object:

http://localhost:5000/api/v2/nodes/338357f4-f236-4f9c-8fbe-cd550dc6b858
http://localhost:5000/api/v2/nodes/338357f4-f2

In the first URL we have specified the full uuid, whereas in the second only a chunk of its first characters, long enough to match only one uuid in the database. If the uuid pattern is not long enough to identify a unique object, the API will raise an exception. The only exception to this rule is the resource users, since the corresponding AiiDA User class has no uuid attribute; in this case, you have to specify the pk (integer) of the object. Here is an example:

http://localhost:5000/api/v2/users/2

When you ask for a single object (and only in that case) you can construct more complex requests, namely, you can ask for its inputs/outputs or for its attributes/extras. In the first case you have to append to the path the string /io/inputs or /io/outputs, depending on the desired relation between the nodes, whereas in the second case you have to append /content/attributes or /content/extras, depending on the kind of content you want to access. Here are some examples:

http://localhost:5000/api/v2/calculations/338357f4-f2/io/inputs
http://localhost:5000/api/v2/nodes/338357f4-f2/io/inputs
http://localhost:5000/api/v2/data/338357f4-f2/content/attributes
http://localhost:5000/api/v2/nodes/338357f4-f2/content/extras

Note

As you can see from the last examples, a Node object can be accessed requesting either a generic nodes resource or requesting the resource corresponding to its specific type (data, codes, calculations, kpoints, ... ). This is because in AiiDA the classes Data, Code, and Calculation are derived from the class Node. In turn, Data is the baseclass of a number of built-in and custom classes, e.g. KpointsData, StructureData, BandsData, ...

How to build the query string

The query string is formed by one or more fields separated by the special character &. Each field has the form (key)(operator)(value). The valid keys obey the same constraints as the names of Python variables: only alphanumeric characters plus _ are allowed, and the first character cannot be a number.
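
As a purely illustrative example (this helper is hypothetical and not part of AiiDA), a query string of this form can be assembled as follows; the keys, operators and values are arbitrary:

# Hypothetical helper: build a query string from (key, operator, value) triples.
def build_query_string(fields):
    # Each field has the form (key)(operator)(value); fields are joined by '&'.
    return '&'.join('%s%s%s' % (key, operator, value) for key, operator, value in fields)

print build_query_string([
    ('limit', '=', '3'),
    ('orderby', '=', '-id'),
    ('scheduler_type', '=', '"slurm"'),
])
# prints: limit=3&orderby=-id&scheduler_type="slurm"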

Special keys

There are several special keys that can be specified only once in a query string. All of them must be followed by the operator =. Here is the complete list:

limit:

This key only supports integer values.

offset:

Same format as limit.

perpage:

Same format as limit.

orderby:

This key is used to impose a specific ordering on the results. Two orderings are supported, ascending or descending. The value for the orderby key must be the name of the property with respect to which to order the results. Additionally, + or - can be prepended to the value in order to select, respectively, ascending or descending order. Specifying no leading character is equivalent to selecting ascending order. Ascending (descending) order for strings corresponds to alphabetical (reverse-alphabetical) order, whereas for datetime objects it corresponds to chronological (reverse-chronological) order. Examples:

http://localhost:5000/api/v2/computers?orderby=+id
http://localhost:5000/api/v2/computers?orderby=+name
http://localhost:5000/api/v2/computers?orderby=-uuid
alist:

This key is used to specify which attributes of a specific object have to be returned. The desired attributes have to be provided as a comma-separated list of values. It requires that the path contains the endpoint /content/attributes. Example:

http://localhost:5000/api/v2/codes/4fb10ef1-1a/content/attributes?
                            alist=append_text,prepend_text
nalist:

(incompatible with alist) This key is used to specify which attributes of a specific object should not be returned. The syntax is identical to alist. The system returns all the attributes except those specified in the list of values.

elist:

Similar to alist but for extras. It requires that the path contains the endpoint /content/extras.

nelist:

(incompatible with elist) Similar to nalist but for extras. It requires that the path contains the endpoint /content/extras.

Filters

All the other fields composing a query string are filters, that is, conditions that have to be fulfilled by the retrieved objects. When a query string contains multiple filters, those are applied as if they were related by the AND logical clause, that is, the results have to fulfill all the conditions set by the filters (and not any of them). Each filter key is associated to a unique value type. The possible types are:

string:

Text enclosed in double quotes. If the string contains double quotes those have to be escaped as "" (two double quotes). Note that in the unlikely occurrence of a sequence of double quotes you will have to escape it by writing twice as many double quotes.

integer:

Positive integer numbers.

datetime:

Datetime objects expressed in the format (DATE)T(TIME)(SHIFT) where (SHIFT) is the time difference with respect to the UTC time. This is required to avoid any problem arising from comparing datetime values expressed in different time zones. The formats of each field are:

  1. YYYY-MM-DD for (DATE) (mandatory).
  2. HH:MM:SS for (TIME) (optional). The formats HH and HH:MM are supported too.
  3. +/-HH:MM for (SHIFT) (optional, if present requires (TIME) to be specified). The format +/-HH is allowed too. If no shift is specified UTC time is assumed. The shift format follows the general convention that eastern (western) shifts are positive (negative). The API is unaware of daylight saving times so the user is required to adjust the shift to take them into account.

This format is ISO-8601 compliant. Note that date and time fields have to be separated by the character T. Examples:

ctime>2016-04-23T05:45+03:45
ctime<2016-04-23T05:45
mtime>=2016-04-23
bool:

It can be either true or false (lower case).

The following table reports the value type and the supported resources associated with each key.

Note

In the following, id is a synonym for pk (often used in other sections of the documentation).

Note

If a key is present in the resource data, it will also be present in the derived resources: kpoints, structures, bands.

key               value type  resources
id                integer     users, computers, groups, nodes, calculations, codes, data
user_id           integer     groups
uuid              string      computers, groups, nodes, calculations, codes, data
name              string      computers, groups
first_name        string      users
last_name         string      users
institution       string      users
email *           string      users
label             string      nodes, calculations, codes, data
description       string      computers, groups
transport_type    string      computers
transport_params  string      computers
scheduler_type    string      computers
enabled           bool        computers
is_active *       bool        users
ctime             datetime    nodes, calculations, codes, data
mtime             datetime    nodes, calculations, codes, data
last_login *      datetime    users
date_joined       datetime    users
type              string      groups, nodes, calculations, codes, data
state             string      nodes, calculations, codes, data
hostname          string      computers

* Key not available via the /users/ endpoint for reasons of privacy.

The operators supported by a specific key are uniquely determined by the value type associated to that key. For example, a key that requires a boolean value admits only the identity operator =, whereas an integer value enables the usage of the relational operators =, <, <=, >, >= plus the membership operator =in=. Please refer to the following table for a comprehensive list.

operator  meaning                              accepted value types
=         identity                             integers, strings, bool, datetime
>         greater than                         integers, strings, datetime
<         lower than                           integers, strings, datetime
>=        greater than or equal to             integers, strings, datetime
<=        lower than or equal to               integers, strings, datetime
=like=    pattern matching                     strings
=ilike=   case-insensitive pattern matching    strings
=in=      identity with one element of a list  integers, strings, datetime

The pattern matching operators =like= and =ilike= must be followed by the pattern definition, namely, a string where two characters assume special meaning:

  1. % is used to replace an arbitrary sequence of characters, including no characters.
  2. _ is used to replace exactly one character.

Unlike =like=, =ilike= treats two characters that differ only in their case as equal.

To prevent interpreting special characters as wildcards, these have to be escaped by pre-pending the character \.

Examples:

Filter                                              Matched string                          Non-matched string
name=like="a%d_"                                    “aiida”                                 “AiiDA”
name=ilike="a%d_"                                   “aiida”, “AiiDA”
name=like="a_d_"                                                                            “aiida”
name=like="aii%d_a"                                 “aiida”
uuid=like="cdfd48%"                                 “cdfd48f9-7ed2-4969-ba06-09c752b83d2”
description=like="This calculation is %\% useful"   “This calculation is 100% useful”

The membership operator =in= has to be followed by a comma-separated list of values of the same type. The condition is fulfilled if the column value of an object is an element of the list.

Examples:

http://localhost:5000/api/v2/nodes?id=in=45,56,78
http://localhost:5000/api/v2/computers/?
scheduler_type=in="slurm","pbs"&state="FINISHED"

The relational operators ‘<’, ‘>’, ‘<=’, ‘>=’ assume natural ordering for integers, (case-insensitive) alphabetical ordering for strings, and chronological ordering for datetime values.

Examples:

  • http://localhost:5000/api/v2/nodes?id>578 selects the nodes having an id larger than 578.
  • http://localhost:5000/api/v2/users/?last_login>2014-04-07 selects only the users whose last login occurred after April 7th, 2014.
  • http://localhost:5000/api/v2/users/?last_name<="m" selects only the users whose last name begins with a character in the range [a-m].

Note

Object types have to be specified by a string that defines their position in the AiiDA source tree ending with a dot. Examples:

  • type="data.Data." selects only objects of Data type
  • type="data.remote.RemoteData." selects only objects of RemoteData type

Note

If you use in your request the endpoint io/inputs (io/outputs) together with one or more filters, the latter are applied to the input (output) nodes of the selected pk. For example, the request:

http://localhost:5000/api/v2/nodes/a67fba41-8a/io/outputs/?
                  type="data.folder.FolderData."

would first search for the outputs of the node with uuid starting with “a67fba41-8a” and then select only those objects of type FolderData.

The HTTP response

The HTTP response of the REST API consists of a JSON object, a header, and a status code. Possible status codes are:

  1. 200 for successful requests.
  2. 400 for bad requests. In this case, the JSON object contains only an error message describing the problem.
  3. 500 for a generic internal server error. The JSON object contains only a generic error message.
  4. 404 for an invalid URL. Differently from the 400 status, it is returned when the REST API does not succeed in directing the request to a specific resource. This typically happens when the path does not match any of the supported formats. No JSON is returned.

The header is a standard HTTP response header with the additional custom field X-Total-Counts and, only if paginated results are required, a non-empty Link field, as described in the Pagination section.

The JSON object mainly contains the list of the results returned by the API. This list is assigned to the key data. Additionally, the JSON object contains several pieces of information about the request (keys method, url, url_root, path, query_string, resource_type, and pk).
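
A short sketch of how to access both the custom header and the data key from Python (again using only the standard library and assuming a local server with at least three computers configured):

# Sketch: read the X-Total-Counts header and the "data" key of a response.
import json
import urllib2

response = urllib2.urlopen('http://localhost:5000/api/v2/computers?limit=3')
total_counts = response.info().getheader('X-Total-Counts')
payload = json.loads(response.read())

print 'Total number of matching objects:', total_counts
print 'Objects returned in this response:', len(payload['data']['computers'])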

How to run the REST API through Apache

By default verdi restapi hooks up the REST API through the HTTP server (Werkzeug) that is usually bundled with Python distributions. However, to deploy real web applications the server of choice is in most cases Apache. In fact, you can instruct Apache to run Python applications by employing the WSGI module, and the AiiDA REST API is structured so that you can easily realize the pipeline AiiDA->WSGI->Apache. Moreover, a single Apache service can support multiple apps, so that you can, for instance, hook up multiple APIs using as many different sets of configurations; for example, one might have several apps connecting to different AiiDA profiles. We'll go through an example to explain how to achieve this result.

We assume you have a working installation of Apache that includes mod_wsgi.

The goal of the example is to hook up two APIs, django and sqlalchemy, pointing to two AiiDA profiles called, for simplicity, django and sqlalchemy.

All the relevant files are enclosed under the path <aiida.source.code.path>/docs/wsgi/. In each of the folders app1/ and app2/, there is a file named rest.wsgi containing a Python script that instantiates and configures a Python web app called application, according to the rules of mod_wsgi. As the script is written, the object application is configured through the file config.py contained in the same folder. Indeed, in app1/config.py the AiiDA profile is set to "django", whereas in app2/config.py it is set to "sqlalchemy".

Anyway, the path where you put the .wsgi file as well as its name are irrelevant as long as they are correctly referred to in the Apache configuration file, as shown later on. Similarly, you can place config.py in a custom path, provided you change the variable config_file_path in the wsgi file accordingly.

In rest.wsgi, probably the only option you might want to change is catch_internal_server. When set to True, it lets the exceptions thrown during the execution of the app propagate all the way through until they reach the Apache logger. Especially when the app is not entirely stable yet, it is useful to be able to read the full Python error traceback in the Apache error log.

Finally, you need to set up the Apache site through a proper configuration file. We provide two template files: one.conf and many.conf. The first file tells Apache to bundle both apps in a unique Apache daemon process. Apache usually creates multiple processes dynamically, and with this configuration each process will handle both apps.

The file many.conf, instead, defines two different process groups, one for each app, so the processes created dynamically by Apache will always handle one app each. The minimal number of Apache daemon processes thus equals the number of apps, contrary to the first architecture, where one process is enough to handle two or even a larger number of apps.

Let us call the two apps for this example django and sqlalchemy. In both one.conf and many.conf, the important directives that should be updated if one changes the paths or names of the apps are:

  • WSGIProcessGroup to define the process groups for later reference. In one.conf this directive appears only once to define the generic group profiles, as there is only one kind of process handling both apps. In many.conf this directive appears once per app and is embedded into a “Location” tag, e.g.:

    <Location /django>
        WSGIProcessGroup django
    </Location>
    
  • WSGIDaemonProcess to define the path to the AiiDA virtual environment. This appears once per app in both configurations.

  • WSGIScriptAlias to define the absolute path of the .wsgi file of each app.

  • The <Directory> tag mainly used to grant Apache access to the files used by each app, e.g.:

    <Directory "<aiida.source.code.path>/aiida/restapi/wsgi/app1">
            Require all granted
    </Directory>
    

The last step is to move either one.conf or many.conf into the Apache configuration folder and restart the Apache server. In Ubuntu, this is usually done with the commands:

$ sudo cp <conf_file>.conf /etc/apache2/sites-enabled/000-default.conf
$ sudo service apache2 restart

We believe the two basic architectures we have just explained can be successfully applied in many different deployment scenarios. Nevertheless, we suggest users who need finer tuning of the deployment setup to look into the official documentation of Apache and, more importantly, WSGI.

The URLs of the requests handled by Apache must start with one of the paths specified in the WSGIScriptAlias directives. These paths uniquely identify each app and allow Apache to route the requests to the correct app. Examples of well-formed URLs are:

$ curl http://localhost/django/api/v2/computers -X GET
$ curl http://localhost/sqlalchemy/api/v2/computers -X GET

The first (second) request will be handled by the app django (sqlalchemy), i.e. it will serve results fetched from the profile django (sqlalchemy). Notice that we haven't specified any port in the URLs, since Apache conventionally listens on port 80, to which any request lacking an explicit port is automatically directed.

Examples
Computers
  1. Get a list of Computer objects.

    REST url:

    http://localhost:5000/api/v2/computers?limit=3&offset=2&orderby=id
    

    Description:

    returns a list of three Computer objects (limit=3) starting from the 3rd row (offset=2) of the database table, ordered by ascending values of id.

    Response:

    {
      "data": {
        "computers": [
          {
            "description": "Alpha Computer",
            "enabled": true,
            "hostname": "alpha.aiida.net",
            "id": 3,
            "name": "Alpha",
            "scheduler_type": "slurm",
            "transport_params": "{}",
            "transport_type": "ssh",
            "uuid": "9b5c84bb-4575-4fbe-b18c-b23fc30ec55e"
          },
          {
            "description": "Beta Computer",
            "enabled": true,
            "hostname": "beta.aiida.net",
            "id": 4,
            "name": "Beta",
            "scheduler_type": "slurm",
            "transport_params": "{}",
            "transport_type": "ssh",
            "uuid": "5d490d77-638d-4d4b-8288-722f930783c8"
          },
          {
            "description": "Gamma Computer",
            "enabled": true,
            "hostname": "gamma.aiida.net",
            "id": 5,
            "name": "Gamma",
            "scheduler_type": "slurm",
            "transport_params": "{}",
            "transport_type": "ssh",
            "uuid": "7a0c3ff9-1caf-405c-8e89-2369cf91b634"
          }
        ]
      },
      "method": "GET",
      "path": "/api/v2/computers",
      "pk": null,
      "query_string": "limit=3&offset=2&orderby=id",
      "resource_type": "computers",
      "url": "http://localhost:5000/api/v2/computers?limit=3&offset=2&orderby=id",
      "url_root": "http://localhost:5000/"
    }
    
  2. Get details of a single Computer object:

    REST url:

    http://localhost:5000/api/v2/computers/5d490d77-638d
    

    Description:

    returns the details of the Computer object uuid="5d490d77-638d...".

    Response:

    {
      "data": {
        "computers": [
          {
            "description": "Beta Computer",
            "enabled": true,
            "hostname": "beta.aiida.net",
            "id": 4,
            "name": "Beta",
            "scheduler_type": "slurm",
            "transport_params": "{}",
            "transport_type": "ssh",
            "uuid": "5d490d77-638d-4d4b-8288-722f930783c8"
          }
        ]
      },
      "method": "GET",
      "path": "/api/v2/computers/5d490d77-638d",
      "pk": 4,
      "query_string": "",
      "resource_type": "computers",
      "url": "http://localhost:5000/api/v2/computers/5d490d77-638d",
      "url_root": "http://localhost:5000/"
    }
    
Nodes
  1. Get a list of Node objects

    REST url:

    http://localhost:5000/api/v2/nodes?limit=2&offset=8&orderby=-id
    

    Description:

    returns a list of two Node objects (limit=2) starting from the 9th row (offset=8) of the database table, ordered by id in descending order.

    Response:

    {
      "data": {
        "nodes  ": [
          {
            "ctime": "Fri, 29 Apr 2016 19:24:12 GMT",
            "id": 386913,
            "label": "",
            "mtime": "Fri, 29 Apr 2016 19:24:13 GMT",
            "state": null,
            "type": "calculation.inline.InlineCalculation.",
            "uuid": "68d2ed6c-6f51-4546-8d10-7fe063525ab8"
          },
          {
            "ctime": "Fri, 29 Apr 2016 19:24:00 GMT",
            "id": 386912,
            "label": "",
            "mtime": "Fri, 29 Apr 2016 19:24:00 GMT",
            "state": null,
            "type": "data.parameter.ParameterData.",
            "uuid": "a39dc158-fedd-4ea1-888d-d90ec6f86f35"
          }
        ]
      },
      "method": "GET",
      "path": "/api/v2/nodes",
      "pk": null,
      "query_string": "limit=2&offset=8&orderby=-id",
      "resource_type": "nodes",
      "url": "http://localhost:5000/api/v2/nodes?limit=2&offset=8&orderby=-id",
      "url_root": "http://localhost:5000/"
    }
    
  2. Get the details of a single Node object:

    REST url:

    http://localhost:5000/api/v2/nodes/e30da7cc
    

    Description:

    returns the details of the Node object with uuid="e30da7cc...".

    Response:

    {
      "data": {
        "nodes  ": [
          {
            "ctime": "Fri, 14 Aug 2015 13:18:04 GMT",
            "id": 1,
            "label": "",
            "mtime": "Mon, 25 Jan 2016 14:34:59 GMT",
            "state": "IMPORTED",
            "type": "data.parameter.ParameterData.",
            "uuid": "e30da7cc-af50-40ca-a940-2ac8d89b2e0d"
          }
        ]
      },
      "method": "GET",
      "path": "/api/v2/nodes/e30da7cc",
      "pk": 1,
      "query_string": "",
      "resource_type": "nodes",
      "url": "http://localhost:5000/api/v2/nodes/e30da7cc",
      "url_root": "http://localhost:5000/"
    }
    
  3. Get the list of inputs of a specific node.

    REST url:

    http://localhost:5000/api/v2/nodes/de83b1/io/inputs?limit=2
    

    Description:

    returns the list of the first two input nodes (limit=2) of the Node object with uuid="de83b1...".

    Response:

    {
      "data": {
        "inputs": [
          {
            "ctime": "Fri, 24 Jul 2015 18:49:23 GMT",
            "id": 10605,
            "label": "",
            "mtime": "Mon, 25 Jan 2016 14:35:00 GMT",
            "state": "IMPORTED",
            "type": "data.remote.RemoteData.",
            "uuid": "16b93b23-8629-4d83-9259-de2a947b43ed"
          },
          {
            "ctime": "Fri, 24 Jul 2015 14:33:04 GMT",
            "id": 9215,
            "label": "",
            "mtime": "Mon, 25 Jan 2016 14:35:00 GMT",
            "state": "IMPORTED",
            "type": "data.array.kpoints.KpointsData.",
            "uuid": "1b4d22ec-9f29-4e0d-9d68-84ddd18ad8e7"
          }
        ]
      },
      "method": "GET",
      "path": "/api/v2/nodes/de83b1/io/inputs",
      "pk": 6,
      "query_string": "limit=2",
      "resource_type": "nodes",
      "url": "http://localhost:5000/api/v2/nodes/de83b1/io/inputs?limit=2",
      "url_root": "http://localhost:5000/"
    }
    
  4. Filter the inputs/outputs of a node by their type.

    REST url:

    http://localhost:5000/api/v2/nodes/de83b1/io/inputs?type="data.array.kpoints.KpointsData."
    

    Description:

    returns the list of the KpointsData input nodes of the Node object with uuid="de83b1...".

    Response:

    {
      "data": {
        "inputs": [
          {
            "ctime": "Fri, 24 Jul 2015 14:33:04 GMT",
            "id": 9215,
            "label": "",
            "mtime": "Mon, 25 Jan 2016 14:35:00 GMT",
            "state": "IMPORTED",
            "type": "data.array.kpoints.KpointsData.",
            "uuid": "1b4d22ec-9f29-4e0d-9d68-84ddd18ad8e7"
          }
        ]
      },
      "method": "GET",
      "path": "/api/v2/nodes/de83b1/io/inputs",
      "pk": 6,
      "query_string": "type=\"data.array.kpoints.KpointsData.\"",
      "resource_type": "nodes",
      "url": "http://localhost:5000/api/v2/nodes/de83b1/io/inputs?type=\"data.array.kpoints.KpointsData.\"",
      "url_root": "http://localhost:5000/"
    }
    

    REST url:

    http://localhost:5000/api/v2/nodes/de83b1/io/outputs?type="data.remote.RemoteData."
    

    Description:

    returns the list of the RemoteData output nodes of the Node object with uuid="de83b1...".

    Response:

    {
      "data": {
        "outputs": [
          {
            "ctime": "Fri, 24 Jul 2015 20:35:02 GMT",
            "id": 2811,
            "label": "",
            "mtime": "Mon, 25 Jan 2016 14:34:59 GMT",
            "state": "IMPORTED",
            "type": "data.remote.RemoteData.",
            "uuid": "bd48e333-da8a-4b6f-8e1e-6aaa316852eb"
          }
        ]
      },
      "method": "GET",
      "path": "/api/v2/nodes/de83b1/io/outputs",
      "pk": 6,
      "query_string": "type=\"data.remote.RemoteData.\"",
      "resource_type": "nodes",
      "url": "http://localhost:5000/api/v2/nodes/de83b1/io/outputs?type=\"data.remote.RemoteData.\"",
      "url_root": "http://localhost:5000/"
    }
    
  5. Getting the list of the attributes/extras of a specific node

    REST url:

    http://localhost:5000/api/v2/nodes/ffe11/content/attributes
    

    Description:

    returns the list of all attributes of the Node object with uuid="ffe11...".

    Response:

    {
      "data": {
        "attributes": {
          "append_text": "",
          "input_plugin": "quantumespresso.pw",
          "is_local": false,
          "prepend_text": "",
          "remote_exec_path": "/project/espresso-5.1-intel/bin/pw.x"
        }
      },
      "method": "GET",
      "path": "/api/v2/nodes/ffe11/content/attributes",
      "pk": 1822,
      "query_string": "",
      "resource_type": "nodes",
      "url": "http://localhost:5000/api/v2/nodes/ffe11/content/attributes",
      "url_root": "http://localhost:5000/"
    }
    

    REST url:

    http://localhost:5000/api/v2/nodes/ffe11/content/extras
    

    Description:

    returns the list of all the extras of the Node object with uuid="ffe11...".

    Response:

    {
      "data": {
        "extras": {
          "trialBool": true,
          "trialFloat": 3.0,
          "trialInt": 34,
          "trialStr": "trial"
        }
      },
      "method": "GET",
      "path": "/api/v2/codes/ffe11/content/extras",
      "pk": 1822,
      "query_string": "",
      "resource_type": "codes",
      "url": "http://localhost:5000/api/v2/codes/ffe11/content/extras",
      "url_root": "http://localhost:5000/"
    }
    
  6. Getting a user-defined list of attributes/extras of a specific node

    REST url:

    http://localhost:5000/api/v2/codes/ffe11/content/attributes?alist=append_text,is_local
    

    Description:

    returns a list of the attributes append_text and is_local of the Node object with uuid="ffe11...".

    Response:

    {
      "data": {
        "attributes": {
          "append_text": "",
          "is_local": false
        }
      },
      "method": "GET",
      "path": "/api/v2/codes/ffe11/content/attributes",
      "pk": 1822,
      "query_string": "alist=append_text,is_local",
      "resource_type": "codes",
      "url": "http://localhost:5000/api/v2/codes/ffe11/content/attributes?alist=append_text,is_local",
      "url_root": "http://localhost:5000/"
    }
    

    REST url:

    http://localhost:5000/api/v2/codes/ffe11/content/extras?elist=trialBool,trialInt
    

    Description:

    returns a list of the extras trialBool and trialInt of the Node object with uuid="ffe11...".

    Response:

    {
      "data": {
        "extras": {
          "trialBool": true,
          "trialInt": 34
        }
      },
      "method": "GET",
      "path": "/api/v2/codes/ffe11/content/extras",
      "pk": 1822,
      "query_string": "elist=trialBool,trialInt",
      "resource_type": "codes",
      "url": "http://localhost:5000/api/v2/codes/ffe11/content/extras?elist=trialBool,trialInt",
      "url_root": "http://localhost:5000/"
    }
    
  7. Getting all the attributes/extras of a specific node except a user-defined list

    REST url:

    http://localhost:5000/api/v2/codes/ffe11/content/attributes?nalist=append_text,is_local
    

    Description:

    returns all the attributes of the Node object with uuid="ffe11..." except append_text and is_local.

    Response:

    {
      "data": {
        "attributes": {
          "input_plugin": "quantumespresso.pw",
          "prepend_text": "",
          "remote_exec_path": "/project/espresso-5.1-intel/bin/pw.x"
        }
      },
      "method": "GET",
      "path": "/api/v2/codes/ffe11/content/attributes",
      "pk": 1822,
      "query_string": "nalist=append_text,is_local",
      "resource_type": "codes",
      "url": "http://localhost:5000/api/v2/codes/ffe11/content/attributes?nalist=append_text,is_local",
      "url_root": "http://localhost:5000/"
    }
    

    REST url:

    http://localhost:5000/api/v2/codes/ffe11/content/extras?nelist=trialBool,trialInt
    

    Description:

    returns all the extras of the Node object with uuid="ffe11..." except trialBool and trialInt.

    Response:

    {
      "data": {
        "extras": {
          "trialFloat": 3.0,
          "trialStr": "trial"
        }
      },
      "method": "GET",
      "path": "/api/v2/codes/ffe11/content/extras",
      "pk": 1822,
      "query_string": "nelist=trialBool,trialInt",
      "resource_type": "codes",
      "url": "http://localhost:5000/api/v2/codes/ffe11/content/extras?nelist=trialBool,trialInt",
      "url_root": "http://localhost:5000/"
    }
    

Note

The same REST urls supported for the nodes resource are also available for the derived resources, namely calculations, data, and codes, by simply changing the resource field in the path.
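If you prefer to query the API programmatically rather than from a browser, the same requests can be made, for example, from Python. The snippet below is only an illustrative sketch (it is not part of the examples above): it assumes the API is reachable at http://localhost:5000 as in the examples and that the third-party requests package is installed.

import requests

# Fetch the first two input nodes of the node with uuid starting with "de83b1",
# exactly as in example 3 above.
url = "http://localhost:5000/api/v2/nodes/de83b1/io/inputs"
response = requests.get(url, params={"limit": 2})
response.raise_for_status()

for node in response.json()["data"]["inputs"]:
    print node["uuid"], node["type"]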

Users
  1. Getting a list of the users

    REST url:

    http://localhost:5000/api/v2/users/
    

    Description:

    returns a list of all the User objects.

    Response:

    {
      "data": {
        "users": [
          {
            "date_joined": "Mon, 25 Jan 2016 14:31:17 GMT",
            "first_name": "AiiDA",
            "id": 1,
            "institution": "",
            "last_name": "Daemon"
          },
          {
            "date_joined": "Thu, 11 Aug 2016 12:35:32 GMT",
            "first_name": "Gengis",
            "id": 2,
            "institution": "",
            "last_name": "Khan"
          }
        ]
      },
      "method": "GET",
      "path": "/api/v2/users/",
      "pk": null,
      "query_string": "",
      "resource_type": "users",
      "url": "http://localhost:5000/api/v2/users/",
      "url_root": "http://localhost:5000/"
    }
    
  2. Getting a list of users whose first name starts with a given string

    REST url:

    http://localhost:5000/api/v2/users/?first_name=ilike="aii%"
    

    Description:

    returns a list of the User objects whose first name starts with "aii", regardless of the case of the characters.

    Response:

    {
      "data": {
        "users": [
          {
            "date_joined": "Mon, 25 Jan 2016 14:31:17 GMT",
            "first_name": "AiiDA",
            "id": 1,
            "institution": "",
            "last_name": "Daemon"
          }
        ]
      },
      "method": "GET",
      "path": "/api/v2/users/",
      "pk": null,
      "query_string": "first_name=ilike=%22aii%%22",
      "resource_type": "users",
      "url": "http://localhost:5000/api/v2/users/?first_name=ilike=\"aii%\"",
      "url_root": "http://localhost:5000/"
    }
    
Groups
  1. Getting a list of groups

    REST url:

    http://localhost:5000/api/v2/groups/?limit=10&orderby=-user_id
    

    Description:

    returns a list of ten Group objects (limit=10), starting from the first row of the database table (offset=0), ordered by user_id in descending order.

    Response:

    {
      "data": {
        "groups": [
          {
            "description": "",
            "id": 104,
            "name": "SSSP_new_phonons_0p002",
            "type": "",
            "user_id": 2,
            "uuid": "7c0e0744-8549-4eea-b1b8-e7207c18de32"
          },
          {
            "description": "",
            "id": 102,
            "name": "SSSP_cubic_old_phonons_0p025",
            "type": "",
            "user_id": 1,
            "uuid": "c4e22134-495d-4779-9259-6192fcaec510"
          },
          ...
    
        ]
      },
      "method": "GET",
      "path": "/api/v2/groups/",
      "pk": null,
      "query_string": "limit=10&orderby=-user_id",
      "resource_type": "groups",
      "url": "http://localhost:5000/api/v2/groups/?limit=10&orderby=-user_id",
      "url_root": "http://localhost:5000/"
    }
    
  2. Getting the details of a specific group

    REST url:

    http://localhost:5000/api/v2/groups/a6e5b
    

    Description:

    returns the details of the Group object with uuid="a6e5b...".

    Response:

    {
      "data": {
        "groups": [
          {
            "description": "GBRV US pseudos, version 1.2",
            "id": 23,
            "name": "GBRV_1.2",
            "type": "data.upf.family",
            "user_id": 2,
            "uuid": "a6e5b6c6-9d47-445b-bfea-024cf8333c55"
          }
        ]
      },
      "method": "GET",
      "path": "/api/v2/groups/a6e5b",
      "pk": 23,
      "query_string": "",
      "resource_type": "groups",
      "url": "http://localhost:5000/api/v2/groups/a6e5b",
      "url_root": "http://localhost:5000/"
    }
    

Other guide resources

Other guide resources

AiiDA cookbook (useful code snippets)

This cookbook is intended to be a collection of short scripts and code snippets that may be useful in the everyday usage of AiiDA. Please read carefully the notes (if any) before running the scripts!

Deletion of nodes

At the moment, we do not natively support the deletion of nodes. This is mainly because deleting data is very dangerous, as it cannot be undone.

If you really feel the need to delete some nodes, you can use the function below.

Note

WARNING! In order to preserve the provenance, this function will delete not only the list of specified nodes, but also all the children nodes! So please be sure to double check what is going to be deleted before running this function.

Here is the function; pass a list of PKs as a parameter to delete those nodes and all their children:

def delete_nodes(pks_to_delete):
    """
    Delete a set of nodes.

    :note: The script will also delete
    all children calculations generated from the specified nodes.

    :param pks_to_delete: a list of the PKs of the nodes to delete
    """
    from django.db import transaction
    from django.db.models import Q
    from aiida.backends.djsite.db import models
    from aiida.orm import load_node

    # Delete also all children of the given calculations.
    # Here I get a set of all the pks to actually delete, including
    # all children nodes (found via the transitive closure table).
    all_pks_to_delete = set(pks_to_delete)
    all_pks_to_delete.update(models.DbNode.objects.filter(
        parents__in=pks_to_delete).values_list('pk', flat=True))

    print "I am going to delete {} nodes, including ALL THE CHILDREN".format(
        len(all_pks_to_delete))
    print "of the nodes you specified. Do you want to continue? [y/N]"
    answer = raw_input()

    if answer.strip().lower() == 'y':
        # Recover the list of folders to delete before actually deleting
        # the nodes.  I will delete the folders only later, so that if
        # there is a problem during the deletion of the nodes in
        # the DB, I don't delete the folders
        folders = [load_node(pk).folder for pk in all_pks_to_delete]

        with transaction.atomic():
            # Delete all links pointing to or from a given node
            models.DbLink.objects.filter(
                Q(input__in=all_pks_to_delete) |
                Q(output__in=all_pks_to_delete)).delete()
            # now delete nodes
            models.DbNode.objects.filter(pk__in=all_pks_to_delete).delete()

        # If we are here, we managed to delete the entries from the DB.
        # I can now delete the folders
        for f in folders:
            f.erase()
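For example, to delete the nodes with PKs 1000 and 1001, together with all their children, you would call the function above (e.g. from a verdi shell) as follows; the PKs here are of course just placeholders:

delete_nodes([1000, 1001])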

Troubleshooting and tricks

Some tricks
Using the proxy_command option with ssh

This page explains how to use the proxy_command feature of ssh. This feature is needed when you want to connect to a computer B, but you are not allowed to connect directly to it; instead, you have to connect to computer A first, and then perform a further connection from A to B.

Requirements

The idea is that you ask ssh to connect to computer B by using a proxy to create a sort of tunnel. One way to perform such an operation is to use netcat, a tool that simply takes the standard input and redirects it to a given TCP port.

Therefore, a requirement is to install netcat on computer A. You can check whether the netcat or nc command is already available on your computer, since some distributions include it (if it is already installed, the output of the command:

which netcat

or:

which nc

will return the absolute path to the executable).

If this is not the case, you will need to install it on your own. Typically, it will be sufficient to look for a netcat distribution on the web, unzip the downloaded package, cd into the folder and execute something like:

./configure --prefix=.
make
make install

This usually creates a subfolder bin, containing the netcat and nc executables. Write down the full path to nc, as we will need it later.

ssh/config

You can now test the proxy command with ssh. Edit the ~/.ssh/config file on the computer on which you installed AiiDA (or create it if missing) and add the following lines:

Host FULLHOSTNAME_B
Hostname FULLHOSTNAME_B
User USER_B
ProxyCommand ssh USER_A@FULLHOSTNAME_A ABSPATH_NETCAT %h %p

where you have to replace:

  • FULLHOSTNAME_A and FULLHOSTNAME_B with the fully-qualified hostnames of computers A and B (remembering that B is the computer you actually want to connect to, and A is the intermediate computer to which you have direct access)
  • USER_A and USER_B with the usernames on the two machines (they may be the same).
  • ABSPATH_NETCAT with the absolute path to the nc executable that you obtained in the previous step.

Remember also to configure passwordless ssh connections using ssh keys both from your computer to A, and from A to B.
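Purely as an illustration, a filled-in configuration could look like the following (all hostnames, usernames and the netcat path below are hypothetical and must be replaced with your own values):

Host cluster.example.com
Hostname cluster.example.com
User jdoe
ProxyCommand ssh jdoe@gateway.example.com /usr/bin/nc %h %p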

Once you have added these lines and saved the file, try to execute:

ssh FULLHOSTNAME_B

which should allow you to directly connect to B.

WARNING

There are several versions of netcat available on the web. We found at least one case in which the executable wasn't working properly. At the end of the connection, the netcat executable might still be running: as a result, you may rapidly leave the cluster with hundreds of open ssh connections, one for every time you connect to cluster B. Therefore, check on both computers A and B that the netcat and ssh processes disappear after you close the connection. To check whether such processes are running, you can execute:

ps -aux | grep <username>

Remember that a cluster might have more than one login node, and the ssh connection will randomly connect to any of them.

AiiDA config

If the above steps work, now set up and configure the computer as explained here.

If you set up the ~/.ssh/config file correctly in the previous step, AiiDA should parse the information in the file and provide the correct default value for the proxy_command during the verdi computer configure step.

Some notes on the proxy_command option
  • In the ~/.ssh/config file, you can leave the %h and %p placeholders, which are then automatically replaced by ssh with the hostname and the port of machine B when creating the proxy. However, in the AiiDA proxy_command option, you need to put the actual hostname and port. If you start from a properly configured ~/.ssh/config file, AiiDA will already replace these placeholders with the correct values. However, if you input the proxy_command value manually, remember to write the hostname and the port and not %h and %p.
  • In the ~/.ssh/config file, you can also insert stdout and stderr redirection, e.g. 2> /dev/null to hide any error that may occur during the proxying/tunneling. However, you should only give AiiDA the actual command to be executed, without any redirection. Again, AiiDA will remove the redirection when it automatically reads the ~/.ssh/config file, but be careful if you enter the content of this field manually.
Connection problems
  • When AiiDA tries to connect to the remote computer, it says paramiko.SSHException: Server u'FULLHOSTNAME' not found in known_hosts

    AiiDA uses the paramiko library to establish SSH connections. paramiko is able to read the remote host keys from the ~/.ssh/known_hosts of the user under which the AiiDA daemon is running. You therefore have to make sure that the key of the remote host is stored in the file.

    • As a first check, log in as the user under which the AiiDA daemon is running and run a:

      ssh FULLHOSTNAME
      

      command, where FULLHOSTNAME is the complete host name of the remote computer configured in AiiDA. If the key of the remote host is not in the known_hosts file, SSH will ask confirmation and then add it to the file.

    • If the above point is not sufficient, check the format of the remote host key. On some machines (we know that this issue happens at least on recent Ubuntu distributions) the default format is not RSA but ECDSA. However, paramiko is still not able to read keys written in this format.

      To discover the format, run the following command:

      ssh-keygen -F FULLHOSTNAME
      

      which will print the remote host key. If the output contains the string ecdsa-sha2-nistp256, then paramiko will not be able to use this key (see below for a solution). If it instead contains ssh-rsa, the key should be fine and paramiko will be able to use it.

      If your key is in ECDSA format, you first have to delete the key using the command:

      ssh-keygen -R FULLHOSTNAME
      

      Then, in your ~/.ssh/config file (create it if it does not exist) add the following lines:

      Host *
        HostKeyAlgorithms ssh-rsa
      

      (use the same indentation, and leave an empty line before and one after). This will set the RSA algorithm as the default one for all remote hosts. If you prefer, you can set the HostKeyAlgorithms attribute only for the relevant computers (see man ssh_config for more information).

      Then, run a:

      ssh FULLHOSTNAME
      

      command. SSH will ask for confirmation and then add it to the file, but this time it should use the ssh-rsa format (it will say so in the prompt message). You can also double-check that the host key was correctly inserted using the ssh-keygen -F FULLHOSTNAME command as described above. Now, the error message should not appear anymore.

Increasing the debug level

By default, the logging level of AiiDA is minimal to avoid filling logfiles. Only warnings and errors are logged (to the ~/.aiida/daemon/log/aiida_daemon.log file), while info and debug messages are discarded.

If you are experiencing a problem, you can change the default minimum logging level of AiiDA messages (and celery messages – celery is the library that we use to manage the daemon process) using, on the command line, the two following commands:

verdi devel setproperty logging.celery_loglevel DEBUG
verdi devel setproperty logging.aiida_loglevel DEBUG

After rebooting the daemon (verdi daemon restart), the number of messages logged will increase significantly and may help in understanding the source of the problem.

Note

In the command above, you can use a different level than DEBUG. The list of the levels and their order is the same as in the standard python logging module.

Note

When the problem is solved, we suggest restoring the default logging level, using the two commands:

verdi devel delproperty logging.celery_loglevel
verdi devel delproperty logging.aiida_loglevel

to avoid filling the logfiles.

Tips to ease the life of the hard drive (for large databases)

These tips are useful when your database is very large, i.e. several hundred thousand nodes and workflows or more. With such large databases the hard drive may be working constantly and the computer may slow down considerably. Below are some solutions to address the most typical causes.

Repository backup

The backup of the repository takes an extremely long time if it is done through standard rsync or backup software, since the repository contains as many folders as the number of nodes plus the number of workflows (and each folder can contain many files!). A solution is to use instead the incremental backup described in the repository backup section.

mlocate cron job

Under typical Linux distributions, there is a cron job (called updatedb.mlocate) running every day to update a database of files and folders, to be used by the locate command. This might become problematic since the repository contains many folders that will be scanned every day. The net effect is a hard drive that is almost constantly working.

To avoid this issue, edit (as root) the file /etc/updatedb.conf and add the path of the AiiDA repository folder to PRUNEPATHS.
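As a sketch, assuming the AiiDA repository lives at /home/aiida/.aiida/repository (a hypothetical path; adapt it to your actual repository location) and appending it to the paths already listed in your file, the line could look like:

PRUNEPATHS="/tmp /var/spool /media /home/aiida/.aiida/repository"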

Using AiiDA in multi-user mode

Note

multi-user mode is still not fully supported, and the way it works will change significantly soon. Do not use unless you know what you are doing.

Todo

To be documented.

Discuss:

  • Security issues

  • Under which linux user (aiida) to run, and remove the pwd with passwd -d aiida.

  • How to setup each user (aiida@localhost for the daemon user, correct email for the others using verdi install --only-config)

  • How to configure a given user (verdi user configure)

  • How to list users (also the --color option, and the meaning of colors)

  • How to setup the daemon user (verdi daemon configureuser)

  • How to start the daemon

  • How to configure the permissions! (all AiiDA users in the same group, and set the ‘chmod -R g+s’ flag on all folders and subfolders of the AiiDA repository; comment that by default we now have a flag (hardcoded to True) in aiida.common.folders to give the group write permissions to files and folders created using the Folder class.)

  • Some configuration example:

    {u'compress': True,
     u'key_filename': u'/home/aiida/.aiida/sshkeys/KEYFILE',
     u'key_policy': u'RejectPolicy',
     u'load_system_host_keys': True,
     u'port': 22,
     u'proxy_command': u'ssh -i /home/aiida/.aiida/sshkeys/KEYFILE USERNAME@MIDDLECOMPUTER /bin/nc FINALCOMPUTER 22',
     u'timeout': 60,
     u'username': u'xxx'}
    
  • Moreover, on the remote computer do:

    ssh-keyscan FINALCOMPUTER
    

    and append the output to the known_hosts file of the aiida daemon account. Do the same also for the MIDDLECOMPUTER if a proxy_command is used.

Developer’s guide

Developer’s guide

Developer’s Guide For AiiDA

Python style

When writing python code, a more than reasonable guideline is given by the Google python styleguide http://google-styleguide.googlecode.com/svn/trunk/pyguide.html. The documentation should be written consistently in the style of sphinx.

And more generally, be verbose! Will you remember after a month why you had to write that check on that line? (Hint: no.) Write comments!

Pylint

You can check your code style and other important code errors by using Pylint. Once installed you can run Pylint from the root source directory on the code using the command:

pylint aiida

The most important part is the summary under the Messages table near the end.

Version number

The AiiDA version number is stored in aiida/__init__.py. Make sure to update it when changing the version number.
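As a sketch (the file contains more than this, and the number below is only an example), the version is defined as a module-level variable:

# in aiida/__init__.py
__version__ = "0.10.0"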

Inline calculations

If an operation is extremely fast to run, it can be done directly in Python, without being submitted to a cluster. However, if this operation takes one (or more) input data nodes and creates new data nodes, the operation itself is not recorded in the database, and provenance is lost. In order to put a Calculation object in between, we define the InlineCalculation class, which is used as the class for these calculations that are run "in-line".

We also provide a wrapper (which also works as a decorator of a function), make_inline(). This can be used to wrap a suitably defined function, so that after its execution, a node representing the execution is stored in the DB, and suitable input and output nodes are stored as well.

Note

See the documentation of this function for further documentation of how it should be used, and of the requirements for the wrapped function.
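As a rough sketch only (not taken from the make_inline() documentation, so check the actual docstring for the precise requirements): the wrapped function is conventionally named with an _inline suffix, is assumed to take Data nodes as keyword arguments and to return a dictionary of new Data nodes, and the decorated call is assumed to return the created calculation together with its outputs. Assuming make_inline() can be imported from aiida.orm.calculation.inline, this could look like:

from aiida.orm.calculation.inline import make_inline
from aiida.orm.data.parameter import ParameterData

@make_inline
def sum_inline(a, b):
    # a and b are assumed to be ParameterData nodes; the function must
    # return a dictionary of new output Data nodes.
    result = a.get_dict()['value'] + b.get_dict()['value']
    return {'sum': ParameterData(dict={'value': result})}

# The decorated call is assumed to return the InlineCalculation node and the
# dictionary of output nodes; check the docstring for the exact return value.
calc, outputs = sum_inline(
    a=ParameterData(dict={'value': 1}),
    b=ParameterData(dict={'value': 2}))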

Database schema
Django

The Django database schema can be found in aiida.backends.djsite.db.models.

If you need to change the database schema follow these steps:

  1. Make all the necessary changes to aiida.backends.djsite.db.models

  2. Create a new migration file. From aiida/backends/djsite, run:

    python manage.py makemigrations
    

    This will create the migration file in aiida/backends/djsite/db/migrations whose name begins with a number followed by some description. If the description is not appropriate, then change it to something better, but retain the number.

  3. Open the generated file and make the following changes:

    from aiida.backends.djsite.db.migrations import update_schema_version
    ...
    SCHEMA_VERSION = # choose an appropriate version number
                     # (hint: higher than the last migration!)
    ...
    class Migration(migrations.Migration):
      ...
      operations = [
        ..
        update_schema_version(SCHEMA_VERSION)
      ]
    
  4. Change the LATEST_MIGRATION variable in aiida/backends/djsite/db/migrations/__init__.py to the name of your migration file:

    LATEST_MIGRATION = '0003_my_db_update'
    

    This lets AiiDA get the version number from your migration and make sure the database and the code are in sync.

  5. Migrate your database to the new version (again from aiida/backends/djsite) by running:

    python manage.py migrate
    
SQLAlchemy

The SQLAlchemy database schema can be found in aiida/backends/sqlalchemy/models.

If you need to change the database schema follow these steps:

  1. Make all the necessary changes to the model that you would like to modify, located in the aiida/backends/sqlalchemy/models directory.

  2. Create new migration file by going to aiida/backends/sqlalchemy and executing:

    ./alembic_manage.py revision "This is a new revision"
    

    This will create a new migration file in aiida/backends/sqlalchemy/migrations/versions whose name begins with an automatically generated hash code, followed by the provided message for this new migration. Of course you can change the migration message to one of your preference. Please look at the generated file and ensure that the migration is correct. If you are in doubt about the operations mentioned in the file or its content, you can have a look at the Alembic documentation.

  3. Your database will be automatically migrated to the latest revision as soon as you run your first verdi command. You can also migrate it manually with the help of the alembic_manage.py script as you can see below.

Overview of alembic_manage.py commands

The alembic_manage.py provides several options to control your SQLAlchemy migrations. By executing:

./alembic_manage.py --help

you will get a full list of the available commands and of the arguments that you can pass. Briefly, the available commands are:

  • upgrade This command allows you to upgrade your database to a later version. For the moment, you can only upgrade to the latest version.
  • downgrade This command allows you to downgrade the version of your database. For the moment, you can only downgrade to the base version.
  • history This command lists the available migrations in chronological order.
  • current This command displays the current version of the database.
  • revision This command creates a new migration file based on the model changes.
Debugging Alembic

Alembic migrations should work automatically and migrate your database to the latest version. However, if you were using SQLAlchemy before we introduced Alembic, you may get a message like the following during the first migration:

sqlalchemy.exc.ProgrammingError: (psycopg2.ProgrammingError) relation
"db_dbuser" already exists [SQL: '\nCREATE TABLE db_dbuser (\n\tid SERIAL
NOT NULL, \n\temail VARCHAR(254), \n\tpassword VARCHAR(128),
\n\tis_superuser BOOLEAN NOT NULL, \n\tfirst_name VARCHAR(254),
\n\tlast_name VARCHAR(254), \n\tinstitution VARCHAR(254), \n\tis_staff
BOOLEAN, \n\tis_active BOOLEAN, \n\tlast_login TIMESTAMP WITH TIME ZONE,
\n\tdate_joined TIMESTAMP WITH TIME ZONE, \n\tCONSTRAINT db_dbuser_pkey
PRIMARY KEY (id)\n)\n\n']

In this case, you should manually create the Alembic version table in your database and add a line with the database version number. To do so, use psql to connect to the desired database:

psql aiidadb_sqla

(you should replace aiidadb_sqla with the name of the database that you would like to modify). Then, execute the following commands:

CREATE TABLE alembic_version (version_num character varying(32) not null, PRIMARY KEY(version_num));
INSERT INTO alembic_version VALUES ('e15ef2630a1b');
GRANT ALL ON alembic_version TO aiida;
Commits and GIT usage

In order to have an efficient management of the project development, we chose to adopt the guidelines for the branching model described here. In particular:

  • The main branch in which one should work is called develop
  • The master branch is reserved for releases: every commit there implies a new release. Therefore, one should never commit directly there (except once per every release).
  • New releases should also be tagged.
  • Any new modification requiring just one commit can be done in develop
  • mid-to-long development efforts should be done in a branch, branching off from develop (e.g. a long bugfix, or a new feature)
  • while working on the branch, often merge the develop branch back into it (if you also have a remote branch and there are no conflicts, that can be done with one click from the GitHub web interface, and then you just do a local ‘git pull’)
  • remember to fix generic bugs in develop (or in a branch to be then merged into develop), not in your local branch (except if the bug is present only in that branch); only then merge develop back into your branch. In particular, if it is a complex bugfix, it is better to have a branch because it allows backporting the fix also to old releases, if we want to support multiple versions
  • only when a feature is ready, merge it back into develop. If it is a big change, it is better to open a pull request on GitHub instead of directly merging, and wait for another (or a few other) developers to accept it beforehand, to be sure it does not break anything.

For a cheatsheet of git commands, see here.
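As a concrete illustration of this workflow (the feature branch name is of course arbitrary):

git checkout develop
git checkout -b my-new-feature        # branch off from develop
# ... work and commit on the branch ...
git merge develop                     # regularly merge develop back into the branch
git checkout develop
git merge my-new-feature              # when the feature is ready (or open a pull request)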

Note

Before committing, always run:

verdi devel tests

to be sure that your modifications did not introduce any new bugs in existing code. Remember to do it even if you believe your modification to be small - the tests run pretty fast!

Pre-commit hooks

Git has a hooks mechanism to run tasks automatically when an event is triggered. We use this mechanism to trigger code checks on every commit. Currently only the aiida.control module is affected, but more parts of the code will be added progressively.

These checks are to prevent syntax or other coding errors from being committed and to enforce style consistency.

Two tools are run on all changed files before allowing a commit:

  • yapf will automatically format your files for you. If it makes any changes, the commit will fail and you will get the opportunity to review them. Once you are done, git add the changed file and commit again.

  • prospector will run multiple linters (mainly pylint) that will do:

    • Syntax checking
    • Static analysis
    • Check for missing docstrings
    • Check for secrets (prevent you from committing passwords, tokens, etc)

    It will output file, line number and helpful messages and suggestions for each problem it finds.

Setting up the hooks is simple:

cd aiida_core
pip install [-e] .[dev_precommit]
pre-commit install
# from now on on every git commit the checks will be run on changed files

When working on parts of the code that are not included in pre-commit tests yet, it is ok to not install the hooks.

When code that fails the pre-commit checks is committed, the checks will run in a continuous integration stage and the commit will fail the tests. Still, sometimes it is necessary to push a work-in-progress state in order to continue working somewhere else; this can be accomplished with git commit --no-verify.

If you want to run the checks without having to call git commit, you can do so using pre-commit run.

Tests
Running the tests

To run the tests, use the:

verdi devel tests

command. You can add a list of tests after the command to run only a selected portion of tests (e.g. while developing, if you discover that only a few tests fail). Use TAB completion to get the full list of tests. For instance, to run only the tests for transport and the generic tests on the database, run:

verdi devel tests aiida.transport db.generic

Furthermore, you need to set up a few things on your local machine to successfully run the tests:

Test profile

To run the tests involving the database, you need to have a special testing profile. A profile is considered a testing profile if the profile name and the database name both start with test_, and the repository path contains test_.

SSH to localhost

For the transport tests, you need to be able to ssh into your local machine (localhost). Here is how this is done for different operating systems:

Linux (Ubuntu)
  • Install openssh-server
  • Create an ssh key (if you don’t have one already), and add it to ~/.ssh/authorized_keys
  • For security reasons, you might want to disallow ssh connections from outside your local machine. To do this, change #ListenAddress 0.0.0.0 to ListenAddress 127.0.0.1 (note the missing #) in /etc/ssh/sshd_config.
  • Now you should be able to type ssh localhost and get a successful connection; a possible command sequence is sketched below.
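A possible sequence of commands for the Ubuntu case described above (assuming you do not have a key yet, and using the default key location) is:

sudo apt-get install openssh-server
ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh localhost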

If your OS was not listed above but you managed to get the ssh connection running, please add the description above.

Install extras

In case you did not install all extras, it is possible that some tests fail due to missing packages. If you installed AiiDA with pip, you can use the following command to get the necessary extras:

pip install -e .[testing]

Where the -e flag means that the code is just linked to the appropriate folder, and the package will update when you change the code.

The test-first approach

Remember that in the best codebases, the tests are actually written even before the code itself, because this helps in having a clear API.

For any new feature that you add/modify, write a test for it! This is extremely important to have the project last and be as bug-proof as possible. Even more importantly, add a test that fails when you find a new bug, and then solve the bug to make the test pass again, so that in the future the bug is not reintroduced.

Remember to make unit tests as atomic as possible, and to document them so that other developers can understand why you wrote that test, in case it should fail after some modification.

Creating a new test

There are three types of tests:

  1. Tests that do not require the usage of the database (testing the creation of paths in k-space, the functionality of a transport plugin, ...)
  2. Tests that require the database, but do not require submission (e.g. verifying that node attributes can be correctly queried, that the transitive closure table is correctly generated, ...)
  3. Tests that require the submission of jobs

For each of the above types of tests, a different testing approach is followed (you can also see existing tests as guidelines of how tests are written):

  1. Tests are written inside the package that one wants to test, creating a test_MODULENAME.py file. For each group of tests, create a new subclass of unittest.TestCase, and then create the tests as methods using the unittest module. Tests inside a selected number of AiiDA packages are automatically discovered when running verdi devel tests. To make sure that your test is discovered, verify that its parent module is listed in the base_allowed_test_folders property of the Devel class, inside aiida.cmdline.commands.devel.

    For an example of this type of tests, see, e.g., the aiida.common.test_utils module.
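    As a minimal sketch of how such a test module could be structured (the module and test names here are purely illustrative, not existing AiiDA tests):

    # test_mymodule.py - placed inside the AiiDA package being tested
    import unittest


    class TestStringHelpers(unittest.TestCase):
        """Example test case; replace with tests for your own module."""

        def test_split_type_string(self):
            # Trivial assertion, standing in for real checks on your code.
            self.assertEqual("data.array.kpoints".split(".")[0], "data")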

  2. In this case, we use the testing functionality of Django, adapted to run smoothly with AiiDA.

    To create a new group of tests, create a new python file under aiida.backends.djsite.db.subtests, and instead of inheriting each class directly from unittest.TestCase, inherit from aiida.backends.djsite.db.testbase.AiidaTestCase. In this way:

    1. The Django testing functionality is used, and a temporary database is used
    2. every time the class is created to run its tests, default data are added to the database, which would otherwise be empty (in particular, a computer and a user; for more details, see the code of the AiidaTestCase.setUpClass() method).
    3. at the end of all tests of the class, the database is cleaned (nodes, links, ... are deleted) so that the temporary database is ready to run the tests of the following test classes.

    Note

    it is extremely important that these tests are run from the verdi devel tests command line interface. Not only will this ensure that a temporary database is used (via Django), but also that a temporary repository folder is used. Otherwise, you risk corrupting your database data. (In the code there are some checks to prevent these classes from being run without the correct environment being prepared by verdi devel tests.)

    Once you create a new file in aiida.backends.djsite.db.subtests, you have to add a new entry to the db_test_list inside the aiida.backends.djsite.db.testbase module in order for verdi devel tests to find it. In particular, the key should be the name that you want to use on the command line of verdi devel tests to run the test, and the value should be the full module name to load. Note that, in verdi devel tests, the string db. is prepended to the name of each test involving the database. Therefore, if you add a line:

    db_test_list = {
      ...
      'newtests': 'aiida.backends.djsite.db.subtests.mynewtestsmodule',
      ...
    }
    

    you will be able to run all the tests inside aiida.backends.djsite.db.subtests.mynewtestsmodule with the command:

    verdi devel tests db.newtests
    

    Note

    If in the list of parameters to verdi devel tests you add also a db parameter, then all database-related tests will be run, i.e., all tests that start with db. (or, if you want, all tests in the db_test_list described above).

  3. These tests require an external engine to submit the calculations and then check the results at job completion. For this we use a continuous integration server, and the best approach is to write suitable workflows to run simulations and then verify the results at the end.

Special tests

Some tests have special routines to ease and simplify the creation of new tests. One case is represented by the tests for transports. In this case, you can define tests for a specific plugin as described above (e.g., see the aiida.transport.plugins.test_ssh and aiida.transport.plugins.test_local tests). Moreover, there is a test_all_plugins module in the same folder. Inside this module, the discovery code is adapted so that each test method defined in that file and decorated with @run_for_all_plugins is run for all available plugins, to avoid rewriting the same test code more than once and to ensure that all plugins behave in the same way (e.g., to copy files, remove folders, etc.).

Virtual environment

Sometimes it’s useful to have a virtual environment that separates out the AiiDA dependencies from the rest of the system. This is especially the case when testing AiiDA against library versions that are different from those installed on the system.

First, install virtualenv using pip:

pip install virtualenv
Basic usage
  1. To create a virtual environment in folder venv, while in the AiiDA directory type:

    virtualenv venv
    

    This puts a copy of the Python executables and the pip library within the venv folder hierarchy.

  2. Activate the environment with:

    source venv/bin/activate
    

    Your shell prompt should now start with (venv).

  3. (optional) Install AiiDA:

    pip install .
    
  4. Deactivate the virtual environment:

    deactivate
    
Deprecated features, renaming, and adding new methods

In case a method is renamed or removed, this is the procedure to follow:

  1. (If you want to rename) move the code to the new function name. Then, in the docstring, add something like:

    .. versionadded:: 0.7
       Renamed from OLDMETHODNAME
    
  2. Don’t remove directly the old function, but just change the code to use the new function, and add in the docstring:

    .. deprecated:: 0.7
       Use :meth:`NEWMETHODNAME` instead.
    

    Moreover, at the beginning of the function, add something like:

    import warnings
    
    warnings.warn(
        "OLDMETHODNAME is deprecated, use NEWMETHODNAME instead",
        DeprecationWarning)
    

    (of course, replace OLDMETHODNAME and NEWMETHODNAME with the correct string, and adapt the strings to the correct content if you are only removing a function, or just adding a new one).
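Putting the two steps together, the result could look like the following sketch (the class and method names are purely illustrative):

import warnings


class MyClass(object):

    def __init__(self, parameters):
        self._parameters = parameters

    def get_parameters(self):
        """Return the parameters.

        .. versionadded:: 0.7
           Renamed from get_params
        """
        return self._parameters

    def get_params(self):
        """Return the parameters.

        .. deprecated:: 0.7
           Use :meth:`get_parameters` instead.
        """
        warnings.warn(
            "get_params is deprecated, use get_parameters instead",
            DeprecationWarning)
        return self.get_parameters()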

AiiDA internals

Node

The AbstractNode class is the basic class that represents all the possible objects in the AiiDA world. More precisely, it is inherited by many classes including (among others) the AbstractCalculation class, representing computations that convert data into a different form, the AbstractCode class, representing executables and file collections that are used by calculations, and the Data class, which represents data that can be input or output of calculations.

Methods & properties

In the following, the most important methods and properties of the AbstractNode class are described.

Node subclasses organization

The AbstractNode class has two important variables:

  • _plugin_type_string characterizes the class of the object.
  • _query_type_string characterizes the class and all its subclasses (by pointing to the package or Python file that contains the class).

The convention for all the AbstractNode subclasses is that if a class B inherits from a class A, then there should be a package A under aiida/orm that has a file __init__.py and a B.py in that directory (or a B package with the corresponding __init__.py).

An example of this is the ArrayData and the KpointsData. ArrayData is placed in aiida/orm/data/array/__init__.py and KpointsData which inherits from ArrayData is placed in aiida/orm/data/array/kpoints.py

This is an implicit & quick way to check the inheritance of the AbstractNode subclasses.

General purpose methods
  • __init__(): The initialization of the Node class can be done by not providing any attributes or by providing a DbNode as initialization. E.g.:

    dbn = a_dbnode_object
    n = Node(dbnode=dbn.dbnode)
    
  • ctime() and mtime() provide the creation and the modification time of the node.

  • is_stored() informs whether a node is already stored to the database.

  • query() queries the database, filtering the results for similar nodes (if the object used is a subclass of AbstractNode) or with no filtering if it is the AbstractNode class itself. Note that for this check _plugin_type_string should be properly set.

  • get_computer() returns the computer associated to this node.

  • _validate() does a validation check for the node. This is important for AbstractNode subclasses where various attributes should be checked for consistency before storing.

  • get_user() returns the user that created the node.

  • _increment_version_number_db(): increment the version number of the node on the DB. This happens when adding an attribute or an extra to the node. This method should not be called by the users.

  • copy() returns an unstored copy of the node with a new UUID that can be edited directly.

  • uuid() returns the universally unique identifier (UUID) of the node.

  • pk() returns the principal key (ID) of the node.

  • dbnode() returns the corresponding Django object.

  • get_computer() & set_computer() get and set the computer to be used & is associated to the node.

Annotation methods

The AbstractNode can be annotated with labels, description and comments. The following methods can be used for the management of these properties.

Label management:

  • label() returns the label of the node. The setter method can be used for the update of the label.
  • _update_db_label_field() updates the label in the database. This is used by the setter method of the label.

Description management:

Comment management:

Folder management

Folder objects represent directories on the disk (virtual or not) where extra information for the node is stored. These folders can be temporary or permanent.

Store & deletion
  • store_all() stores all the input nodes, then it stores the current node and in the end, it stores the cached input links.
  • _store_input_nodes() stores the input nodes.
  • _check_are_parents_stored() checks that the parents are stored.
  • _store_cached_input_links() stores the input links that are in memory.
  • store() checks that the node data is valid, then checks whether the node's parents are stored, then moves the contents of the temporary folder to the repository folder and, in the end, stores in the database the information that is in the cache. The latter happens within a database transaction. In case this transaction fails, the data transferred to the repository folder is moved back to the temporary folder.
  • __del__() deletes the temporary folder; it should be called when an in-memory object is deleted.
DbNode

The DbNode is the Django class that corresponds to the AbstractNode class, allowing the needed information to be stored in and retrieved from the database. Other classes extending the AbstractNode class, like Data, AbstractCalculation and AbstractCode, also use the DbNode code to interact with the database. The main methods are:

  • get_aiida_class() which returns the corresponding AiiDA class instance.
  • get_simple_name() which returns a string with the type of the class (by stripping the path before the class name).
  • attributes() which returns all the attributes of the specific node as a dictionary.
  • extras() which returns all the extras of the specific node as a dictionary.
Folders

AiiDA uses Folder and its subclasses to add an abstraction layer between the functions and methods working directly on the file-system and AiiDA. This is particularly useful when we want to easily change between different folder options (temporary, permanent etc) and storage options (plain local directories, compressed files, remote files & directories etc).

Folder

This is the main class of the available Folder classes. Apart from the abstraction provided to the OS operations needed by AiiDA, one of its main features is that it can restrict all the available operations within a given folder limit. The available methods are:

RepositoryFolder

Objects of this class correspond to the repository folders. The RepositoryFolder specific methods are:

  • __init__() initializes the object with the necessary folder names and limits.
  • get_topdir() returns the top directory.
  • section() returns the section to which the folder belongs. This can be for the moment a workflow or node.
  • subfolder() returns the subfolder within the section/uuid folder.
  • uuid() the UUID of the corresponding node or workflow.
SandboxFolder

SandboxFolder objects correspond to temporary (“sandbox”) folders. The main methods are:

Plugin Development

Basics
What a Plugin Is

An AiiDA plugin is a python package that provides a set of extensions to AiiDA.

AiiDA plugins can use entry points in order to make the aiida_core package aware of the extensions.

Note: In the python community, the term ‘package’ is used rather loosely. Depending on context, it can refer to a collection of python modules or it may, in addition, include the files necessary for building and installing the package.

What a Plugin Can Do
  • add new classes to AiiDA’s unified interface, including:

    • calculations
    • parsers
    • data types
    • schedulers
    • transports
    • db importers
    • db exporters
    • subcommands to some verdi commands
    • tests to be run using verdi devel tests

    This typically involves subclassing the respective base class AiiDA provides for that purpose.

  • install separate commandline and/or GUI executables

  • depend on any number of other plugins (the required versions must not clash with AiiDA’s requirements)

What a Plugin Should Not Do

An AiiDA plugin should not:

  • change the database schema AiiDA uses
  • use protected functions, methods or classes of AiiDA (those starting with an underscore _)
  • circumvent data provenance
  • monkey patch anything within the aiida namespace (or the namespace itself)

Failure to comply will likely prevent your plugin from being listed on the official AiiDA plugin registry.

If you find yourself tempted to do any of the above, please open an issue on the AiiDA repository and explain why. We will advise on how to proceed.

Quickstart

You have a code and would like to use it from AiiDA? You need a special data type, parser, scheduler, ... that is not available? Then you’ll need to write an AiiDA plugin.

Let’s get started with creating a new plugin aiida-mycode.

  1. At least once, install an existing aiida plugin to make sure this works.

  2. Check on the aiida plugin registry that the plugin name is still available

  3. Download the AiiDA plugin template:

    wget https://github.com/aiidateam/aiida-plugin-template/archive/master.zip
    unzip master.zip
    cd aiida-plugin-template
    
  4. Replace the name aiida-plugin-template by aiida-mycode:

    mv aiida_plugin_template aiida_mycode
    sed -i .bak 's/aiida_plugin_template/aiida_mycode/g' README.md setup.json examples/*.py
    sed -i .bak 's/aiida-plugin-template/aiida-mycode/g' README.md setup.json
    sed -i .bak 's/template\./mycode./g' setup.json
    
  5. Install your new plugin:

    workon <name_of_your_virtualenv> # if you have one
    pip install -e .
    reentry scan -r aiida
    

That’s it - now you can import aiida_mycode and start developing your plugin.

A few things to keep in mind:
  • Be sure to update the setup.json, in particular the license and version number
  • Get your plugin listed as soon as possible to reserve your plugin name and to inform others of your ongoing development
Entry Points
What is an Entry Point?

The setuptools package to which pip is a frontend has a feature called entry points. When a distribution which registers entry points is installed, the entry point specifications are written to a file inside the distribution’s .egg-info folder. setuptools provides a package pkg_resources which can find these entry points by distribution, group and/or name and load the data structure to which it points.

This is the way AiiDA finds and loads classes provided by plugins.
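Just as an illustration of the mechanism (in practice AiiDA wraps this in its factory functions, so you normally never need to do it by hand), entry points registered in a given group can be listed with pkg_resources:

from pkg_resources import iter_entry_points

# List all entry points registered in the 'aiida.calculations' group,
# i.e. all calculation classes made available by installed plugins.
for entry_point in iter_entry_points(group='aiida.calculations'):
    print entry_point.name, entry_point.module_name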

AiiDA Entry Points

This document contains a list of entry point groups AiiDA uses, with an example usage for each. In the following, we assume the following folder structure:

aiida-mycode/           - distribution folder
   aiida_mycode/        - toplevel package (from aiida_mycode import ..)
      __init__.py
      calcs/
         __init__.py
         mycode.py      - contains MycodeCalculation
      parsers/
         __init__.py
         mycode.py      - contains MycodeParser
      data/
         __init__.py
         mydat.py       - contains MyData (supports code specific format)
      commands/
         __init__.py
         mydat.py       - contains visualization subcommand for MyData
      workflows/
         __init__.py
         mywf.py        - contains a basic workflow using mycode
      ...
   setup.py             - install script
   setup.json           - install configuration
   ...

For a plugin that uses this folder structure, see the aiida plugin template.

Note, however, that the folder structure inside aiida-mycode/ is entirely up to you. A very simple plugin might look like:

aiida-mysimple/
   aiida_mysimple/
      __init__.py
      simpledata.py
   setup.py
   setup.json

The plugin has to tell AiiDA where to look for the classes to be used as calculations, parsers, transports, etc. This is done inside setup.json by way of the entry_points keyword:

...
entry_points={
   <Entry Point Group>: [
      <Entry Point Specification>,
      ...
   ],
...

It is given as a dictionary containing entry point group names as keywords. The list for each entry point group contains entry point specifications.

A specification in turn is given as a string and consists of two parts, a name and an import path describing where the class is to be imported from. The two parts are separated by an = sign:

"mycode.mydat = aiida_mycode.data.mydat:MyData"

We strongly suggest starting the name of each entry point with the name of the plugin, omitting the leading ‘aiida-‘. In our example this leads to entry point specifications like "mycode.<any.you.want> = <module.path:class>", just like the above example. Exceptions to this rule are schedulers, transports and potentially data types. Further exceptions can be tolerated in order to provide backwards compatibility if the plugin was in use before aiida-0.9 and its modules were installed in locations which do not make it possible to follow this rule.

Below, a list of valid entry points recognized by AiiDA follows.

aiida.calculations

Entry points in this group are expected to be subclasses of aiida.orm.JobCalculation. This replaces the previous method of placing a python module with the class in question inside the aiida/orm/calculation/job subpackage.

Example entry point specification:

entry_points={
   "aiida.calculations": [
      "mycode.mycode = aiida_mycode.calcs.mycode:MycodeCalculation"
   ]
}

aiida_mycode/calcs/mycode.py:

from aiida.orm import JobCalculation
class MycodeCalculation(JobCalculation):
   ...

Will lead to usage:

from aiida.orm import CalculationFactory
calc = CalculationFactory('mycode.mycode')
aiida.parsers

AiiDA expects a subclass of Parser. This replaces the previous approach of placing a parser module under aiida/parsers/plugins.

Example spec:

entry_points={
   "aiida.parsers": [
      "mycode.mycode = aiida_mycode.parsers.mycode:MycodeParser"
   ]
}

aiida_mycode/parsers/mycode.py:

from aiida.parsers import Parser
class MycodeParser(Parser):
   ...

Usage:

from aiida.parsers import ParserFactory
parser = ParserFactory('mycode.mycode')
aiida.data

Group for Data subclasses. Previously located in a subpackage of aiida/orm/data.

Spec:

entry_points={
   "aiida.data": [
      "mycode.mydata = aiida_mycode.data.mydat:MyData"
   ]
}

aiida_mycode/data/mydat.py:

from aiida.orm import Data
class MyData(Data):
   ...

Usage:

from aiida.orm import DataFactory
params = DataFactory('mycode.mydata')
aiida.workflows

For AiiDA workflows. Instead of putting a workflow somewhere under the aiida.workflows package, it can now be packaged as a plugin and exposed to aiida as follows:

Spec:

entry_points={
   "aiida.workflows": [
      "mycode.mywf = aiida_mycode.workflows.mywf:MyWorkflow"
   ]
}

aiida_mycode/workflows/mywf.py:

from aiida.work.workchain import WorkChain
class MyWorkflow(WorkChain):
   ...

Usage:

from aiida.orm import WorkflowFactory
wf = WorkflowFactory('mycode.mywf')
aiida.cmdline

For subcommands to verdi commands like verdi data mydata. This was previously not possible to achieve without editing the aiida source code directly. AiiDA expects each entry point to be either a click.Command or a click.Group.

Plugin support for commands is possible due to using click.

Note

In aiida-0.9, the subcommand in question is not yet exposed to verdi. There is an aiida-verdi package being developed to implement such functionality (still experimental). The command will then be called verdi-exp data mydata instead.

Spec:

entry_points={
   "aiida.cmdline.data": [
      "mydata = aiida_mycode.commands.mydata:mydata"
   ]
}

aiida_mycode/commands/mydata.py:

import click

@click.group()
def mydata():
   """commandline help for mydata command"""

@mydata.command('animate')
@click.option('--format')
@click.argument('pk')
def create_fancy_animation(format, pk):
   """help"""
   ...

Usage:

$ verdi data mydata animate --format=Format PK
aiida.tools.dbexporters

If your plugin adds support for exporting to an external database, use this entry point to have aiida find the module where you define the necessary functions.

aiida.tools.dbimporters

If your plugin adds support for importing from an external database, use this entry point to have aiida find the module where you define the necessary functions.

aiida.tools.dbexporters.tcod_plugins

If you want to support exporting your plugin classes to tcod, use this entry point for your BaseTcodtranslator subclass.

Spec:

entry_points={
    "aiida.tools.dbexporters.tcod_plugins": [
        "myplugin.mycalc = aiida_myplugin.tcod_plugins.mycalc:MycalcTcodtranslator"
    ]
}
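Following the pattern of the other groups, the corresponding module could be sketched as follows. This is only a sketch: it assumes that BaseTcodtranslator can be imported from aiida.tools.dbexporters.tcod_plugins (check the AiiDA source for the exact location):

aiida_myplugin/tcod_plugins/mycalc.py:

from aiida.tools.dbexporters.tcod_plugins import BaseTcodtranslator

class MycalcTcodtranslator(BaseTcodtranslator):
   ...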
aiida.schedulers

For scheduler plugins. Note that the entry point name is not prefixed by the plugin name. This is because typically a scheduler should be distributed in a plugin on its own, and only one plugin per scheduler should be necessary.

Spec:

entry_points={
   "aiida.schedulers": [
      "myscheduler = aiida_myscheduler.myscheduler:MyScheduler"
   ]
}

aiida_myscheduler/myscheduler.py:

from aiida.scheduler import Scheduler
class MyScheduler(Scheduler):
   ...

Usage: The scheduler is used in the familiar way by entering ‘myscheduler’ as the scheduler option when setting up a computer.

aiida.transports

Like schedulers, transports are supposed to be distributed in a separate plugin. Therefore we will again omit the plugin’s name in the entry point name.

Spec:

entry_points={
   "aiida.transports": [
      "mytransport = aiida_mytransport.mytransport:MyTransport"
   ]
}

aiida_mytransport/mytransport.py:

from aiida.transport import Transport
class MyTransport(Transport):
   ...

Usage:

from aiida.transport import TransportFactory
transport = TransportFactory('mytransport')

Just as one would expect, when a computer is set up, mytransport can be given as the transport option.

Documenting a plugin

The AiiDA plugin template already comes with a documentation template that just needs to be adjusted to your needs. In the following we again assume that you wrote a plugin named aiida-compute:

  1. Modify the docs/source/conf.py template configuration file, inserting the information about your plugin:

    • Replace aiida_plugin_template by aiida_compute

    • Replace aiida-plugin-template by aiida-compute

    • Modify intersphinx_mapping adding any other packages that are needed by your plugin

    • Update general information about the project:

      project = u'aiida-compute'
      copyright_first_year = 2017
      copyright_owners = "My Institution, Country"
      
  2. Populate or delete the individual documentation pages:

    docs/source/module_guide/calculations.rst
    docs/source/module_guide/data.rst
    docs/source/module_guide/parsers.rst
    docs/source/user_guide/get_started.rst
    docs/source/user_guide/tutorial.rst
    
  3. Update the indices accordingly:

    docs/source/module_guide/index.rst
    docs/source/user_guide/index.rst
    
  4. Make sure that the AiiDA documentation dependencies sphinx and sphinx_rtd_theme are installed

  5. Generate the html documentation via sphinx:

    cd docs
    make
    
  6. Host your documentation online on ReadTheDocs. Simply sign up and import your project. Make sure to add the path to the requirements file docs/requirements_for_rtd.txt and the Python configuration file docs/source/conf.py in Admin => Advanced settings.

Note: When updating the plugin to a new version, remember to update the version number both in setup.json and aiida_compute/__init__.py.

Publishing a plugin
1. Choose a name

The naming convention for AiiDA plugins is aiida-mycode for the plugin and aiida_mycode for the corresponding python package, leading to the following folder structure:

aiida-mycode/
   aiida_mycode/
      __init__.py

This marks your plugin as an AiiDA package and makes it easy to find on package indices like PyPI.

Note: Python packages cannot contain dashes, thus the underscore.

2. Get Your Plugin Listed

AiiDA plugins should be listed on the AiiDA plugin registry to avoid name-clashes with other plugins.

If you wish to get your plugin listed on the official registry for AiiDA plugins, you should provide the following keyword arguments as key-value pairs in a setup.json or setup.yaml file (a hypothetical example follows the list below). It is recommended to have setup.py read the keyword arguments from that file:

aiida-myplugin/
   aiida_myplugin/
      ...
   setup.py
   setup.json       # or setup.yaml
  • name
  • author
  • author_email
  • description
  • url
  • license
  • classifiers (optional)
  • version
  • install_requires
  • entry_points
  • scripts (optional)
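As a reference, a minimal setup.json could look like the following sketch; all values are placeholders that you should replace with the data of your own plugin:

{
    "name": "aiida-myplugin",
    "version": "0.1.0",
    "author": "A. Author",
    "author_email": "a.author@example.com",
    "description": "AiiDA plugin for the MyCode code",
    "url": "https://github.com/myorg/aiida-myplugin",
    "license": "MIT License",
    "classifiers": ["Programming Language :: Python"],
    "install_requires": ["aiida-core"],
    "entry_points": {
        "aiida.calculations": [
            "myplugin.mycalc = aiida_myplugin.calculations.mycalc:MycalcCalculation"
        ]
    }
}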

Now, fork the plugin registry repository, fill in your plugin’s information in the same fashion as the plugins already registered, and create a pull request. The registry will allow users to discover your plugin using verdi plugin search (note: the latter verdi command is not yet implemented in AiiDA).

3. Get Your Plugin On PyPI

For packaging and distributing AiiDA plugins, we recommend following the existing guidelines for packaging python projects, which include making the plugin available on the python package index (PyPI). This makes it possible for users to simply pip install aiida-myplugin.

Our suggested layout:

aiida-compute/       top folder containing your package and additional files
   aiida_compute/    The package that is to be installed
      __init__.py
      ...
   MANIFEST.in       (optional) lists non-python files to be installed
   README.rst        (optional) description to be used by github etc and PyPI
   setup.py          installation script
   setup.json        contains requirements, metainformation, etc

Note: In principle, a distribution like aiida-compute can contain and install more than one package at a time.

The most user-friendly way to distribute a package is to create such a distribution and upload it to PyPI. Users can then simply install the package(s) by running pip install <distribution-name>.
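A common pattern (used, for instance, by the AiiDA plugin template) is to keep setup.py itself minimal and read all keyword arguments from setup.json. A sketch of such a setup.py, assuming the setup.json layout shown earlier:

# setup.py: a minimal sketch that reads all metadata from setup.json
import json
from setuptools import setup, find_packages

if __name__ == '__main__':
    with open('setup.json', 'r') as handle:
        kwargs = json.load(handle)
    setup(packages=find_packages(), **kwargs)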

Updating an Existing Plugin

This document describes the process of updating an AiiDA plugin written using the old plugin system (pre AiiDA version 0.8) to the current plugin system.

Once the update is complete, make sure to get your plugin listed.

Folder structure

Old plugin system:

aiida/
   orm/
      calculation/
         job/
            myplugin/
               __init__.py
               mycalc.py
               myothercalc.py
   parsers/
      plugins/
         myplugin/
            __init__.py
            myparser.py
            myotherparser.py
   data/
      myplugin/
         __init__.py
         mydata.py
   tools/
      codespecific/
         myplugin/
            __init__.py
            ...

Turns into:

aiida-myplugin/
   aiida_myplugin/
      __init__.py
      calculations/
         __init__.py
         mycalc.py
         myothercalc.py
      parsers/
         __init__.py
         myparser.py
         myotherparser.py
      data/
         __init__.py
         mydata.py
      tools/
         __init__.py
         ...
Entry points

If you are converting a plugin from the old system to the new system, the name of your entry points must correspond to where your plugin module was previously installed inside the AiiDA package. Otherwise, your plugin will not be backwards compatible. For example, if you were using a calculation as:

from aiida.orm.calculation.job.myplugin.mycalc import MycalcCalculation
# or
CalculationFactory('myplugin.mycalc')

Then in setup.py:

setup(
   ...,
   entry_points={
      'aiida.calculations': [
         'myplugin.mycalc = aiida_myplugin.calculations.mycalc:MycalcCalculation'
      ],
      ...
   },
   ...
)

As you see, the name of the entry point matches the argument to the factory method.

import statements

If you haven’t done so already, now would be a good time to search and replace any import statements that refer to the old locations of your modules inside AiiDA. We recommend changing them to absolute imports from your top-level package:

old:

from aiida.tools.codespecific.myplugin.thistool import this_convenience_func

new:

from aiida_myplugin.tools.thistool import this_convenience_func
The AiiDA Plugin System
Introduction

The plugin system is the part of AiiDA that enables sharing workflows and distributing extensions to the core capabilities of AiiDA.

Since both are written in python and shared as source code, we believe they should be distributed in the usual way for python code - python packages using setuptools. This provides a well documented install process familiar to all python users. It simplifies the user experience for sharing workflows and extensions, especially since this allows AiiDA to be distributed and deployed in the same way.

Goals

The goals of the plugin system are the following:

Sharing of workflows and extensions

A workflow or extension is written as a python package, distributed as a zip source archive, python egg or PyPI package. There is extensive documentation available for how to distribute python packages here.

Ease of use

Plugins can be found in an online curated list of plugins and installed with one simple command. This process is familiar to every regular python user.

Decouple development and update cycles of AiiDA and plugins

Since plugins are separate python packages, they can be developed in a separate code repository and updated when the developer sees fit without a need to update AiiDA. Similarly, if AiiDA is updated, plugins may not need to release a new version.

Promote modular design in AiiDA development

Separating plugins into their own python packages ensures that plugins can not (easily) access parts of the AiiDA code which are not part of the public API, enabling AiiDA development to stay agile. The same applies to plugins relying on other plugins.

Low overhead for developers

Plugin developers can write their extensions the same way they would write any python code meant for distribution.

Automatic AiiDA setup and testing of plugins

Installation of complete python environments consisting of many packages can be automated, provided all packages use setuptools as a distribution tool. This enables use of AiiDA in a service-based way using, e.g., docker images. At the same time, it becomes possible to create automated tests for any combination of plugins, as long as the plugins provide test entry points.

Mechanism overview

The new plugin system (introduced in AiiDA 0.9) takes advantage of the already well established entry points mechanism within setuptools, documented in the section “Extensible Applications and Frameworks” in the setuptools documentation. (Previously, plugins had to install python modules directly into specific source folders of AiiDA).

Conceptually, an entry point consists of a group name, an entry point name and a path to the definition of a python object (any object, including modules, classes, functions, variables). A plugin host like AiiDA can iterate through entry points by group, find a specific one by name and load the associated python object. Iterating and finding entry points does not require any python code to be imported. A plugin is a separately-distributed, self-contained python package which implements any number of plugin classes and declares entry points accordingly.

Example

In the following snippet only the most relevant code lines are picked to give an idea of the functioning. We will look only at one type of plugin, calculations, for simplicity.

First of all, AiiDA defines the groups of entry points in aiida_core/setup.py:

# in setuptools.setup() call
entry_points = {
    'aiida.calculations': [...],
    ...
}

AiiDA then provides a callable CalculationFactory which does something equivalent to this:

def CalculationFactory(plugin_name):
   from pkg_resources import iter_entry_points
   entry_points = iter_entry_points('aiida.calculations')
   plugins = [i for i in entry_points if i.name == plugin_name]
   if len(plugins) == 1:
      return plugins[0].load()
   elif len(plugins) > 1:
      raise ValueError("Ambiguous plugin name: " + plugin_name)
   else:
      raise ValueError("Plugin not found: " + plugin_name)

In aiida-myplugin/setup.py:

# in setuptools.setup() call
entry_points = {
    'aiida.calculations': [
        'myplugin.mycalc = aiida_myplugin.calculations.mycalc:MyPluginCalculation',
        ...
    ],
    ...
}

In aiida-myplugin/aiida_myplugin/calculations/mycalc.py:

from aiida.orm import Calculation
class MyPluginCalculation(Calculation):
    ...

In user code:

from aiida.orm import CalculationFactory
Mycalc = CalculationFactory('myplugin.mycalc')
...

Note that the plugin developer can freely choose the code structure as well as the names of the modules and plugin classes. The developer is also free to refactor their code without fear of breaking compatibility, as long as no information stored in the database is changed (note that this unfortunately includes the entry point name and the class name).

Limitations

The chosen approach to plugins has some limitations:

  • In the current version the interface for entry point objects is enforced implicitly by the way the object is used. It is the responsibility of the plugin developer to test for compliance, especially if the object is not derived from the recommended base classes provided by AiiDA. This is to be clearly communicated in the documentation for plugin developers;
  • The freedom of the plugin developer to name and rename classes ends where the information in question is stored in the database as, e.g., node attributes.
  • The system is designed with the possibility of plugin versioning in mind, however this is not implemented yet.
  • In principle, two different plugins can give the same name to an entry point, creating ambiguity when trying to load the associated objects. Plugin development guidelines in the documentation will advise on how to avoid this problem, and this is addressed via the use of a centralized registry of known AiiDA plugins.
  • Plugins can potentially contain malicious or otherwise dangerous code. In the registry of AiiDA plugins, we try to flag plugins that we know are safe to be used.
Developing The Plugin System

Note

This page is intended for people who want to contribute to the plugin system in aiida_core; it is not needed if you just want to contribute a plugin.

Design Principles
  1. Only restrict plugin developers when really necessary;
  2. Avoid schema changes whenever reasonably possible;
  3. Finding and loading plugins must be as fast as the plugin allows, especially for command-line (“cli”) commands. In other words, directly importing a plugin class should not be noticeably faster than using the pluginloader/factory;
  4. Implement as a drop-in replacement, provide backwards compatibility at first, think about changing interfaces if/when the old system is dropped;
  5. Plugin management should be as user-friendly from ipython as from the cli.
Mini-Spec
Terms
plugin_name

A unique name identifying the plugin. Suggested naming scheme is

  • pypi distribution / repo name: aiida-<plugin_name>
  • import name: aiida_<plugin_name>
  • entry point names: <plugin_name>.name
name (entry point)

The entry point for a plugin class looks as follows:

name = <module_path>:<classname>

Therefore, within a plugin category, the name allows us to find a specific plugin (as well as a typestring). The name is recommended to contain the plugin name (as detailed under plugin_name).

category

A name given to each area extensible via plugins, one of

  • calculations
  • data
  • parsers
  • schedulers
  • transports
  • workflows
  • tools.dbexporters
  • tools.dbimporters
  • tools.dbexporters.tcod_plugins

Each category maps to an entry point group called:

aiida.<category>
Interfaces
Pluginloader (aiida/common/pluginloader.py)

The pluginloader relies on the reentry PyPI package to find and load entry points. reentry has been added to setup_requires for AiiDA in order to enable scanning for existing plugins when AiiDA is installed. If for some reason reentry is uninstalled or is not found, the plugin system will fall back on pkg_resources from setuptools, which is slower.

The API docs are found at the following link: Plugin loaders.

Registry Tools (aiida/plugins)

The registry tools API is located here: aiida.plugins documentation.

Developer calculation plugin tutorial - Integer summation

In this chapter we will give you some examples and a brief guide on how to write a plugin to support a new code. We will focus here on a very simple code (that simply adds two numbers), so that we can focus only on how AiiDA manages the calculation. At the end, you will have an overview of how a plugin is developed. You will then be able to proceed to more complex plugin guides, like the guide for the Quantum Espresso plugin, or you can directly jump in and develop your own plugin!

Overview

Before analysing the different components of the plugin, it is important to understand what these components are and how they interact.

We should keep in mind that AiiDA is a tool that allows us to perform calculations easily and to maintain data provenance. That said, it should be clear that AiiDA doesn’t perform the calculations itself, but orchestrates the calculation procedure following the user’s directives. Therefore, AiiDA executes (external) codes and needs to know:

  • where the code is;
  • how to prepare the input for the code. This is called an input plugin or a Calculation subclass;
  • how to parse the output of the code. This is called an output plugin or a Parser subclass.

It is also useful, but not necessary, to have a script that prepares the calculation for AiiDA with the necessary parameters and submits it. Let’s start to see how to prepare these components.

Code

The code is an external program that does a useful calculation for us. For detailed information on how to set up new codes, you can have a look at the respective documentation page.

Imagine that we have the following python code that we want to install. It does the simple task of adding two numbers that are found in a JSON file, whose name is given as a command-line parameter:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import json
import sys

in_file = sys.argv[1]
out_file = sys.argv[2]

with open(in_file) as f:
    in_dict = json.load(f)

out_dict = { 'sum':in_dict['x1']+in_dict['x2'] }

with open(out_file,'w') as f:
    json.dump(out_dict,f)

The result will be stored in JSON format in a file whose name is also passed as a parameter. The resulting file from the script will be handled by AiiDA. The code can be downloaded from here. We will now proceed to prepare an AiiDA input plugin for this code.

Input plugin

In abstract terms, this plugin must provide the following two pieces of information:

  • what are the input data objects of the calculation;
  • how to convert the input data objects into the actual input file required by the external code.

Let’s have a look at the input plugin developed for the aforementioned summation code (a detailed description of the different sections follows):

# -*- coding: utf-8 -*-

from aiida.orm import JobCalculation
from aiida.orm.data.parameter import ParameterData
from aiida.common.utils import classproperty
from aiida.common.exceptions import InputValidationError
from aiida.common.exceptions import ValidationError
from aiida.common.datastructures import CalcInfo, CodeInfo
import json

class SumCalculation(JobCalculation):
    """
    A generic plugin for adding two numbers.
    """

    def _init_internal_params(self):
        super(SumCalculation, self)._init_internal_params()

        self._DEFAULT_INPUT_FILE = 'in.json'
        self._DEFAULT_OUTPUT_FILE = 'out.json'
        self._default_parser = 'sum'

    @classproperty
    def _use_methods(cls):
        """
        Additional use_* methods for the namelists class.
        """
        retdict = JobCalculation._use_methods
        retdict.update({
            "parameters": {
               'valid_types': ParameterData,
               'additional_parameter': None,
               'linkname': 'parameters',
               'docstring': ("Use a node that specifies the input parameters "
                             "for the namelists"),
               },
            })
        return retdict

    def _prepare_for_submission(self,tempfolder, inputdict):
        """
        This is the routine to be called when you want to create
        the input files and related stuff with a plugin.

        :param tempfolder: a aiida.common.folders.Folder subclass where
                           the plugin should put all its files.
        :param inputdict: a dictionary with the input nodes, as they would
                be returned by get_inputs_dict (with the Code!)
        """
        try:
            parameters = inputdict.pop(self.get_linkname('parameters'))
        except KeyError:
            raise InputValidationError("No parameters specified for this "
                                       "calculation")
        if not isinstance(parameters, ParameterData):
            raise InputValidationError("parameters is not of type "
                                       "ParameterData")
        try:
            code = inputdict.pop(self.get_linkname('code'))
        except KeyError:
            raise InputValidationError("No code specified for this "
                                       "calculation")
        if inputdict:
            raise ValidationError("Cannot add other nodes beside parameters")

        ##############################
        # END OF INITIAL INPUT CHECK #
        ##############################

        input_json = parameters.get_dict()

        # write all the input to a file
        input_filename = tempfolder.get_abs_path(self._DEFAULT_INPUT_FILE)
        with open(input_filename, 'w') as infile:
            json.dump(input_json, infile)

        # ============================ calcinfo ================================

        calcinfo = CalcInfo()
        calcinfo.uuid = self.uuid
        calcinfo.local_copy_list = []
        calcinfo.remote_copy_list = []
        calcinfo.retrieve_list = [self._DEFAULT_OUTPUT_FILE]
        calcinfo.retrieve_temporary_list = [['path/hugefiles*[0-9].xml', '.', '1']]

        codeinfo = CodeInfo()
        codeinfo.cmdline_params = [self._DEFAULT_INPUT_FILE,self._DEFAULT_OUTPUT_FILE]
        codeinfo.code_uuid = code.uuid
        calcinfo.codes_info = [codeinfo]

        return calcinfo

The above input plugin can be downloaded from here and should be placed at aiida/orm/calculation/job/sum.py.

In order for the plugin to be automatically discoverable by AiiDA, it is important to:

  • give the right name to the file. This should be the name of your input plugin (all lowercase);
  • place the plugin under aiida/orm/calculation/job;
  • name the class inside the plugin as PluginnameCalculation. For example, the class name of the summation input plugin is, as you see above, SumCalculation. The first letter must be capitalized, the other letters must be lowercase;
  • inherit the class from JobCalculation.

By doing the above, your plugin will be discoverable and loadable using CalculationFactory.
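For example, once the file is in place (and the daemon has been restarted, see below), loading the class through the factory should simply work as:

from aiida.orm import CalculationFactory
SumCalculation = CalculationFactory('sum')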

Note

The base Calculation class should only be used as the abstract base class. Any calculation that needs to run on a remote scheduler must inherit from AbstractJobCalculation, that contains all the methods to run on a remote scheduler, get the calculation state, copy files remotely and retrieve them, ...

Defining the accepted input Data nodes

The input data nodes that the input plugin expects are those returned by the _use_methods class property. It is important to always extend the dictionary returned by the parent class, starting this method with:

retdict = JobCalculation._use_methods

(or the correct parent class, instead of JobCalculation, if you are inheriting from a subclass).

The specific parameters needed by the plugin are defined by the following code snippet:

retdict.update({
    "parameters": {
       'valid_types': ParameterData,
       'additional_parameter': None,
       'linkname': 'parameters',
       'docstring': ("Use a node that specifies the input parameters "
                     "for the namelists"),
       },
    })

This means that this specific summation plugin expects only one input data node, which must be of type ParameterData and have the link name parameters.
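In practice, attaching the input node to a calculation instance then looks like the following short sketch (here calc is an unstored SumCalculation instance, created for example via code.new_calc() as in the submission script later in this section):

from aiida.orm.data.parameter import ParameterData

parameters = ParameterData(dict={'x1': 2, 'x2': 3})
calc.use_parameters(parameters)  # 'parameters' is the link name defined above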

The main plugin logic

The main logic of the plugin (called by AiiDA just before submission, in order to read the AiiDA input data nodes and create the actual input files for the external code) must be defined inside a method _prepare_for_submission, which receives (besides self) two parameters: a temporary folder tempfolder in which content can be written, and a dictionary containing all the input nodes that AiiDA will retrieve from the database (in this way, the plugin does not need to browse the database).

The input data node with the parameter is retrieved using its link name parameters specified above:

parameters = inputdict.pop(self.get_linkname('parameters'))

A few additional checks are performed to retrieve also the input code (the AiiDA node representing the code executable, that we are going to setup in the next section) and verify that there are no unexpected additional input nodes.

The following lines do the actual job, and prepare the input file for the external code, creating a suitable JSON file:

input_json = parameters.get_dict()

# write all the input to a file
input_filename = tempfolder.get_abs_path(self._DEFAULT_INPUT_FILE)
with open(input_filename, 'w') as infile:
    json.dump(input_json, infile)
The last step: the calcinfo

We can now create the calculation info: an object containing some additional information that AiiDA needs (besides the files you generated in the folder) in order to submit the calculation. In the calcinfo object, you need to store the calculation UUID:

calcinfo.uuid = self.uuid

You should also define a list of output files that will be retrieved automatically after the code execution, and that will be stored permanently into the AiiDA database:

calcinfo.retrieve_list = [self._DEFAULT_OUTPUT_FILE]

The entries of the list should either be a string, which corresponds to the filepath of the file on the remote machine, or, if you want to specify a group of files with wildcards, another list containing the following three items (see the short example after this list):

  • Remote path with wildcards, e.g. some/path/bigfiles*[0-9].xml
  • Local path, which should always be '.' when using wildcards
  • Depth, an integer that indicates how many levels of the nested remote folder structure should be kept. For example, with a depth of 0 the matched files will be copied directly into the retrieved folder as bigfiles*[0-9].xml, while with depth=1 the innermost subfolder is kept and the files will be copied as path/bigfiles*[0-9].xml
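For instance, a retrieve_list mixing both forms could look like the following sketch (the file names are placeholders):

calcinfo.retrieve_list = [
    self._DEFAULT_OUTPUT_FILE,                 # a plain relative file path
    ['some/path/bigfiles*[0-9].xml', '.', 1],  # [remote glob, local path, depth]
]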

There is another field that follows exactly the same syntax as the retrieve_list but behaves a little differently.

calcinfo.retrieve_temporary_list = [['some/path/bigfiles*[0-9].xml', '.', 0]]

The difference is that these files will be retrieved and stored in a temporary folder that is only available during the parsing of the calculation. After the parsing is completed, successfully or not, the files will be deleted. This is useful if, during parsing, one wants to analyze the contents of big files and parse a small subset of the data to keep permanently, but does not want to store the raw files themselves, which would unnecessarily increase the size of the repository. The files that are retrieved will be stored in a temporary FolderData and passed as an argument to the parse_with_retrieved method of the Parser class, which is implemented by the specific plugin. It will be passed under the key retrieved_temporary_folder.
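A minimal sketch of how a parser could pick up this temporary folder, assuming (as described above) that it is passed under the key retrieved_temporary_folder:

def parse_with_retrieved(self, retrieved):
    # the key is present only if retrieve_temporary_list was set in the calculation plugin
    temporary_folder = retrieved.get('retrieved_temporary_folder', None)
    if temporary_folder is not None:
        # parse the big files here; they will be deleted once parsing is done
        ...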

For the time being, just define also the following variables as empty lists (we will describe them in the next sections):

calcinfo.local_copy_list = []
calcinfo.remote_copy_list = []

Finally, you need to specify which code executable(s) need to be called and link the code to the codeinfo object. For each code, you need to create a CodeInfo object, specify the code UUID, and define the command line parameters that should be passed to the code as a list of strings (only the parameters after the executable name must be specified; moreover, AiiDA takes care of escaping spaces and other symbols). In our case, our code requires the name of the input file, followed by the name of the output file, so we write:

codeinfo.cmdline_params = [self._DEFAULT_INPUT_FILE,self._DEFAULT_OUTPUT_FILE]

Finally, we link the just created codeinfo to the calcinfo, and return it:

calcinfo.codes_info = [codeinfo]

return calcinfo

Note

calcinfo.codes_info is a list of CodeInfo objects. This makes it possible to support the execution of more than one code, as will be described later.

Note

All content stored in the tempfolder will then be stored into the AiiDA database, potentially forever. Therefore, before generating huge files, you should carefully think about how to design your plugin interface. In particular, have a look at the local_copy_list and remote_copy_list attributes of calcinfo, described in more detail in the Quantum ESPRESSO developer plugin tutorial.

By doing all the above, we have clarified what parameters should be passed to which code, we have prepared the input file that the code will access and we let also AiiDA know the name of the output file: our first input plugin is ready!

Note

A few class internal parameters can (or should) be defined inside the _init_internal_params method:

def _init_internal_params(self):
    super(SumCalculation, self)._init_internal_params()

    self._DEFAULT_INPUT_FILE = 'in.json'
    self._DEFAULT_OUTPUT_FILE = 'out.json'
    self._default_parser = 'sum'

In particular, it is good practice to define the _DEFAULT_INPUT_FILE and _DEFAULT_OUTPUT_FILE attributes (pointing to the default input and output file names – these variables are then used by some verdi commands, such as verdi calculation outputcat). Also, you need to set in _default_parser the name of the default parser that will be invoked when the calculation completes. In the example above, we choose the ‘sum’ parser (which we are going to define later on). If you don’t want to call any parser, set this variable to None.

As a final step, after copying the file to the location specified above, we can check whether AiiDA recognises the plugin by running the command verdi calculation plugins and verifying that our new sum plugin is now listed.

Setup of the code

Now that we know the executable that we want to run, and we have setup the input plugin, we can proceed to configure AiiDA by setting up a new code to execute:

verdi code setup

During the setup phase, you can either configure the code as remote (meaning that you place the python executable in the right folder of the remote computer yourself, and then just tell AiiDA its location), or as local (meaning that you store the python executable in the AiiDA DB during the setup phase, and AiiDA will copy it to the remote computer when needed). In this second case, put sum_executable.py in an empty folder and pass this folder during the setup phase.

Note

In both cases, remember to set the executable flag on the code file by running chmod +x sum_executable.py.

After defining the code, we should be able to see it in the list of our installed codes by typing:

verdi code list

A typical output of the above command is:

$ verdi code list
# List of configured codes:
# (use 'verdi code show CODEID' to see the details)
* Id 73: sum

Here we can see the summation code we just installed. We can further see the specific parameters that we gave when we set up the code by typing:

verdi code show 73

Which will give us an output similar to the following:

$ verdi code show 73
 * PK:             73
 * UUID:           34b44d33-86c1-478b-88ff-baadfb6f30bf
 * Label:          sum
 * Description:    A simple sum executable
 * Default plugin: sum
 * Used by:        0 calculations
 * Type:           local
 * Exec name:      ./sum_executable.py
 * List of files/folders:
   * [file] sum_executable.py
 * prepend text:
   # No prepend text.
 * append text:
   # No append text.

What is important to keep from the above is that we have informed AiiDA for the existence of a code that resides at a specific location and we have also specified the default (input) plugin that will be used.

Output plugin: the parser

In general, it is useful to parse files generated by the code to import relevant data into the database. This has two advantages:

  • we can store information in specific data classes to facilitate their use (e.g. crystal structures, parameters, ...)
  • we can then make use of efficient database queries if, e.g., output quantities are stored as integers or floats rather than as strings in a long text file.

The following is a sample output plugin for the summation code, described in detail later:

# -*- coding: utf-8 -*-

from aiida.orm.calculation.job.sum import SumCalculation
from aiida.parsers.parser import Parser
from aiida.parsers.exceptions import OutputParsingError
from aiida.orm.data.parameter import ParameterData

import json

class SumParser(Parser):
    """
    This class is the implementation of the Parser class for Sum.
    """
    def parse_with_retrieved(self, retrieved):
        """
        Parses the datafolder, stores results.
        For this simple code, it just stores in the DB a ParameterData node
        containing the parsed result (the sum).
        """

        successful = True
        # select the folder object
        # Check that the retrieved folder is there
        try:
            out_folder = retrieved[self._calc._get_linkname_retrieved()]
        except KeyError:
            self.logger.error("No retrieved folder found")
            return False, ()

        # check what is inside the folder
        list_of_files = out_folder.get_folder_list()
        # at least the stdout should exist
        if self._calc._DEFAULT_OUTPUT_FILE not in list_of_files:
            successful = False
            self.logger.error("Output json not found")
            return successful,()

        try:
            with open( out_folder.get_abs_path(self._calc._DEFAULT_OUTPUT_FILE) ) as f:
                out_dict = json.load(f)
        except ValueError:
            successful=False
            self.logger.error("Error parsing the output json")
            return successful,()

        # save the arrays
        output_data = ParameterData(dict=out_dict)
        link_name = self.get_linkname_outparams()
        new_nodes_list = [(link_name, output_data)]

        return successful,new_nodes_list

As mentioned above the output plugin will parse the output of the executed code at the remote computer and it will store the results to the AiiDA database.

All the parsing code is enclosed in a single method, parse_with_retrieved, which receives a single parameter retrieved, a dictionary of retrieved nodes. The default behavior is that a single FolderData node is created, which can be obtained using:

out_folder = retrieved[self._calc._get_linkname_retrieved()]

We then read and parse the output file that will contain the result:

with open( out_folder.get_abs_path(self._calc._DEFAULT_OUTPUT_FILE) ) as f:
    out_dict = json.load(f)

Note

all parsers have a self._calc attribute that points to the calculation being parsed. This is automatically set in the parent Parser class.

After loading the code result data into the dictionary out_dict, we construct a ParameterData object (ParameterData(dict=out_dict)) that will be linked to the calculation in the AiiDA graph and later stored in the database:

output_data = ParameterData(dict=out_dict)
link_name = self.get_linkname_outparams()
new_nodes_list = [(link_name, output_data)]

return successful,new_nodes_list

Note

Parsers should not store nodes manually. Instead, they should return a list of output unstored nodes (together with a link name string, as shown above). AiiDA will then take care of storing the node, and creating the appropriate links in the DB.

Note

The self.get_linkname_outparams() is a string automatically defined in all Parser classes and subclasses. In general, you can have multiple output nodes with any name, but it is good practice to also have one output node with link name self.get_linkname_outparams() and of type ParameterData. The reason is that this node is the one exposed through the calc.res interface (for instance, later we will be able to get the result using print calc.res.sum).

The above output plugin can be downloaded from here and should be placed at aiida/parsers/plugins/sum.py.

Note

Before continuing, it is important to restart the daemon, so that it can recognize the new files added to the aiida code and use the new plugins. To do so, run:

verdi daemon restart
Submission script

It’s time to calculate how much 2+3 is! We need to submit a new calculation. To this aim, we don’t necessarily need a submission script, but it definitely facilitates the calculation submission. A very minimal sample script follows (other examples can be found in the aiida/examples/submission folder):

#!/usr/bin/env runaiida
# -*- coding: utf-8 -*-
import sys
import os

from aiida.common.exceptions import NotExistent
ParameterData = DataFactory('parameter')

# The name of the code setup in AiiDA
codename = 'sum'
computer_name = 'localhost'

################################################################
try:
    dontsend = sys.argv[1]
    if dontsend == "--dont-send":
        submit_test = True
    elif dontsend == "--send":
        submit_test = False
    else:
        raise IndexError
except IndexError:
    print >> sys.stderr, ("The first parameter can only be either "
                          "--send or --dont-send")
    sys.exit(1)

code = Code.get_from_string(codename)
# The following line is only needed for local codes, otherwise the
# computer is automatically set from the code
computer = Computer.get(computer_name)

# These are the two numbers to sum
parameters = ParameterData(dict={'x1':2,'x2':3})

calc = code.new_calc()
calc.label = "Test sum"
calc.description = "Test calculation with the sum code"
calc.set_max_wallclock_seconds(30*60) # 30 min
calc.set_computer(computer)
calc.set_withmpi(False)
calc.set_resources({"num_machines": 1})

calc.use_parameters(parameters)

if submit_test:
    subfolder, script_filename = calc.submit_test()
    print "Test submit file in {}".format(os.path.join(
        os.path.relpath(subfolder.abspath),
        script_filename
        ))
else:
    calc.store_all()
    calc.submit()
    print "submitted calculation; calc=Calculation(uuid='{}') # ID={}".format(
        calc.uuid,calc.dbnode.pk)

What is important to note in the script above is the definition of the code to be used:

codename = 'sum'
code = Code.get_from_string(codename)

and the definition of the parameters:

parameters = ParameterData(dict={'x1':2,'x2':3})
calc.use_parameters(parameters)

If everything is done correctly, by running the script a new calculation will be generated and submitted to AiiDA (to run the script, remember to change its permissions with chmod +x filename first, and then run it with ./scriptname.py). When the code finishes its execution, AiiDA will retrieve the results, parse and store them back to the AiiDA database using the output plugin. You can download the submission script from here.

Conclusion

We have just managed to write our first AiiDA plugin! What is important to remember is that:

  • AiiDA doesn’t know how to execute your code. Therefore, you have to set up your code (with verdi code setup) and let AiiDA know how to prepare the data that will be given to the code (input plugin or calculation) and how to handle the result of the code (output plugin or parser).
  • you need to pass the actual data for the calculation you want to submit, either in the interactive shell or via a submission script.

As usual, we can see the executed calculations by doing a verdi calculation list. To see the calculations of the last day:

$ verdi calculation list -a -p1
# Last daemon state_updater check: 0h:00m:06s ago (at 20:10:31 on 2015-10-20)
# Pk|State        |Creation|Sched. state|Computer   |Type
327 |FINISHED     |4h ago  |DONE        |localhost  |sum

and we can see the result of the sum by running in the verdi shell the following commands (change 327 with the correct calculation PK):

>>> calc = load_node(327)
>>> print calc.res.sum
<<< 5

So we verified that, indeed, 2+3=5.

Developer data plugin tutorial - Float summation

Now that you have written your first AiiDA plugin, we can try to extend it to see how to introduce different types of parameters and how the plugins have to be modified to accommodate these changes.

Introducing a new data type

We will start by describing what a data plugin is, and by creating a new one.

A data plugin is a subclass of Data. All you have to do is define a subclass with a suitable name inside the aiida/orm/data folder (with the same naming convention as Calculation plugins): the class should be called NameData (with Name being a name of your choice) and placed in an aiida/orm/data/name.py file. In the class, you should provide the methods that the end user will use to store high-level objects (for instance, for a crystal structure, there can be a method for setting the unit cell, one for adding an atom in a given position, ...). Internally, you should choose where to store the content. There are two options:

  • In the AiiDA database. This is useful for small amounts of data, that you plan to query. In this case, use self._set_attr(attr_name, attr_value) to store the required value.
  • In the AiiDA file repository (as a file on the disk). This is suitable for big files and quantities that you do not want to query. In this case, access the folder using self.folder and use the methods of self.folder to create files, subfolders, ...

Of course, it is also good practice to provide “getter” methods to retrieve the data from the database and return it to the user. The idea is that the user should operate only through the methods you provide, and should not need to know how you decided to store the data inside the AiiDA database.

As a simple example that we will use for the exercise below, imagine that we want to introduce a new type of data node that simply stores a float number. We will call it FloatData, and the class implementation can look like this:

from aiida.orm.data import Data

class FloatData(Data):

    @property
    def value(self):
        """
        The value of the Float
        """
        return self.get_attr('number')

    @value.setter
    def value(self,value):
        """
        Set the value of the Float

        :raise ValueError:
        """
        self._set_attr('number', float(value))

This file should be placed under aiida/orm/data/float.py and it should extend the class Data.
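A quick sketch of how an end user would then interact with this class, for example in a verdi shell:

from aiida.orm.data.float import FloatData

fl = FloatData()
fl.value = 3.5    # stored internally as the 'number' attribute
fl.store()
print fl.value    # prints 3.5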

Exercise: Modifying the calculation plugin

Your exercise consists in creating a new code plugin (let’s call it, for instance, floatsum) that also performs the sum, but accepts as input two FloatData nodes and returns a FloatData node containing the sum.

Below, you will find some hints on the parts that you need to modify with respect to the previous tutorial, which used ParameterData both for inputs and outputs.

Note

remember to create copies of your files with a new name floatsum.py instead of sum.py, and to change the class name accordingly.

Changes to the parser

The plugin should now return a FloatData instead of a ParameterData, therefore the parser code should contain something like the following:

output_data = FloatData()
output_data.value = out_dict["sum"]
linkname = 'output_data'
Changes to the input plugin

To be able to run your new FloatsumParser, you will need the corresponding input plugin (FloatsumCalculation). The first modification is then to link to the correct parser class:

self._default_parser = 'floatsum'

For consistency, we also want the input plugin to accept two FloatData nodes instead of a single ParameterData. Therefore, you have to update the retdict object accordingly:

retdict.update({
    "float_data_1": {
       'valid_types': FloatData,
       'additional_parameter': None,
       'linkname': 'float_data_1',
       'docstring': ("The first addend"),
       },
    "float_data_2": {
       'valid_types': FloatData,
       'additional_parameter': None,
       'linkname': 'float_data_2',
       'docstring': ("The second addend"),
       },
    })

You then need to change the main code to use the values obtained from the two nodes, rather than from a single node as before. This should be easy, so we leave this task to you. Note that we plan to use the same python code to actually perform the sum, so the JSON file to be generated should have the same format.

We also suggest that you add utility methods (to the benefit of the end user) to provide the addends to the code, something like:

def set_addend1(self, value):
    fl = FloatData()
    fl.value = value
    self.use_float_data_1(fl)

and similarly for the second addend.

The final input plugin should be placed at aiida/orm/calculation/job/floatsum.py.

Code

The python code that actually performs the calculation does not need to be modified. We can reuse the same file, but we suggest setting up a new code in AiiDA, with a different name, using the floatsum plugin as the default plugin.

Submission script

Finally, adapt your submission script to create the correct input nodes, and try to perform a sum of two numbers to verify that you did all correctly!

Note

After placing your files, do not forget to restart the daemon so that it will recognize the files! The same should be done if you make any change to the plugin; otherwise, the daemon may have cached the old file and will keep using it.

Developer code plugin tutorial - Quantum Espresso

In this section we will focus on AiiDA’s Quantum Espresso plugin, which we are going to analyse to show how a physics-oriented plugin is developed. It is assumed that you have already tried to run an example of Quantum Espresso, and that you know more or less how the AiiDA interface works. We hope that in the end you will be able to replicate the task for other codes.

Note

While we keep this documentation here, as it is a guide on how to write a plugin, we mention that since AiiDA 0.10 the aiida-quantumespresso plugin has been moved to a different repository. You can find the documentation (including the tutorials on how to run Quantum ESPRESSO calculations with the plugins that we describe here) on the respective Read The Docs page.

In fact, when writing your own plugin, keep in mind that you need to satisfy multiple users, and the interface needs to be simple (the code underneath does not have to be). But always try to follow the Zen of Python:

Simple is better than complex.

Complex is better than complicated.

Readability counts.

As demonstrated in previous sections, there will be two kinds of plugins: the input and the output plugin. The former has the purpose of converting python objects into text inputs that can be executed by external software. The latter converts the text output of this software back into python dictionaries/objects that can be put back in the database.

InputPlugin

Create a new file, which has the same name as the class you are creating (in this way, it will be possible to load it with CalculationFactory). Save it in a subfolder at the path aiida/orm/calculation/job.

Step 1: inheritance

First define the class:

class SubclassCalculation(JobCalculation):

(Substitute Subclass with the name of your plugin.) Take care to inherit from the JobCalculation class, or the plugin will not work.

Now, you will likely need to define some variables that belong to SubclassCalculation. In order to be sure that you don’t lose any variables belonging to the inherited class, every subclass of calculation needs to have a method which is called _init_internal_params(). An example of it would look like:

def _init_internal_params(self):
    super(SubclassCalculation, self)._init_internal_params()

    self.A_NEW_VARIABLE = 'nabucco'

This function is called by the __init__ method and initializes the variable A_NEW_VARIABLE at instantiation. The second line calls the _init_internal_params() of the parent class and loads any other variables defined there. You are then able to access the variable A_NEW_VARIABLE in the rest of the class as self.A_NEW_VARIABLE.

Note

Even if you don’t need to define new variables, it is safer to define the method with the call to super().

Note

It is not recommended to rewrite an __init__ by yourself: this method is inherited from the classes Node and Calculation, and you shouldn’t alter it unless you really know the code down to the lowest-level.

Note

The following is a list of relevant parameters you may want to (re)define in _init_internal_params:

  • self._default_parser: set to the string of the default parser to be used, in the form accepted by the plugin loader (e.g., for the Quantum ESPRESSO plugin for phonons, this would be “quantumespresso.ph”, loaded from the aiida.parsers.plugins module).
  • self._DEFAULT_INPUT_FILE: specify here the relative path to the filename of the default file that should be shown by verdi calculation inputcat --default. If not specified, the default value is None and the command will not accept the --default option, but it will instead always ask for a specific path name.
  • self._DEFAULT_OUTPUT_FILE: same as _DEFAULT_INPUT_FILE, but for the default output file.
Step 2: define input nodes

First, you need to specify what are the objects that are going to be accepted as input to the calculation class. This is done by the class property _use_methods. An example is as follows:

from aiida.common.utils import classproperty

class SubclassCalculation(JobCalculation):

    def _init_internal_params(self):
        super(SubclassCalculation, self)._init_internal_params()

    @classproperty
    def _use_methods(cls):
        retdict = JobCalculation._use_methods
        retdict.update({
            "settings": {
               'valid_types': ParameterData,
               'additional_parameter': None,
               'linkname': 'settings',
               'docstring': "Use an additional node for special settings",
               },
            "pseudo": {
               'valid_types': UpfData,
               'additional_parameter': 'kind',
               'linkname': cls._get_pseudo_linkname,
               'docstring': ("Use a remote folder as parent folder (for "
                             "restarts and similar"),
               },
            })
        return retdict

    @classmethod
    def _get_pseudo_linkname(cls, kind):
        """
        Return the linkname for a pseudopotential associated to a given
        structure kind.
        """
        return "pseudo_{}".format(kind)

After this piece of code is written, we now have defined two methods of the calculation that specify what DB object could be set as input (and draw the graph in the DB). Specifically, here we will find the two methods:

calculation.use_settings(an_object)
calculation.use_pseudo(another_object,'object_kind')

What did we do?

  1. We added implicitly the two new use_settings and use_pseudo methods (because the dictionary returned by _use_methods now contains a settings and a pseudo key)

  2. We did not lose the use_code call defined in the Calculation base class, because we are extending Calculation._use_methods. Therefore: don’t specify a code as input in the plugin!

  3. use_settings will accept only one parameter, the node specifying the settings, since the additional_parameter value is None.

  4. use_pseudo will require two parameters instead, since the additional_parameter value is not None. If the second parameter is passed via kwargs, its name must be ‘kind’ (the value of additional_parameter). That is, you can call use_pseudo in one of the two following ways:

    use_pseudo(pseudo_node, 'He')
    use_pseudo(pseudo_node, kind='He')
    

    to associate the pseudopotential node pseudo_node (that you must have loaded before) to helium (He) atoms.

  5. The type of the node that you pass as first parameter will be checked against the type (or the tuple of types) specified with valid_types (the check is internally done using the isinstance python call).

  6. The name of the link is taken from the linkname value. Note that if additional_parameter is None, this is simply a string; otherwise, it must be a callable that accepts one single parameter (the further parameter passed to the use_XXX function) and returns a string with the proper name. This functionality is provided to have a single use_XXX method to define more than one input node, as it is the case for pseudopotentials, where one input pseudopotential node must be specified for each atomic species or kind.

  7. Finally, docstring contains the documentation of the function, which the user can obtain by printing, e.g., use_pseudo.__doc__.

Note

The actual implementation of the use_pseudo method in the Quantum ESPRESSO tutorial is slightly different, as it allows the user to specify a list of kinds that are associated with the same pseudopotential file (while in the example above only one kind string can be passed).

Step 3: prepare a text input

How are the input nodes used internally? Every plugin class is required to have the following method:

def _prepare_for_submission(self,tempfolder,inputdict):

This function is called by the daemon when it is trying to create a new calculation.

There are two arguments:

  1. tempfolder: an object of kind SandboxFolder, which behaves exactly like a folder. In this sandbox you are going to write the input files; its content will then be copied to the remote cluster.

  2. inputdict: contains all the input data nodes as a dictionary, in the same format that is returned by the get_inputs_dict() method, i.e. with a link name as key and the corresponding node as value.

Changed in version 0.5: inputdict should contain all input Data nodes and the code (this is what the get_inputs_dict() method returns). In older versions, the code was not present.

In general, you simply want to do:

inputdict = self.get_inputs_dict()

right before calling _prepare_for_submission. The reason for having this passed explicitly is that the plugin does not have to perform database queries itself; moreover, this makes it possible to test the submission without the need to store all nodes in the DB.

For the sake of clarity, it is probably easier to look at an implemented example. Take a look at the NamelistsCalculation located in aiida_quantumespresso.calculations.namelists (this is not in aiida-core but in the aiida-quantumespresso package - you can find the links to this repository and its documentation at the top of this page).

How does the method _prepare_for_submission work in practice?

  1. You should start by checking whether the input nodes passed in inputdict are logically sufficient to run an actual calculation. Remember to raise an exception (for example InputValidationError) if something is missing or if something unexpected is found. It is better to discover now that something is missing, rather than waiting in the queue on the cluster only to see that your job has crashed. Also, if some nodes are left unused, you are going to leave a DB that is more complicated than it needs to be, and therefore it is better to stop the calculation now.

  2. create an input file (or more if needed). In the Namelist plugin is done like:

    input_filename = tempfolder.get_abs_path(self._DEFAULT_INPUT_FILE)
    with open(input_filename,'w') as infile:
        # Here write the information of a ParameterData inside this
        # file
    

    Note that here everything depends on how you decided the ParameterData should be written. In the namelists plugin we adopted the convention that a ParameterData of the format:

    ParameterData(dict={"INPUT":{'smearing':2,
                                 'cutoff':30}
                        })
    

    is written in the input file as:

    &INPUT
        smearing = 2,
        cutoff=30,
    /
    

    Of course, it’s up to you to decide a convention which defines how to convert the dictionary to the input file. You can also impose some default values for simplicity. For example, the location of the scratch directory, if needed, should be imposed by the plugin and not by the user, and similarly you can/should decide the naming of output files.

    Note

    It is convenient to avoid hard-coding all the variables that your code accepts. The convention stated above is sufficient for all inputs structured as fortran cards, without the need to know which variables are accepted. Hard-coding variable names implies that every time the external software is updated, you need to modify the plugin: in practice the plugin will easily become obsolete if poorly maintained. Ease of maintenance here wins over user comfort!

  3. copy inside this folder any auxiliary files that reside on your local machine, such as pseudopotentials.

  4. return a CalcInfo object.

    This object contains some accessory information. Here’s a template of what it may look like:

    calcinfo = CalcInfo()
    
    calcinfo.uuid = self.uuid
    calcinfo.local_copy_list = local_copy_list
    calcinfo.remote_copy_list = remote_copy_list
    
    calcinfo.retrieve_list = []
    ### Modify here !
    calcinfo.retrieve_list.append('Every file/folder you want to store back locally')
    ### Modify here!
    calcinfo.retrieve_temporary_list = []
    ### Modify here!
    calcinfo.retrieve_singlefile_list = []
    
    ### Modify here and put a name for standard input/output files
    codeinfo = CodeInfo()
    codeinfo.cmdline_params = settings_dict.pop('CMDLINE', [])
    codeinfo.stdin_name = self._DEFAULT_INPUT_FILE
    codeinfo.stdout_name = self._DEFAULT_OUTPUT_FILE
    codeinfo.withmpi = self.get_withmpi()
    codeinfo.code_pk = code.pk
    
    calcinfo.codes_info = [codeinfo]
    
    return calcinfo
    

    There are a couple of things to be set on calcinfo.

    1. local_copy_list: a list of length-two tuples ('localabspath', 'relativedestpath'): files to be copied from the AiiDA server to the cluster.
    2. remote_copy_list: a list of tuples ('remotemachinename', 'remoteabspath', 'relativedestpath'): files/folders to be copied from a remote source to a remote destination, both sitting on the same machine.
    3. retrieve_list: a list of relative file pathnames that will be copied from the cluster to the AiiDA server after the calculation has run on the cluster. Note that these are not absolute path names (you don’t know the name of the folder where the job will run) but paths relative to the scratch folder (a short sketch of these lists is given right after this list).
    4. retrieve_temporary_list: a list of relative file pathnames that will be copied from the cluster to the AiiDA server after the calculation has run on the cluster, but that will only be available during parsing. After parsing has completed, successfully or not, the copied files are deleted. This is useful if one needs to retrieve big files that are required for the parsing, but that one does not want to store permanently in the repository. The input format follows exactly the same rules as the retrieve_list.
    5. retrieve_singlefile_list: a list of triplets, in the form ['linkname_from_calc_to_singlefile', 'subclass_of_singlefile', 'filename']. If this is specified, at the end of the calculation a SinglefileData-like object will be created in the database as a child of the calculation, provided of course the file is found on the cluster.
    6. codes_info: a list of CodeInfo objects (see below), each describing how a code has to be called on the command line. Every element in this list corresponds to a call to a code that will be executed in the same scheduling job. This can be useful if a code needs to execute a short preprocessing. For long preprocessing, consider developing a separate plugin.
    7. codes_run_mode: a string, only necessary if you want to run more than one code in the same scheduling job. It determines the order in which the multiple codes are run (i.e. sequentially or all at the same time) and assumes one of the values of aiida.common.datastructures.code_run_modes, like code_run_modes.PARALLEL or code_run_modes.SERIAL.
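
    As a hedged illustration of points 1-3 (all file paths, the machine name and the extra retrieved file below are made up), these lists might be filled as follows:

    local_copy_list = [
        # (localabspath, relativedestpath)
        ('/home/aiida/pseudos/Si.UPF', 'pseudo/Si.UPF'),
    ]
    remote_copy_list = [
        # (remotemachinename, remoteabspath, relativedestpath)
        ('mycluster', '/scratch/someuser/prev_calc/out/', './out/'),
    ]
    calcinfo.local_copy_list = local_copy_list
    calcinfo.remote_copy_list = remote_copy_list
    # paths in retrieve_list are relative to the scratch folder of the job
    calcinfo.retrieve_list = [self._DEFAULT_OUTPUT_FILE, 'data-file.xml']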

    A CodeInfo object, as said before, describes how a code has to be executed. The list of CodeInfo objects passed to CalcInfo determines the ordered execution of one (or more) calls to executables. The attributes that can be set on a CodeInfo are (a short sketch follows the list):

    1. stdin_name: the name of the standard input file.
    2. stdout_name: the name of the standard output file.
    3. cmdline_params: command line parameters, such as parallelization flags, that will be used when running the code.
    4. stderr_name: the name of the error output file.
    5. withmpi: whether the code has to be called with MPI or not.
    6. code_pk: the pk of the code associated to the CodeInfo instance.
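
    The following hedged sketch shows how two executables could be chained in the same scheduling job; preprocessing_code and main_code are hypothetical Code nodes, and the command line flag is made up:

    from aiida.common.datastructures import code_run_modes

    pre_codeinfo = CodeInfo()
    pre_codeinfo.cmdline_params = ['--preprocess']      # made-up flag
    pre_codeinfo.code_pk = preprocessing_code.pk        # hypothetical Code node

    main_codeinfo = CodeInfo()
    main_codeinfo.stdin_name = self._DEFAULT_INPUT_FILE
    main_codeinfo.stdout_name = self._DEFAULT_OUTPUT_FILE
    main_codeinfo.code_pk = main_code.pk                # hypothetical Code node

    calcinfo.codes_info = [pre_codeinfo, main_codeinfo]
    calcinfo.codes_run_mode = code_run_modes.SERIAL     # run one executable after the other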

    If you need to change other settings to make the plugin work, you likely need to add more information to the calcinfo than what we showed here. For the full definition of CalcInfo() and CodeInfo(), refer to the source aiida.common.datastructures.

That’s what is needed to write an input plugin. To test that everything is done properly, remember to use the calculation.submit_test() method, which locally creates the folder to be sent to the cluster, without actually submitting the calculation.

OutputPlugin

Well done! You now have a working input plugin. Now we are going to see what you need to do for an output plugin. First of all, create a new folder, $path_to_aiida/aiida/parsers/plugins/the_name_of_new_code, and put an empty __init__.py file there. In this folder you will write the output parser class in a new python file. It is actually a rather simple class, performing only a few (but tedious) tasks.

After the calculation has been computed and retrieved from the cluster, that is, at the moment when the parser is going to be called, the calculation has two children: a RemoteData and a FolderData. The RemoteData is an object which represents the scratch folder on the cluster: you don’t need it for the parsing phase. The FolderData is the folder in the AiiDA server which contains the files that have been retrieved from the cluster. Moreover, if you specified a retrieve_singlefile_list, at this stage there are also going to be some children of the SinglefileData kind.

Let’s say that you copied the standard output into the FolderData. The parser then has just a few tasks:

  1. open the files in the FolderData
  2. read them
  3. convert the information into objects that can be saved in the Database
  4. return the objects and the linkname.

Note

The parser should not save any object in the DB; that is a task of the daemon: never use the .store() method!

Basically, you just need to specify an __init__() method and a parse_with_retrieved(self, retrieved) method, which does the actual work.

The difficult and lengthy part is point 3, the actual parsing stage, which converts text into python objects. Here, you should try to parse as much as you can from the output files: the more information you parse, the more useful the results will be.

Note

You should not only parse physical values: exceptions or other errors occurring in the calculation are very important pieces of information that can be used by workflows. You can save them in a dedicated key of the dictionary (say ‘warnings’); a workflow can later easily read the exceptions from the results and perform a dedicated correction!
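
For instance, a parsed-output dictionary including such a key might look like the following minimal sketch (the values are purely illustrative):

out_dict = {
    'energy': -123.45,
    'energy_units': 'eV',
    'warnings': ['Maximum CPU time exceeded.'],
}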

In principle, you can save the information in an arbitrary number of objects. The most useful classes to store the information back into the DB are:

  1. ParameterData: the DB representation of a python dictionary. If you put everything in a single ParameterData, its content can easily be accessed from the calculation with the .res method. If you have to store arrays, large lists, or matrices, consider using ArrayData instead.
  2. ArrayData: if you need to store large arrays of values, for example a list of points or a molecular dynamics trajectory, we strongly encourage you to use this class (see the short sketch after this list). At variance with ParameterData, the values are not stored in the DB, but are written to a file (and mapped back in the DB). If instead you store large arrays of numbers in the DB with ParameterData, you might soon realize that: a) the DB grows very rapidly; b) the time it takes to save an object in the DB becomes very long.
  3. StructureData: if your code relaxes an input structure, you can end up with an output structure.
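
As a minimal sketch of the ArrayData usage mentioned in point 2 (the array name and shape are illustrative):

import numpy
from aiida.orm.data.array import ArrayData

trajectory = ArrayData()
# each named array is written to a file in the repository, not stored directly in the DB
trajectory.set_array('positions', numpy.zeros((100, 3)))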

Of course, you can create new classes to be stored in the DB, and use them at your own advantage.

A kind of template for writing such a parser for the calculation class NewCalculation is as follows:

# Imports needed by this template; adjust the paths to your AiiDA version,
# and import your NewCalculation class from wherever you defined it (the
# commented import below is only an example path):
import os

from aiida.common.datastructures import calc_states
from aiida.common.exceptions import InvalidOperation, ParsingError
from aiida.orm.data.parameter import ParameterData
from aiida.parsers.parser import Parser
# from aiida.orm.calculation.job.newcode import NewCalculation


class NewParser(Parser):
    """
    A doc string
    """

    def __init__(self,calc):
        """
        Initialize the instance of NewParser
        """
        # check for valid input
        if not isinstance(calc, NewCalculation):
            raise ParsingError("Input calc must be a NewCalculation")

        super(NewParser, self).__init__(calc)

    def parse_with_retrieved(self, retrieved):
        """
        Parses the calculation-output datafolder, and stores
        results.

        :param retrieved: a dictionary of retrieved nodes, where the keys
            are the link names of retrieved nodes, and the values are the
            nodes.
        """
        # check the calc status, not to overwrite anything
        state = self._calc.get_state()
        if state != calc_states.PARSING:
            raise InvalidOperation("Calculation not in {} state"
                                   .format(calc_states.PARSING))

        # retrieve the whole list of input links
        calc_input_parameterdata = self._calc.get_inputs(node_type=ParameterData,
                                                         also_labels=True)

        # then look for parameterdata only
        input_param_name = self._calc.get_linkname('parameters')
        params = [i[1] for i in calc_input_parameterdata if i[0] == input_param_name]
        if len(params) != 1:
            # Use self.logger to log errors, warnings, ...
            # This will also add an entry to the DbLog table associated
            # to the calculation that we are trying to parse, that can
            # then be seen using 'verdi calculation logshow'
            self.logger.error("Found {} input parameters instead of one"
                              .format(len(params)))
            return False, ()
        calc_input = params[0]

        # Check that the retrieved folder is there
        try:
            out_folder = retrieved[self._calc._get_linkname_retrieved()]
        except KeyError:
            self.logger.error("No retrieved folder found")
            return False, ()

        # check what is inside the folder
        list_of_files = out_folder.get_folder_list()
        # at least the stdout should exist
        if self._calc._DEFAULT_OUTPUT_FILE not in list_of_files:
            # in a real plugin you would raise your plugin-specific parsing exception here
            raise ParsingError("Standard output not found")
        # get the path to the standard output
        out_file = os.path.join(out_folder.get_abs_path('.'),
                                self._calc._DEFAULT_OUTPUT_FILE)


        # read the file
        with open(out_file) as f:
            out_file_lines = f.readlines()

        # call the raw parsing function. It is expected to return a
        # dictionary with all keys and values parsed from the out_file (i.e. energy, forces, etc...)
        # and a boolean indicating whether the calculation was successful or not.
        # In practice, this is the function deciding the final status of the calculation
        # (parse_raw_output is a raw-parsing function you have to write yourself)
        out_dict, successful = parse_raw_output(out_file_lines)

        # convert the dictionary into an AiiDA object, here a
        # ParameterData for instance
        output_params = ParameterData(dict=out_dict)

        # prepare the list of output nodes to be returned
        # this must be a list of tuples having 2 elements each: the name of the
        # linkname in the database (the one below, self.get_linkname_outparams(),
        # is defined in the Parser class), and the object to be saved
        new_nodes_list = [ (self.get_linkname_outparams(),output_params) ]

        # The calculation state will be set to failed if successful=False,
        # to finished otherwise
        return successful, new_nodes_list

Parser warnings policy

As a rule of thumb, always include two keys in the output parameters of a calculation, warnings and parser_warnings. These two keys contain a list of messages (strings) that are useful for debugging problems in the execution of calculations. Below are the guidelines for the usage of the keys warnings and parser_warnings in the output parameters of a calculation.

Warnings

These should be devoted to warnings or error messages relative to the execution of the code. As a (non-exhaustive) list of examples, for Quantum-ESPRESSO, run-time messages such as

  • Maximum CPU time exceeded.
  • c_bands:  2 eigenvalues not converged
  • Not enough space allocated for radial FFT
  • The scf cycle did not reach convergence.
  • The FFT is incommensurate: some symmetries may be lost.
  • Error in routine [...]

should be put in the warnings. In the above cases the warning messages are directly copied from the output of the code, but a warning can also be elaborated by the parser when it finds out that something strange went on during the execution of the code. For QE an example is QE pw run did not reach the end of the execution.

Among the code-based warnings, some can be identified as ‘critical’, meaning that when present the calculation should be set to the FAILED state. There should be an internal list in the parser, e.g. critical_messages, defining such critical warnings. Other non-critical warnings instead might be used to signal possible sources of trouble that nevertheless did not prevent the calculation from being considered FINISHED.
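
A minimal sketch of this policy (the message strings are just examples, the warnings list is assumed to have been collected while scanning the output file, and out_dict is the parsed-output dictionary mentioned earlier):

critical_messages = ['Maximum CPU time exceeded.',
                     'The scf cycle did not reach convergence.']

warnings = ['The FFT is incommensurate: some symmetries may be lost.',
            'Maximum CPU time exceeded.']

# a single critical warning is enough to mark the calculation as FAILED
successful = not any(w in critical_messages for w in warnings)
out_dict['warnings'] = warnings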

Parser_warnings

These should be reserved for warnings occurring during parsing, i.e. when the parser does not find a piece of information it was looking for in the output files. For Quantum-ESPRESSO (PW), for instance (implemented in the aiida-quantumespresso package), examples are

  • Skipping the parsing of the xml file.
  • Error while parsing for energy terms.
  • etc.

Therefore, these warnings should be placed just to notify that the output was not found in the way the developer had expected, and they signal the necessity of improving the parser code.

How to extend the AiiDA REST API

The AiiDA REST API is made of two main classes:

  • App, inheriting flask.Flask. The latter represents any Flask web app, including REST APIs.
  • Api, inheriting flask_restful.Api. This represents the API itself.

Once both the Api and App classes are instantiated into, say, api and app, the two objects have to be coupled by adding app as one of the attributes of api. As we will see in a moment, we provide a function that, among other things, does exactly this.

In a Flask API the resources, e.g. Nodes, Kpoints, etc., are represented by flask_restful.Resource-derived classes.

If you need to include additional endpoints besides those built in the AiiDA REST API you should:

  • create the resource classes that will be bound to the new endpoints;
  • extend the class Api into a user-defined class to register the new endpoints.
  • (Optional) Extend App into a user-defined class for finer customization.

Let’s provide a minimal example through which we add the endpoint /new-endpoint supporting two HTTP methods:

  • GET: retrieves the latest created ParameterData object and returns its id, ctime in ISO 8601 format, and attributes.
  • POST: creates a ParameterData object with placeholder attributes, stores it, and returns its id.

Let’s assume you’ve put the code in the file example.py, reading:

#!/usr/bin/env python
from aiida.restapi.api import AiidaApi, App
from aiida.restapi.run_api import run_api
from flask_restful import Resource


class NewResource(Resource):
    """
    resource containing GET and POST methods. Description of each method
    follows:

    GET: returns id, ctime, and attributes of the latest created ParameterData.

    POST: creates a ParameterData object, stores it in the database,
    and returns its newly assigned id.

    """

    def get(self):
        from aiida.orm.querybuilder import QueryBuilder
        from aiida.orm.data.parameter import ParameterData

        qb = QueryBuilder()
        qb.append(ParameterData,
                  project=['id', 'ctime', 'attributes'],
                  tag='pdata')
        qb.order_by({'pdata': {'ctime': "desc"}})
        result = qb.first()

        # Results are returned as a dictionary; datetime objects are
        # serialized as ISO 8601
        return dict(id=result[0],
                    ctime=result[1].isoformat(),
                    attributes=result[2])

    def post(self):
        from aiida.orm.data.parameter import ParameterData

        params = dict(property1="spam", property2="egg")
        paramsData = ParameterData(dict=params).store()

        return {'id': paramsData.pk}


class NewApi(AiidaApi):

    def __init__(self, app=None, **kwargs):
        """
        This init serves to add new endpoints to the basic AiiDA Api

        """
        super(NewApi, self).__init__(app=app, **kwargs)

        self.add_resource(NewResource, '/new-endpoint/', strict_slashes=False)


# Standard boilerplate to run the api
import sys
import aiida.restapi.common as common
config_dir = common.__path__[0]

if __name__ == '__main__':
    """
    Run the app accepting arguments.

    Ex:
     python example.py --host=127.0.0.2 --port=6000 --config-dir '<path_to_config.py>'

    Defaults:
     address: 127.0.0.1:5000,
     config directory: <aiida_path>/aiida/restapi/common
    """

    run_config = dict(
        hookup=True,
        default_config_dir=config_dir,
        default_host='127.0.0.1',
        default_port='5000',
        parse_aiida_profile=False,
    )

    run_api(App, NewApi, *sys.argv[1:], **run_config)

Let us dissect the previous code explaining each part. First things first: the imports.

from aiida.restapi.api import AiidaApi, App
from aiida.restapi.run_api import run_api
from flask_restful import Resource

To start with, we import the base classes to be extended/employed: AiidaApi and App. For simplicity, it is advisable to also import the method run_api, as it provides an interface to configure the Api, parse command-line arguments, and couple the two classes representing the Api and the App. However, you can also refer to the documentation of flask_restful to configure and hook up an Api through its built-in methods.

Then we define a class representing the additional resource:

class NewResource(Resource):
    """
    resource containing GET and POST methods. Description of each method
    follows:

    GET: returns id, ctime, and attributes of the latest created ParameterData.

    POST: creates a ParameterData object, stores it in the database,
    and returns its newly assigned id.

    """

    def get(self):
        from aiida.orm.querybuilder import QueryBuilder
        from aiida.orm.data.parameter import ParameterData

        qb = QueryBuilder()
        qb.append(ParameterData,
                  project=['id', 'ctime', 'attributes'],
                  tag='pdata')
        qb.order_by({'pdata': {'ctime': "desc"}})
        result = qb.first()

        # Results are returned as a dictionary; datetime objects are
        # serialized as ISO 8601
        return dict(id=result[0],
                    ctime=result[1].isoformat(),
                    attributes=result[2])

    def post(self):
        from aiida.orm.data.parameter import ParameterData

        params = dict(property1="spam", property2="egg")
        paramsData = ParameterData(dict=params).store()

        return {'id': paramsData.pk}

The class NewResource contains two methods: get and post. The names chosen for these functions are not arbitrary but are fixed by Flask to identify the functions that respond to HTTP requests of type GET and POST, respectively. In other words, when the API receives a GET (POST) request to the URL new-endpoint, the function NewResource.get() (NewResource.post()) will be executed. The HTTP response is constructed around the data returned by these functions. The data, which are packed as dictionaries, are serialized by Flask as a JSON stream. All the Python built-in types can be serialized by Flask (e.g. int, float, str, etc.), whereas for the serialization of custom types we refer you to the Flask documentation. The documentation of Flask is also the main source of information for topics such as customization of HTTP responses, construction of custom URLs (e.g. accepting parameters), and more advanced serialization issues.

Whenever you need to handle errors, consider using the AiiDA REST API-specific exceptions already defined in aiida.restapi.common.exceptions. The reason will become clear later in this section.

Once the new resource is defined, we have to register it to the API by assigning it one (or more) endpoint(s). This is done in the __init__() of NewApi by means of the method add_resource():

class NewApi(AiidaApi):

    def __init__(self, app=None, **kwargs):
        """
        This init serves to add new endpoints to the basic AiiDA Api

        """
        super(NewApi, self).__init__(app=app, **kwargs)

        self.add_resource(NewResource, '/new-endpoint/', strict_slashes=False)

The main (if not the only) purpose of overriding the __init__() method is to register new resources with the API. In fact, the general form of __init__() is meant to be:

class NewApi(AiidaApi):

    def __init__(self, app=None, **kwargs):

        super(NewApi, self).__init__(app=app, **kwargs)

        self.add_resource( ... )
        self.add_resource( ... )
        self.add_resource( ... )

        ...

In the example, indeed, the only characteristic line is self.add_resource(NewResource, '/new-endpoint/', strict_slashes=False). The method add_resource() itself is defined and documented in flask_restful.

Finally, the main code configures and runs the API, thanks to the method run_api():

# Standard boilerplate to run the api
import sys
import aiida.restapi.common as common
config_dir = common.__path__[0]

if __name__ == '__main__':
    """
    Run the app accepting arguments.

    Ex:
     python example.py --host=127.0.0.2 --port=6000 --config-dir '<path_to_config.py>'

    Defaults:
     address: 127.0.0.1:5000,
     config directory: <aiida_path>/aiida/restapi/common
    """

    run_config = dict(
        hookup=True,
        default_config_dir=config_dir,
        default_host='127.0.0.1',
        default_port='5000'
    )

    run_api(App, NewApi, *sys.argv[1:], **run_config)

The method run_api() accomplishes several tasks: it couples the API to an instance of flask.Flask, the fundamental Flask class representing a web app; the app is then configured and, if required, hooked up. The spirit of run_api is to take all the ingredients needed to set up an API and use them to build a command-line utility that serves to hook it up.

It requires as inputs:

  • the classes representing the Api and the App. We strongly suggest passing to run_api() the class aiida.restapi.api.App, inheriting from flask.Flask, as it correctly handles AiiDA REST API-specific exceptions.
  • a tuple of positional arguments representing the command-line arguments/options (notice the use of sys.argv);
  • a dictionary of key-value arguments to set the default values of the command line options, e.g. --port, --host, --config-dir and --aiida-profile. If no default is set, the app will use 5000, 127.0.0.1, aiida.restapi.common and False, respectively.

You should know a few more things before using the script:

  • If you want to further customize the error handling, you can take inspiration from the definition of App and create your own derived class NewApp(App).
  • The option hookup of the configuration dictionary must be set to True to use the script to start the API from command line. Below, we will show when it is appropriate to set hookup=False.
  • the supported command line options are identical to those of verdi restapi. Use verdi restapi --help for their full documentation. If you want to add more options or modify the existing ones, create your own custom runner, taking inspiration from run_api.

It is time to run example.py. Type in a terminal

$ chmod +x example.py
$ ./example.py --host=127.0.0.2 --port=6000

You should read the message

$  * Running on http://127.0.0.2:6000/ (Press CTRL+C to quit)

To route a request to the API from a terminal you can employ curl. Alternatively, you can use any REST client providing a GUI. Let us first ask for the latest created node through the GET method:

$ curl http://127.0.0.2:6000/api/v2/new-endpoint/ -X GET

The form of the output (and only the form) should resemble

$ {"attributes": {"binding_energy_per_substructure_per_unit_area_units": "eV/ang^2", "binding_energy_per_substructure_per_unit_area": 0.0220032273047497}, "ctime": "2017-04-05T16:01:06.227942+00:00", "id": 403504}

, whereas the actual values of the response dictionary as well as the internal structure of the attributes field will be in general very different.

Now, let us create a node through the POST method, and check it again through GET:

$ curl http://127.0.0.2:6000/api/v2/new-endpoint/ -X POST
$ {"id": 410618}
$ curl http://127.0.0.2:6000/api/v2/new-endpoint/ -X GET
$ {"attributes": {"property1": "spam", "property2": "egg"}, "ctime": "2017-06-20T15:36:56.320180+00:00", "id": 410618}

The POST request triggers the creation of a new ParameterData node, as confirmed by the response to the GET request.

As a final remark, there might be circumstances in which you do not want to hook up the API from command line. For example, you might want to expose the API through Apache for production, rather than the built-in Flask server. In this case, you can invoke run_api to return two custom objects app and api.

run_config = dict(
    hookup=False,
    catch_internal_server=False,
)

(app, api) = run_api(App, NewApi, *sys.argv[1:], **run_config)

This snippet of code becomes the fundamental block of a wsgi file used by Apache, as documented in How to run the REST API through Apache. Moreover, we recommend consulting the documentation of mod_wsgi.

Notice that we have set hookup=False and catch_internal_server=False. The app no longer needs to be hooked up because Apache will do the job for us. The second option, instead, is not mandatory but potentially useful: it lets the exceptions thrown during the execution of the app propagate all the way through until they reach the Apache logger. Especially when the app is not entirely stable yet, one would like to read the full python error traceback in the Apache error log.

Workflow’s Guide For AiiDA

Creating new workflows

New user specific workflows should be importable by python. The simplest thing is to put them in aiida/workflows/user. A better option, if the workflow is general enough to be of interest for the community, is to create a new AiiDA plugin containing the workflow and install it (you can check the documentation on how to make new plugin repositories).

In the first case, put __init__.py files in all subdirectories of aiida/workflows/user to be able to import any workflow. Also, it may be a good idea to create a specific workflow factory to easily load the workflows of the subdirectory. To do so, place the following in the __init__.py file of your main workflow directory (e.g. aiida/workflows/user/myname/__init__.py):

from aiida.orm.workflow import Workflow

def MynameWorkflowFactory(module):
    """
    Return a suitable Workflow subclass for the workflows defined here.
    """
    from aiida.common.pluginloader import BaseFactory

    return BaseFactory(module, Workflow, "aiida.workflows.user.myname")

In this example, a workflow located in e.g. aiida/workflows/user/myname/foldername/plugin.py can be loaded simply by typing:

MynameWorkflowFactory('foldername.plugin')

Note

The class name of the workflow should be compliant with the BaseFactory syntax. In the above example, it should be called PluginWorkflow otherwise the workflow factory won’t work.

You can also customize your verdi shell by adding this function to the modules to be loaded automatically – see here for more information.

Developer Workflow tutorial

Creating new workflows

In this section we are going to write a very simple AiiDA workflow. Before starting this tutorial, we assume that you have successfully completed the Developer calculation plugin tutorial and have your input and output plugins ready to use with this tutorial.

This tutorial creates a workflow for the addition of three numbers. Each number can be an integer or a float. All three numbers will be passed as parameters to the workflow in dictionary format (e.g. {"a": 1, "b": 2.2, "c": 3}).

To demonstrate how a workflow works, we will perform the sum of three numbers in two steps:

  1. Step 1: temp_value = a + b
  2. Step 2: sum = temp_value + c

A workflow in AiiDA is a python script with several user defined functions called steps. All AiiDA functions are available inside “steps” and calculations or sub-workflows can be launched and retrieved. The AiiDA daemon executes a workflow and handles all the operations starting from script loading, error handling and reporting, state monitoring and user interaction with the execution queue. The daemon works essentially as an infinite loop, iterating several simple operations:

  1. It checks the running step in all the active workflows, if there are new calculations attached to a step it submits them.
  2. It retrieves all the finished calculations. If there is a step of a workflow in which all the calculations have correctly finished, it reloads the workflow and executes the next step as indicated in the script.
  3. If a workflow’s next step is the exit one, the workflow is terminated and the report is closed.

Note

Since the daemon is aware only of the classes present at the time of its launch, make sure you restart the daemon every time you add a new workflow, or modify an existing one. To restart the daemon, use the following command:

verdi daemon restart

Let’s start to write a workflow step by step. First we have to import some packages:

from aiida.common import aiidalogger
from aiida.orm.workflow import Workflow
from aiida.orm import Code, Computer
from aiida.orm.data.parameter import ParameterData
from aiida.common.exceptions import InputValidationError

In order to write a workflow, we must create a class that extends the Workflow class from aiida.orm.workflow. This is a fundamental requirement, since subclassing is the way AiiDA understands whether a class inside the file is an AiiDA workflow or a simple utility class. In the class, you need to re-define an __init__ method as shown below (in the current code version, this is a requirement). Create a new file with the same name as the class you are creating (in this way, it will be possible to load it with WorkflowFactory), in this case addnumbers.py, with the following content:

class AddnumbersWorkflow(Workflow):
    """
    This workflow takes 3 numbers as an input and gives
    its addition as an output.
    Workflow steps:
    passed parameters: a,b,c
    1st step: a + b = step1_result
    2nd step: step1_result + c = final_result
    """

    def __init__(self, **kwargs):
        super(AddnumbersWorkflow, self).__init__(**kwargs)

Once the class is defined, a user can add methods to generate calculations, download structures, or compute new structures starting from a query on previous AiiDA calculations present in the DB. Here we will add a simple helper function to validate the input parameters, i.e. the dictionary with keys a, b and c. All dictionary values should be of type integer or float.

def validate_input(self):
    """
    Check if the passed parameters are of type int or float
    else raise exception
    """
    # get parameters passed to workflow when it was
    # initialised. These parameters can not be modified
    # during an execution
    params = self.get_parameters()

    for k in ['a','b','c']:
        try:
            # check if value is int or float
            if not (isinstance(params[k], int) or isinstance(params[k], float)):
                raise InputValidationError("Value of {} is not of type int or float".format(k))
        except KeyError:
            raise InputValidationError("Missing input key {}".format(k))

    # add in report
    self.append_to_report("Starting workflow with params: {0}".format(params))

In the above method we have used the append_to_report workflow method. Once the workflow is launched, user interactions are limited to a few events (stop, relaunch, listing of the calculations), so most of the time it is very useful to have custom messages during the execution. Hence, the workflow is equipped with a reporting facility, self.append_to_report(string), which the user can fill with any text and retrieve both live and at the end of the execution.

Now we will add the method to launch the actual calculations. We have already done this as part of plugin exercise and hence we do not discuss it in detail here.

    def get_calculation_sum(self, a, b):
        """
        launch new calculation
        :param a: number
        :param b: number
        :return: calculation object, already stored
        """
        # get code/executable file
        codename = 'sum'
        code = Code.get_from_string(codename)

        computer_name = 'localhost'
        computer = Computer.get(computer_name)

        # create new calculation
        calc = code.new_calc()
        calc.set_computer(computer)
        calc.label = "Add two numbers"
        calc.description = "Calculation step in a workflow to add more than two numbers"
        calc.set_max_wallclock_seconds(30*60) # 30 min
        calc.set_withmpi(False)
        calc.set_resources({"num_machines": 1})

        # pass input to the calculation
        parameters = ParameterData(dict={'x1': a,'x2':b,})
        calc.use_parameters(parameters)

        # store calculation in database
        calc.store_all()
        return calc

Now we will write the first step, which is one of the main components of the workflow. In the example below, the start method is decorated with Workflow.step, making it a special kind of method that is automatically stored in the database as a container of calculations and sub-workflows.

    @Workflow.step
    def start(self):
        """
        Addition for first two parameters passed to workflow
        when it was initialised
        """

        try:
            self.validate_input()
        except InputValidationError:
            self.next(self.exit)
            return

        # get first parameter passed to workflow when it was initialised.
        a = self.get_parameter("a")
        # get second parameter passed to workflow when it was initialised.
        b = self.get_parameter("b")

        # start first calculation
        calc = self.get_calculation_sum(a, b)

        # add in report
        self.append_to_report("First step calculation is running...")

        # attach calculation in workflow to access in next steps
        self.attach_calculation(calc)

        # go to next step
        self.next(self.stage2)

Several functions are available to the user when coding a workflow step, and in the above method we have used basic ones discussed below:

  • self.get_parameters(): with this method we can retrieve the parameters passed to the workflow when it was initialized. Parameters cannot be modified during an execution, while attributes can be added and removed.
  • self.attach_calculation(calc): this is a key point in the workflow, and something possible only inside a step method. Every JobCalculation, generated in the method itself or retrieved from other utility methods, is attached to the workflow’s step. They are then launched and executed completely by the daemon, without the need for user interaction. Any number of calculations can be attached. The daemon will poll the servers until all the step’s calculations are finished, and only after that will it call the next step.
  • self.next(Workflow.step): this is the final part of a step, where the user tells the engine what to do after all the calculations in the step (and possible sub-workflows, as we will see later) have terminated. The argument of this function has to be a Workflow.step-decorated method of the same workflow class, or, if this is the last step to be executed, the common method self.exit, which is always present in each Workflow subclass. Note that while this call typically occurs at the end of the function, this is not required, and you can call next() as soon as you can decide which method should follow the current one. As can be seen above, you can use some python logic (if, ...) to decide what the next method is going to be (above, we directly point to self.exit if the input is invalid).

Note

remember to call self.next(self.stage2) and NOT self.next(self.stage2())!! In the first case, we are correctly passing the method stage2 to next. In the second case we are instead immediately running the stage2 method, something we do not want to do (we need to wait for the current step to finish), and passing its return value to self.next (which is wrong).

The above start step calls the validate_input() method to validate the input parameters. When the workflow is launched through the start method, the AiiDA daemon will load the workflow, execute the step, launch all the calculations and monitor their state.

Now we will create a second step to retrieve the sum of the first two numbers from the first step, and then add the third input number. Once all the calculations in the start step are finished, the daemon will load and execute the next step, i.e. stage2, shown below:

    @Workflow.step
    def stage2(self):
        """
        Get result from first calculation and add third value passed
        to workflow when it was initialised
        """
        # get third parameter passed to workflow when it was initialised.
        c = self.get_parameter("c")
        # get result from first calculation
        start_calc = self.get_step_calculations(self.start)[0]

        # add in report
        self.append_to_report("Result of first step calculation is {}".format(
            start_calc.res.sum))

        # start second calculation
        result_calc = self.get_calculation_sum(start_calc.res.sum, c)

        # add in report
        self.append_to_report("Second step calculation is done..")

        # attach calculation in workflow to access in next steps
        self.attach_calculation(result_calc)

        # go to next step
        self.next(self.stage3)

The new feature used in the above step is:

  • self.get_step_calculations(Workflow.step): anywhere after the first step we may need to retrieve and analyze calculations executed in a previous step. With this method we can access the list of calculations of a specific workflow step, passed as an argument.

Now in the last step of the workflow we will retrieve the results from stage2 and exit the workflow by calling self.next(self.exit) method:

    @Workflow.step
    def stage3(self):
        """
        Get the result from second calculation and add it as final
        result of this workflow
        """
        # get result from second calculation
        second_calc = self.get_step_calculations(self.stage2)[0]

        # add in report
        self.append_to_report("Result of second step calculation is {}".format(
            second_calc.res.sum))

        # add workflow result
        self.add_result('value',second_calc.res.sum)

        # add in report
        self.append_to_report("Added value to workflow results")

        # Exit workflow
        self.next(self.exit)

The new features used in the above step are:

  • self.add_result(): when all calculations are done it is useful to tag some of them as results, using a custom string that can later be searched and retrieved. Similarly to get_step_calculations, this method works on the entire workflow and not on a single step.
  • self.next(self.exit): this is the final part of each workflow. Every workflow inherits a fictitious step called exit that can be set as the next step of any step. As the name suggests, reaching it implies the workflow execution finished correctly.
Running a workflow

After saving the workflow inside a python file (i.e. addnumbers.py) located in the aiida/workflows directory, we can launch the workflow by simply instantiating the specific workflow class and executing its start() method inside the verdi shell or in a python script (with the AiiDA framework loaded).

Note

Don’t forget to (re)start your daemon at this point!

In this case, let’s use the verdi shell. In the shell we execute:

AddnumbersWorkflow = WorkflowFactory("addnumbers")
params = {"a":2, "b": 1.4, "c": 1}
wobject = AddnumbersWorkflow(params=params)
wobject.store()
wobject.start()

In the above example we initialized the workflow with the input parameters as a dictionary. The WorkflowFactory will work only if you gave the correct names both to the python file and to the class. Otherwise, you can just substitute that line with a suitable import like:

from aiida.orm.workflows.addnumbers import AddnumbersWorkflow

We launched the workflow using the start() method after storing it. Since start is a decorated workflow step, the workflow is added to the execution queue monitored by the AiiDA daemon.

We now need to know what is going on. There are basically two main ways to see the workflows that are running: by printing the workflow list or a single workflow report.
  • Workflow list

    From the command line we run:

    >> verdi workflow list
    

    This will list all the running workflows, showing the state of each step and each calculation (and, when present, each sub-workflow). It is the fastest way to have a snapshot of what your AiiDA workflow daemon is working on. An example output right after the AddnumbersWorkflow submission should be:

    + Workflow AddnumbersWorkflow (pk: 76) is RUNNING [0h:00m:14s ago]
    |-* Step: start [->stage2] is RUNNING
    | | Calculation ('Number sum', pk: 739) is TOSUBMIT
    |
    

    The pk number of each workflow is reported: a unique ID identifying that specific execution of the workflow, which is necessary to retrieve it at any later time (as explained in the next point).

  • Workflow report

    As explained, each workflow is equipped with a reporting facility the user can use to log any intermediate information, useful to debug the state or show some details. Moreover the report is also used by AiiDA as an error reporting tool: in case of errors encountered during the execution, the AiiDA daemon will copy the entire stack trace in the workflow report before halting its execution. To access the report we need the specific pk of the workflow. From the command line you would run:

    verdi workflow report PK_NUMBER
    

    while from the verdi shell the same operation requires using the get_report() method:

    >> load_workflow(PK_NUMBER).get_report()
    

    In both variants, PK_NUMBER is the pk number of the workflow we want the report of. The load_workflow function loads a Workflow instance from its pk number, or from its uuid (given as a string).

    Once launched, the workflows will be handled by the daemon until the final step or until some error occurs. In the last case, the workflow gets halted and the report can be checked to understand what happened.

  • Workflow result

    As explained, when all the calculations are done it is useful to tag some nodes or quantities as results, using a custom string to be later searched and retrieved. This method works on the entire workflow and not on a single step.

    To access the results we need the specific pk of the workflow. From the verdi shell, you can use the get_results() method:

    >> load_workflow(PK_NUMBER).get_results()
    

    Here, PK_NUMBER is the pk number of the workflow whose results we want.

  • Killing a workflow

    A user can also kill a workflow while it is running. This can be done with the following verdi command:

    >> verdi workflow kill PK_NUMBER_1 PK_NUMBER_2 PK_NUMBER_N
    

    where several pk numbers can be given. A prompt will ask for a confirmation; this can be avoided by using the -f option.

    An alternative way to kill an individual workflow is to use the kill method. In the verdi shell type:

    >> load_workflow(PK_NUMBER).kill()
    
Exercise

In this exercise you have to write a workflow for the addition of six numbers, using the workflow we just wrote as a sub-workflow.

For this workflow use:

  • Input parameters:
    params = {"w1": {"a": 2, "b": 2.1, "c": 1}, "w2": {"a": 2, "b": 2.1, "c": 4}}
  • start step:
    Use two sub workflows (the ones developed above) for the addition of three numbers:
    • Sub workflow with input w1 and calculate its sum (temp_result1)
    • Sub workflow with input w2 and calculate its sum (temp_result2)
  • stage2 step:
    final_result = temp_result1 + temp_result2. Add final_result to the workflow results and exit the workflow.

Some notes and tips:

  • You can attach a sub-workflow similarly to how you attach a calculation: in the step, create the new sub-workflow, set its parameters, store it, call its start() method, and then call self.attach_workflow(wobject) to attach it to the current step (a short sketch combining these tips follows the list).
  • If you want to pass intermediate data from one step to another, you can set the data as a workflow attribute: in a step, call self.add_attribute(attr_name, attr_value), and retrieve it in another step using attr_value = self.get_attribute(attr_name). Values can be any JSON-serializable value, or an AiiDA node.
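
A hedged sketch of what the start step of such a workflow could look like, following the tips above (AddnumbersWorkflow is the class written earlier, and the parameter keys w1 and w2 follow the exercise input; here the parameters are passed directly to the constructor, as in the launch example above, but set_parameters would work as well):

    @Workflow.step
    def start(self):
        params = self.get_parameters()
        for key in ['w1', 'w2']:
            # create the sub-workflow, store it, start it and attach it to this step
            wobject = AddnumbersWorkflow(params=params[key])
            wobject.store()
            wobject.start()
            self.attach_workflow(wobject)
        self.next(self.stage2)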

Verdi command line plugins

AiiDA can be extended with custom ways of interacting with it via the command line, by extending the verdi commands.

We will describe in particular how to extend verdi data by adding a new subcommand.

Framework for verdi data

The code for each of the verdi data <datatype> <action> [--format <plugin>] commands is placed in the _<Datatype> class inside aiida.cmdline.commands.data.py. Standard actions, such as

  • list
  • show
  • import
  • export

are implemented in corresponding classes (e.g. Listable for list and Visualizable for show), which are inherited by the _<Datatype> classes (multiple inheritance is possible). The actions show, import and export can be extended with new format plugins simply by adding additional methods to _<Datatype> (these are automatically detected). The action list can be extended by overriding the default methods of Listable.

Adding plugins for show, import, export and like

A plugin to show, import or export the data node can be added by inserting a method into the _<Datatype> class. Each new method is automatically detected, provided its name starts with _<action>_ (that is, _show_ for show, _import_ for import and _export_ for export). The node(s) to act on are passed to each such method as a parameter.

Note

plugins for show are passed a list of nodes, while plugins for import and export are passed a single node.

As the --format option is optional, the default plugin can be specified by setting the value of _default_<action>_format in the inheriting class, for example:

class _Parameter(VerdiCommandWithSubcommands, Visualizable):
    """
    View and manipulate Parameter data classes.
    """

    def __init__(self):
        """
        A dictionary with valid commands and functions to be called.
        """
        from aiida.orm.data.parameter import ParameterData
        self.dataclass = ParameterData
        self._default_show_format = 'json_date'
        self.valid_subcommands = {
            'show': (self.show, self.complete_visualizers),
            }

    def _show_json_date(self, exec_name, node_list):
        """
        Show contents of ParameterData nodes.
        """

If the default plugin is not defined and there is more than one plugin, an exception will be raised upon issuing verdi data <datatype> <action>; this exception is caught and explained to the user.

Plugin-specific command line options

Plugin-specific command line options can be appended in plugin-specific methods _<action>_<plugin>_parameters(self,parser). All these methods are called before parsing command line arguments, and are passed an argparse.ArgumentParser instance, to which command line argument descriptions can be appended using parser.add_argument(). For example:

def _show_jmol_parameters(self, parser):
    """
    Describe command line parameters.
    """
    parser.add_argument('--step',
                        help="ID of the trajectory step. If none is "
                             "supplied, all steps are exported.",
                        type=int, action='store')

Note

since all the _<action>_<plugin>_parameters(self, parser) methods are called, some attention is required in order not to create conflicting command line argument names!

Note

it’s a good practice to set default=None for all command line arguments, since None-valued arguments are excluded before passing the parsed argument dictionary to a desired plugin.

Implementing list

Since the listing of data nodes can be extended with filters controlled by command line parameters, the code of Listable is split into a few separate methods that can be individually overridden (a short sketch follows the list):

  • list:
    the main method, parsing the command line arguments and printing the data node information to the standard output;
  • query:
    takes the parsed command line arguments and performs a query on the database, returns table of unformatted strings, representing the hits;
  • append_list_cmdline_arguments:
    informs the command line argument parser about additional, user-defined parameters, used to control the query function;
  • get_column_names:
    returns the names of columns to be printed by list method.
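
A hedged sketch of overriding two of these methods for a hypothetical FloatData listing (the extra --above filter and the column names are made up; the method names are the ones listed above):

def append_list_cmdline_arguments(self, parser):
    # add a user-defined filter on top of the default list options
    parser.add_argument('--above', type=float, default=None,
                        help="Show only values larger than this number.")

def get_column_names(self):
    # headers for the columns printed by the list method
    return ['ID', 'Value']
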
Adding a verdi command

Here we will add a new verdi command for the FloatData datatype created and used in the Developer code plugin tutorial exercise.

The new command will be:

>> verdi data float show <pk>

To create the above verdi command, we will write a _Float class inheriting from both the VerdiCommandWithSubcommands and Visualizable classes; this class will be added inside the aiida.cmdline.commands.data.py file. By inheriting from Visualizable, our class will have a show() method that we can use as the action for verdi data float show:

class _Float(VerdiCommandWithSubcommands, Visualizable):
    """
    View and manipulate Float data classes.
    """

    def __init__(self):
        """
        A dictionary with valid commands and functions to be called.
        """
        from aiida.orm.data.float import FloatData

        self.dataclass = FloatData
        self.valid_subcommands = {
            'show': (self.show, self.complete_none),
        }
        self._default_show_format = 'simple'

The features used in the __init__ method are:

  • self.dataclass: the data type for which the command is written. In this example it is FloatData.
  • self.valid_subcommands: the dictionary of valid subcommands and of the two functions to be called when the given command is invoked, or when bash completion is needed. Each key is a command for the defined data type. For FloatData we are therefore adding a show command, which will call the self.show() method inherited from the Visualizable base class. We pass self.complete_none as the completion function to disable further bash completion after the command (this method is defined in the VerdiCommandWithSubcommands base class). The self.show() method creates a list of all methods of the current class whose name has the prefix _show_, and provides them as possible formats.
  • self._default_show_format: the default format used by the show command when no specific format is passed as an argument. For FloatData, we will show data in a simple format by default. To display a node in the simple format, we will simply add a method called _show_simple() to the _Float class. Please note that the method name should follow the convention _show_ + format_name.

The _show_simple() method will be:

def _show_simple(self, exec_name, node_list):
    """
    Show contents of FloatData nodes.
    """
    for node in node_list:
        print node.value

This method receives the executable name and the list of nodes. To print FloatData in the simple format we just print the corresponding value on screen.

Once the _Float class is added, make sure to add an entry to the self.routed_subcommands dictionary in the __init__ method of the Data class in the aiida.cmdline.commands.data.py file, as shown below.

class Data(VerdiCommandRouter):
    """
    Setup and manage data specific types

    There is a list of subcommands for managing specific types of data.
    For instance, 'data upf' manages pseudopotentials in the UPF format.
    """

    def __init__(self):
        """
        A dictionary with valid commands and functions to be called.
        """
        ## Add here the classes to be supported.
        self.routed_subcommands = {
            .
            .
            # other entries
            'float': _Float,
        }

The new verdi command float is now ready!

Try experimenting by adding other formats for the show command, or by adding other commands like list, import and export for the FloatData data type.
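
For example, a hedged sketch of an additional show format for FloatData, following the _show_<format> naming convention described above (the format name json is just an example):

def _show_json(self, exec_name, node_list):
    """
    Show contents of FloatData nodes as JSON.
    """
    import json

    for node in node_list:
        print json.dumps({'pk': node.pk, 'value': node.value})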

Exporting structures to TCOD

Export of StructureData and CifData (or any other data type that can be converted to them) to the Theoretical Crystallography Open Database (TCOD) can be divided into the following workflow steps:

No. | Description                                                       | Input                   | Output        | Type   | Implemented?
0   | Conversion of the StructureData to CifData                       | StructureData           | CifData       | Inline | +
1   | Detection of the symmetry and reduction to the unit cell         | CifData                 | CifData       | Inline | +
2   | Niggli reduction of the unit cell                                | CifData                 | CifData       | Inline |
3   | Addition of structure properties (total energy, residual forces) | CifData, ParameterData  | CifData       | Inline | PW and CP
4   | Addition of the metadata for reproduction of the results         | CifData                 | CifData       | Inline | ~
5   | Deposition to the TCOD                                           | CifData                 | ParameterData | Job    | +

The type of each step’s calculation (InlineCalculation or JobCalculation) is given in the Type column. Each step is described in more detail below:

  • Conversion of the StructureData to CifData
    Conversion between the StructureData and CifData is done via ASE atoms object.
  • Detection of the symmetry and reduction to the unit cell
    Detection of the symmetry and reduction to the unit cell is performed using spglib.refine_cell() function.
  • Niggli reduction of the unit cell
    Reduction of the unit cell to the Niggli cell is a nice-to-have feature, as it would allow representing the structure with an unambiguously selected unit cell.
  • Addition of structure properties (energy, remaining forces)
    The structure properties from the calculations, such as the total energy and residual forces, can be extracted from ParameterData nodes and put into the related TCOD CIF dictionary tags using a calculation-specific parameter translator, derived from BaseTcodtranslator.
  • Addition of the metadata for reproduction of the results
    Current metadata, added for reproducibility, includes scripts for re-running the calculations, outputs from the calculations and an exported subset of the AiiDA database. It is not quite clear what/how to record as metadata for calculations of type InlineCalculation.
  • Deposition to the TCOD
    Deposition of the final CifData to the TCOD is performed using the cif_cod_deposit script from the codtools plugin.

GIT cheatsheet

Excellent and thorough documentation on how to use GIT can be found online in the official GIT documentation or by searching on Google. We summarize here only a set of commands that may be useful.

Interesting online resources
Set the push default behavior to push only the current branch

The default push behavior may not be what you expect: if a branch you are not working on changes, you may not be able to push your own branch, because git tries to check them all. To avoid this, use:

git config push.default upstream

to set the default push.default behaviour to push the current branch to its upstream branch. Note that the actual string to set depends on the version of git; newer versions allow you to use:

git config push.default simple

which is better; see also discussion on this stackoverflow page.

View commits that would be pushed

If you want to see which commits would be sent to the remote repository upon a git push command, you can use (e.g. if you want to compare with the origin/develop remote branch):

git log origin/develop..HEAD

to see the logs of the commits, or:

git diff origin/develop..HEAD

to see also the differences among the current HEAD and the version on origin/develop.

Switch to another branch

You can switch to another branch with:

git checkout newbranchname

and you can see the list of checked-out branches, and the one you are in, with:

git branch

(or git branch -a to see also the list of remote branches).

Associate a local and remote branch

To tell GIT to always push a local branch (checked-out) to a remote branch called remotebranchname, check out the correct local branch and then do:

git push --set-upstream origin remotebranchname

From now on, you will just need to run git push. This will create a new entry in .git/config similar to:

[branch "localbranchname"]
  remote = origin
  merge = refs/heads/remotebranchname
Branch renaming

To rename a branch locally, from oldname to newname:

git checkout oldname
git branch -m oldname newname

If you also want to rename it remotely, you have to create a new branch and then delete the old one. One way to do it is to first edit .git/config so that the branch points to the new remote name, changing refs/heads/oldname to refs/heads/newname in the correct section:

[branch "newname"]
  remote = origin
  merge = refs/heads/newname

Then, do a:

git push origin newname

to create the new branch, and finally delete the old one with:

git push origin :oldname

(notice the : symbol). Note that if you are working e.g. on GitHub, there may be a filter to disallow the deletion of branches (check in the repository settings, and then under “Branch management”). Moreover, the “Main branch” (set in the repository settings, under “Repository details”) cannot be deleted.

Create a new (lightweight) tag

If you want to create a new tag, e.g. for a new version, and you have checked out the commit that you want to tag, simply run:

git tag TAGNAME

(e.g., git tag v0.2.0). Afterwards, remember to push the tag to the remote repository (otherwise it will remain only local):

git push --tags
Create a new branch from a given tag

This will create a new newbranchname branch starting from tag v0.2.0:

git checkout -b newbranchname v0.2.0

Then, if you want to push the branch remotely and have git remember the association:

git push --set-upstream origin remotebranchname

(for the meaning of --set-upstream see the section Associate a local and remote branch above).

Disallow a branch deletion, or committing to a branch, on GitHub

You can find these settings in the repository settings of the web interface, and then under “Branches”.

Note

if you commit to a branch (locally) and then discover that you cannot push (e.g. you mistakenly committed to the master branch), you can remove your last commit using:

git reset --hard HEAD~1

(this removes only one commit, and assumes you have no local modifications; if you do have some, be careful not to lose them!)

Merge from a different repository

It is possible to open a pull request from a forked repository via the GitHub web interface. However, if you just want to keep in sync, e.g., the main AiiDA repository with a fork you are working on, without creating a pull request (e.g., for a daily merge of your fork’s develop branch into the main repo’s develop), you can:

  • commit and pull all your changes in your fork

  • from the GitHub web interface, sync your fork with the main repository, if needed

  • go in a local cloned version of the main repository

  • [only the first time] add a remote pointing to the new repository, with the name you prefer (here: myfork):

    git remote add myfork git@github.com:GITHUBUSER/FORKEDREPO.git
    
  • check out the branch you want to merge into (git checkout develop)

  • do a git pull (just in case)

  • Fetch the correct branch of the other repository (e.g., the develop branch):

    git fetch myfork develop
    

    (this will fetch that branch into a temporary location called FETCH_HEAD).

  • Merge the modifications:

    git merge FETCH_HEAD
    
  • Fix any merge conflicts (if any) and commit.

  • Finally, push the merged result into the main repository:

    git push
    

    (or, if you did not use the default remote with --set-upstream, specify the correct remote branch, e.g. git push origin develop).

Note

If you want to fetch and transfer also tags, use instead:

git fetch -t myfork develop
git merge FETCH_HEAD
git push --tags

to get the tags from myfork and then push them in the current repository.

Sphinx cheatsheet

A brief overview of some of the main functions of Sphinx as used in the AiiDA documentation. View This Page to see how this page was formatted. This is only a brief outline; for more, please see the Sphinx documentation.

Main Titles and Subtitles

This is an example of a main title.

subtitles are made like this

This is an example of a subtitle.

Formatting
Basic Paragraph Formatting

Words can be written in italics or in bold. Text describing a specific computer_thing can be formatted as well.

Paragraph and Indentation

Much like in regular python, the indentation plays a strong role in the formatting.

For example all of this sentence will appear on the same line.

While this sentence will appear
differently because there is an indent.
Terminal and Code Formatting

Something to be run in command line can be formatted like this:

>> Some command

As can be seen above, while snippets of python code can be formatted like this:

import module
print('hello world')
Notes

Note

Notes can be added like this.

Bullet Points and Lists
  • Bullet points can be added
  • Just like this
      • With sub-bullets like this
  1. While numerical bullets
  2. Can be added
  3. Like this
Table of Contents Docs and Code
Table of Contents for Docs

An example of the table of contents syntax for the GIT cheatsheet can be seen here; note that these are especially important in the global structure of the document, as found in index.rst files.

Note

The maxdepth parameter can be used to change how deep the title indexing goes. See This Page.

Table of Contents for Code

Tables of contents that cross-reference code can be written very similarly to those for documents. For example, the parser docs can be indexed like this.

ORM documentation: generic aiida.orm

This section describes the aiida object-relational mapping.

Some generic methods of the module aiida.orm.utils

class aiida.orm.utils.BackendDelegateWithDefault(backend)[source]

This class is a helper to implement the delegation pattern [1] by delegating functionality (i.e. calling through) to the backend class which will do the actual work.

[1] https://en.wikipedia.org/wiki/Delegation_pattern
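
A generic, minimal sketch of the delegation pattern referenced above (not AiiDA’s actual implementation; the class names are placeholders):

class Backend(object):
    """The delegate that does the actual work."""
    def do_work(self):
        return 'result from the backend'

class Frontend(object):
    """Delegates functionality by calling through to the backend."""
    def __init__(self, backend):
        self._backend = backend

    def do_work(self):
        return self._backend.do_work()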

aiida.orm.utils.CalculationFactory(module, from_abstract=False)[source]

Return a suitable JobCalculation subclass.

Parameters:
  • module – a valid string recognized as a Calculation plugin
  • from_abstract – A boolean. If False (default), actually look only to subclasses of JobCalculation, not to the base Calculation class. If True, check for valid strings for plugins of the Calculation base class.
aiida.orm.utils.DataFactory(module)[source]

Return a suitable Data subclass.

aiida.orm.utils.WorkflowFactory(module)[source]

Return a suitable Workflow subclass.
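
A short usage sketch of the factory functions (the plugin strings below are examples; the last one assumes the corresponding calculation plugin is installed):

from aiida.orm.utils import CalculationFactory, DataFactory

ParameterData = DataFactory('parameter')                  # aiida.orm.data.parameter.ParameterData
StructureData = DataFactory('structure')
PwCalculation = CalculationFactory('quantumespresso.pw')  # requires the Quantum ESPRESSO plugin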

aiida.orm.utils.create_node_id_qb(node_id=None, pk=None, uuid=None, parent_class=None, query_with_dashes=True)[source]

Returns the QueryBuilder instance set to retrieve AiiDA objects given their (parent)class and PK (in which case the object should be unique) or UUID or UUID starting pattern.

Parameters:
  • node_id – PK (integer) or UUID (string) or a node
  • pk – PK of a node
  • uuid – UUID of a node, or the beginning of the uuid
  • parent_class – if specified, looks only among objects that are instances of a subclass of parent_class, otherwise among nodes
  • query_with_dashes (bool) – relevant only if a uuid pattern is passed; allows the uuid to be put in the correct (dashed) form. Default=True
Returns:

a QueryBuilder instance

aiida.orm.utils.load_node(node_id=None, pk=None, uuid=None, parent_class=None, query_with_dashes=True)[source]

Returns an AiiDA node given its PK or UUID.

Parameters:
  • node_id – PK (integer) or UUID (string) or a node
  • pk – PK of a node
  • uuid – UUID of a node, or the beginning of the uuid
  • parent_class – if specified, checks whether the node loaded is a subclass of parent_class
  • query_with_dashes (bool) – relevant only if a uuid pattern is passed; allows the uuid to be put in the correct (dashed) form. Default=True
  • return_node (bool) – lets the function return the AiiDA node referred by the input. Default=False
Returns:

the required AiiDA node if existing, unique, and (sub)instance of parent_class

Raises:
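
A minimal usage sketch (the PK and UUID values are placeholders):

from aiida.orm.utils import load_node

n1 = load_node(1234)       # an integer is interpreted as a PK
n2 = load_node('20e7a')    # a string is interpreted as a UUID or its starting pattern
n3 = load_node(pk=1234)    # the PK can also be passed explicitly
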
aiida.orm.utils.load_workflow(wf_id=None, pk=None, uuid=None)[source]

Return an AiiDA workflow given PK or UUID.

Parameters:
  • wf_id – PK (integer) or UUID (string) or UUID instance or a workflow
  • pk – PK of a workflow
  • uuid – UUID of a workflow
Returns:

an AiiDA workflow

Raises:

ValueError – if none or more than one of the parameters is supplied, or if the type of wf_id is neither a string nor an integer

Computer
class aiida.orm.implementation.general.computer.AbstractComputer(**kwargs)[source]

Base class to map a node in the DB + its permanent repository counterpart.

Stores attributes starting with an underscore.

Caches files and attributes before the first save, and saves everything only on store(). After the call to store(), in general attributes cannot be changed, except for those listed in the self._updatable_attributes tuple (empty for this class, can be extended in a subclass).

Only after storing (or upon loading from uuid) metadata can be modified and in this case they are directly set on the db.

In the plugin, also set the _plugin_type_string, to be set in the DB in the ‘type’ field.

copy()[source]

Return a copy of the current object to work with, not stored yet.

full_text_info

Return a (multiline) string with human-readable, detailed information on this computer.

classmethod get(computer)[source]

Return a computer from its name (or from another Computer or DbComputer instance)

static get_db_columns()[source]

This method returns a list with the column names and types of the table corresponding to this class.

Returns:a list with the names of the columns

get_dbauthinfo(user)[source]

Return the aiida.backends.djsite.db.models.DbAuthInfo instance for the given user on this computer; an exception is raised if the computer is not configured for the given user.

Parameters:user – a DbUser instance.
Returns:a aiida.backends.djsite.db.models.DbAuthInfo instance
Raises:NotExistent – if the computer is not configured for the given user.
get_default_mpiprocs_per_machine()[source]

Return the default number of CPUs per machine (node) for this computer, or None if it was not set.

get_mpirun_command()[source]

Return the mpirun command. Must be a list of strings, that will be then joined with spaces when submitting.

A sensible default is also provided, which may be suitable in many cases.

id

Return the principal key in the DB.

is_user_configured(user)[source]

Return True if the computer is configured for the given user, False otherwise.

Parameters:user – a DbUser instance.
Returns:a boolean.
is_user_enabled(user)[source]

Return True if the computer is enabled for the given user (looking only at the per-user setting: the computer could still be globally disabled).

Note:Return False also if the user is not configured for the computer.
Parameters:user – a DbUser instance.
Returns:a boolean.
classmethod list_names()[source]

Return a list with all the names of the computers in the DB.

pk

Return the principal key in the DB.

set_default_mpiprocs_per_machine(def_cpus_per_machine)[source]

Set the default number of CPUs per machine (node) for this computer. Accepts None if you do not want to set this value.

set_mpirun_command(val)[source]

Set the mpirun command. It must be a list of strings (you can use string.split() if you have a single, space-separated string).
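
A hedged sketch of configuring an existing computer (the computer name, the mpirun command and the value below are placeholders):

from aiida.orm import Computer

computer = Computer.get('my_cluster')     # load an existing computer by name
computer.set_mpirun_command(['mpirun', '-np', '{tot_num_mpiprocs}'])
computer.set_default_mpiprocs_per_machine(16)
computer.store()                          # computers can be re-stored after changing properties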

store()[source]

Store the computer in the DB.

Unlike Nodes, a computer can be re-stored if its properties need to be changed (e.g. a new mpirun command, etc.)

uuid

Return the UUID in the DB.

validate()[source]

Check if the attributes and files retrieved from the DB are valid. Raise a ValidationError if something is wrong.

Must be able to work even before storing: therefore, use the get_attr and similar methods that automatically read either from the DB or from the internal attribute cache.

For the base class, this is always valid. Subclasses will reimplement this. In the subclass, always call the super().validate() method first!

Node
class aiida.orm.implementation.general.node.AbstractNode(**kwargs)[source]

Base class to map a node in the DB + its permanent repository counterpart.

Stores attributes starting with an underscore.

Caches files and attributes before the first save, and saves everything only on store(). After the call to store(), in general attributes cannot be changed, except for those listed in the self._updatable_attributes tuple (empty for this class, can be extended in a subclass).

Only after storing (or upon loading from uuid) extras can be modified and in this case they are directly set on the db.

In the plugin, also set the _plugin_type_string, to be set in the DB in the ‘type’ field.

__del__()[source]

Called only upon real object destruction from memory. It just tries to remove junk whenever possible; do not rely too much on this function!

__init__(**kwargs)[source]

Initialize the object Node.

Parameters:uuid – if present, the Node with given uuid is loaded from the database. (It is not possible to assign a uuid to a new Node.)
class __metaclass__[source]

Some python black magic to set correctly the logger also in subclasses.

__weakref__

list of weak references to the object (if defined)

Add a link in the cache.

Add a link to the current node from the ‘src’ node. Both nodes must be Node instances (or subclasses of Node).

Note:

this function should not be called directly; it acts directly on the database.

Parameters:
  • src – the source object
  • label (str) – the name of the label to set the link from src. Default = None.
_check_are_parents_stored()[source]

Check if all parents are already stored, otherwise raise.

Raises:ModificationNotAllowed – if one of the input nodes is not already stored.
_db_attrs()[source]

Returns the keys of the attributes as a generator, directly from the DB.

DO NOT USE DIRECTLY.

_db_iterattrs()[source]

Iterator over the attributes (directly in the DB!)

DO NOT USE DIRECTLY.

_del_all_attrs()[source]

Delete all attributes associated to this node.

Raises:ModificationNotAllowed – if the Node was already stored.
_del_attr(key)[source]

Delete an attribute.

Parameters:

key – attribute to delete.

Raises:
_del_db_attr(key)[source]

Delete an attribute directly from the DB

DO NOT USE DIRECTLY.

Parameters:key – The key of the attribute to delete
_del_db_extra(key)[source]

Delete an extra, directly on the DB.

DO NOT USE DIRECTLY.

Parameters:key (str) – key name
_get_db_attr(key)[source]

Return the attribute value, directly from the DB.

DO NOT USE DIRECTLY.

Parameters:key – the attribute key
Returns:the attribute value
Raises:AttributeError – if the attribute does not exist.
_get_db_extra(key)[source]

Get an extra, directly from the DB.

DO NOT USE DIRECTLY.

Parameters:key (str) – key name
Returns:the key value
Raises:AttributeError – if the key does not exist

Return a list of tuples (label, aiida_class) for each input link, possibly filtering only by those of a given type.

Parameters:link_type – if not None, a link type to filter results
Returns:a list of tuples (label, aiida_class)

Return a list of tuples (label, aiida_class) for each output link, possibly filtering only by those of a given type.

Parameters:link_type – if not None, a link type to filter results
Returns:a list of tuples (label, aiida_class)
_get_dbcomments(pk=None)[source]

Return a sorted list of DbComment associated with the Node.

Parameters:pk – integer or list of integers. If it is specified, returns the comment values with desired pks. (pk refers to DbComment.pk)
Returns:the list of DbComment, sorted by pk.
_get_folder_pathsubfolder

Get the subfolder in the repository.

Returns:a Folder object.
_get_temp_folder()[source]

Get the folder of the Node in the temporary repository.

Returns:a SandboxFolder object mapping the node in the repository.

Return True if there is at least one cached (input) link, that is a link that is not stored yet in the database. False otherwise.

_increment_version_number_db()[source]

This function increments the version number in the DB. This should be called every time you need to increment the version (e.g. when adding an extra or attribute).

Note:Do not manually increment the version number, because if two different threads are adding/changing an attribute concurrently, the version number would be incremented only once.
_init_internal_params()[source]

Set here the default values for this class; this method is automatically called by the init.

Note:if you inherit this function, ALWAYS remember to call super()._init_internal_params() as the first thing in your inherited function.
_linking_as_output(dest, link_type)[source]

Raise a ValueError if a link from self to dest is not allowed. Implement in subclasses.

Parameters:dest – the destination output Node
Returns:a boolean (True)
_remove_comment(comment_pk, user)[source]

Function called by verdi comment remove

Remove from the DB the input link with the given label.

Note:

this function should not be called directly; it acts directly on the database.

Note:

No checks are done to verify that the link actually exists.

Parameters:
  • label (str) – the label of the link from src to the current Node
  • link_type – The type of link, must be one of the enum values from LinkType

Remove from the DB the input link with the given label.

Note:

In subclasses, change only this. Moreover, remember to call the super() method in order to properly use the caching logic!

Note:

No error is raised if the link does not exist.

Parameters:
  • label (str) – the name of the label to set the link from src.
  • link_type – The type of link, must be one of the enum values from LinkType

Replace an input link with the given label and type, or simply creates it if it does not exist.

Note:

this function should not be called directly; it acts directly on the database.

Parameters:
  • src (str) – the source object.
  • label (str) – the label of the link from src to the current Node
  • link_type – The type of link, must be one of the enum values from LinkType

Replace an input link with the given label, or simply creates it if it does not exist.

Note:

In subclasses, change only this. Moreover, remember to call the super() method in order to properly use the caching logic!

Parameters:
  • src – the source object
  • label (str) – the name of the label to set the link from src.
_repository_folder

Get the permanent repository folder. Use preferentially the folder property.

Returns:the permanent RepositoryFolder object
_reset_db_extras(new_extras)[source]

Resets the extras (replacing existing ones) directly in the DB

DO NOT USE DIRECTLY!

Parameters:new_extras – dictionary with new extras
_set_attr(key, value)[source]

Set a new attribute to the Node (in the DbAttribute table).

Parameters:
  • key (str) – key name
  • value – its value
Raises:
  • ModificationNotAllowed – if such attribute cannot be added (e.g. because the node was already stored, and the attribute is not listed as updatable).
  • ValidationError – if the key is not valid (e.g. it contains the separator symbol).
_set_db_attr(key, value)[source]

Set the value directly in the DB, without checking if it is stored, or using the cache.

DO NOT USE DIRECTLY.

Parameters:
  • key (str) – key name
  • value – its value
_set_db_computer(computer)[source]

Set the computer directly inside the dbnode member, in the DB.

DO NOT USE DIRECTLY.

Parameters:computer – the computer object
_set_db_extra(key, value, exclusive)[source]

Store extra directly in the DB, without checks.

DO NOT USE DIRECTLY.

Parameters:
  • key – key name
  • value – key value
  • exclusive – (default=False). If exclusive is True, it raises a UniquenessError if an Extra with the same name already exists in the DB (useful e.g. to “lock” a node and avoid running the same computation on it multiple times).
_set_defaults

Default values to set in the __init__, if no value is explicitly provided for the given key. It is a dictionary, with k=v; if the key k is not provided to the __init__, and a value is present here, this is set.

_set_internal(arguments, allow_hidden=False)[source]

Works as self.set(), but takes a dictionary as the ‘arguments’ variable, instead of reading it from the kwargs; moreover, it allows allow_hidden to be set to True. In this case, if a key starts with an underscore, as for instance _state, it will not call the function set__state but rather _set_state.

_set_with_defaults(**kwargs)[source]

Calls the set() method, but also adds the class-defined default values (defined in the self._set_defaults attribute), if they are not provided by the user.

Note:for the default values, also allow to define ‘hidden’ methods, meaning that if a default value has a key “_state”, it will not call the function “set__state” but rather “_set_state”. This is not allowed, instead, for the standard set() method.

Store all input links that are in the local cache, transferring them to the DB.

Note:This can be called only if all parents are already stored.
Note:Links are stored only after the input nodes are stored. Moreover, link storage is done in a transaction, and if one of the links cannot be stored, an exception is raised and all links will remain in the cache.
Note:This function can be called only after the node is stored. After that, it can be called multiple times, and nothing will be executed if no links are still in the cache.
Parameters:with_transaction – if False, no transaction is used. This is meant to be used ONLY if the outer calling function has already a transaction open!
_store_input_nodes()[source]

Find all input nodes, and store them, checking that they do not have unstored inputs in turn.

Note:this function stores all nodes without transactions; always call it from within a transaction!
_update_comment(new_field, comment_pk, user)[source]

Function called by verdi comment update

_update_db_description_field(field_value)[source]

Update the description of this node, acting directly at the DB level

_update_db_label_field(field_value)[source]

Update the label field acting directly on the DB

_validate()[source]

Check if the attributes and files retrieved from the DB are valid. Raise a ValidationError if something is wrong.

Must be able to work even before storing: therefore, use the get_attr and similar methods that automatically read either from the DB or from the internal attribute cache.

For the base class, this is always valid. Subclasses will reimplement this. In the subclass, always call the super()._validate() method first!

add_comment(content, user=None)[source]

Add a new comment.

Parameters:content – string with comment

Add a link to the current node from the ‘src’ node. Both nodes must be a Node instance (or a subclass of Node) :note: In subclasses, change only this. Moreover, remember to call the super() method in order to properly use the caching logic!

Parameters:
  • src – the source object
  • label (str) – the name of the label to set the link from src. Default = None.
  • link_type – The type of link, must be one of the enum values from LinkType
add_path(src_abs, dst_path)[source]

Copy a file or folder from a local path into the repository directory. If there is a subpath, folders will be created.

Copy to a cache directory if the entry has not been saved yet.

Parameters:
  • src_abs (str) – the absolute path of the file to copy.
  • dst_filename (str) – the (relative) path on which to copy.
Todo:

in the future, add an add_attachment() that has the same meaning of a extras file. Decide also how to store. If in two separate subfolders, remember to reset the limit.

attrs()[source]

Returns the keys of the attributes as a generator.

Returns:a generator of strings
copy()[source]

Return a copy of the current object to work with, not stored yet.

This is a completely new entry in the DB, with its own UUID. Works both on stored instances and with not-stored ones.

Copies files and attributes, but not the extras. Does not store the Node to allow modification of attributes.

Returns:an object copy
ctime

Return the creation time of the node.

dbnode
Returns:the corresponding DbNode object.
del_extra(key)[source]

Delete an extra, acting directly on the DB! The action is immediately performed on the DB. Since extras can be added only after storing the node, this function is meaningful to be called only after the .store() method.

Parameters:key (str) – key name
Raise:AttributeError: if key starts with underscore
Raise:ModificationNotAllowed: if the node is not stored yet
description

Get the description of the node.

Returns:a string
extras()[source]

Get the keys of the extras.

Returns:a list of strings
folder

Get the folder associated with the node, whether it is in the temporary or the permanent repository.

Returns:the RepositoryFolder object.
get_abs_path(path=None, section=None)[source]

Get the absolute path to the folder associated with the Node in the AiiDA repository.

Parameters:
  • path (str) – the name of the subfolder inside the section. If None returns the abspath of the folder. Default = None.
  • section – the name of the subfolder (‘path’ by default).
Returns:

a string with the absolute path

For the moment works only for one kind of files, ‘path’ (internal files)

get_attr(key, default=())[source]

Get the attribute.

Parameters:
  • key – name of the attribute
  • default – if no attribute key is found, returns default
Returns:

attribute value

Raises:

AttributeError – If no attribute is found and there is no default

get_attrs()[source]

Return a dictionary with all attributes of this node.
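
A minimal sketch of working with attributes on an unstored node (the attribute names and values are placeholders, and a bare Node is used only for illustration):

from aiida.orm.node import Node

node = Node()                         # a bare, unstored node
node._set_attr('energy', -1.5)        # attributes can still be changed while unstored
print(node.get_attr('energy'))        # -1.5
print(node.get_attrs())               # {'energy': -1.5}
for key, value in node.iterattrs():
    print('%s = %s' % (key, value))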

get_comments(pk=None)[source]

Return a sorted list of comment values, one for each comment associated to the node.

Parameters:pk – integer or list of integers. If it is specified, returns the comment values with desired pks. (pk refers to DbComment.pk)
Returns:the list of comments, sorted by pk; each element of the list is a dictionary, containing (pk, email, ctime, mtime, content)
get_computer()[source]

Get the computer associated to the node.

Returns:the Computer object or None.
static get_db_columns()[source]

This method returns a list with the column names and types of the table corresponding to this class.

Returns:a list with the names of the columns

get_desc()[source]

Returns a string with information retrieved from the node’s properties. This method is typically overridden by the inheriting classes.

Returns:a description string
get_extra(key, *args)[source]

Get the value of an extra, reading directly from the DB! Since extras can be added only after storing the node, this function is meaningful to be called only after the .store() method.

Parameters:
  • key – key name
  • value – if no attribute key is found, returns value
Returns:

the key value

Raises:

ValueError – If more than two arguments are passed to get_extra

get_extras()[source]

Get the value of extras, reading directly from the DB! Since extras can be added only after storing the node, this function is meaningful to be called only after the .store() method.

Returns:the dictionary of extras ({} if no extras)
get_folder_list(subfolder='.')[source]

Get the list of files/directories in the repository of the object.

Parameters:subfolder – get the list of a subfolder
Returns:a list of strings.
get_inputs(node_type=None, also_labels=False, only_in_db=False, link_type=None)[source]

Return a list of nodes that enter (directly) in this node

Parameters:
  • node_type – If specified, should be a class, and it filters only elements of that specific type (or a subclass of ‘type’)
  • also_labels – If False (default) only return a list of input nodes. If True, return a list of tuples, where each tuple has the following format: (‘label’, Node), with ‘label’ the link label, and Node a Node instance or subclass
  • only_in_db – Return only the inputs that are in the database, ignoring those that are in the local cache. Otherwise, return all links.
  • link_type – Only get inputs of this link type, if None then returns all inputs of all link types.
get_inputs_dict(only_in_db=False, link_type=None)[source]

Return a dictionary where the key is the label of the input link, and the value is the input node.

Parameters:
  • only_in_db – If true only get stored links, not cached
  • link_type – Only get inputs of this link type, if None then returns all inputs of all link types.
Returns:

a dictionary {label:object}

get_outputs(type=None, also_labels=False, link_type=None)[source]

Return a list of nodes that exit (directly) from this node

Parameters:
  • type – if specified, should be a class, and it filters only elements of that specific type (or a subclass of ‘type’)
  • also_labels – if False (default) only return a list of input nodes. If True, return a list of tuples, where each tuple has the following format: (‘label’, Node), with ‘label’ the link label, and Node a Node instance or subclass
  • link_type – Only return outputs connected by links of this type.
get_outputs_dict(link_type=None)[source]

Return a dictionary where the key is the label of the output link, and the value is the output node. As some Nodes (Data nodes in particular) can have more than one output with the same label, all keys have the name of the link with the pk of the output node appended. The key without the appended pk corresponds to the oldest node.

Returns:a dictionary {linkname:object}
classmethod get_subclass_from_pk(pk)[source]

Get a node object from the pk (the integer primary key used in this database), loading the proper subclass of Node where appropriate.

Parameters:pk – the pk of the object to be loaded.
Returns:the object of the proper subclass.
Raise:NotExistent: if there is no entry of the desired object kind with the given pk.
classmethod get_subclass_from_uuid(uuid)[source]

Get a node object from the uuid, with the proper subclass of Node. (if Node(uuid=...) is called, only the Node class is loaded).

Parameters:uuid – a string with the uuid of the object to be loaded.
Returns:the object of the proper subclass.
Raise:NotExistent: if there is no entry of the desired object kind with the given uuid.
get_user()[source]

Get the user.

Returns:a Django DbUser model object
has_children

Property to understand if children are attached to the node.

Returns:a boolean

has_parents

Property to understand if parents are attached to the node.

Returns:a boolean

id
Returns:the principal key (the ID) as an integer, or None if the node was not stored yet
inp

Traverse the graph of the database. Returns a database object linked to the current node by means of the link name. Example: B = A.inp.parameters returns the object B that has a link to A with link name ‘parameters’; C = A.inp returns an InputManager, an object that is meant to be accessed as in the previous example.

is_stored

Return True if the node is stored, False otherwise.

iterattrs()[source]

Iterator over the attributes, returning tuples (key, value)

iterextras()[source]

Iterator over the extras, returning tuples (key, value)

Todo:verify that I am not creating a list internally
label

Get the label of the node.

Returns:a string.
logger

Get the logger of the Node object.

Returns:Logger object
mtime

Return the modification time of the node.

out

Traverse the graph of the database. Returns a database object linked to the current node by means of the link name. Example: B = A.out.results returns the object B, with a link from A to B, with link name ‘results’.
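
A hedged sketch of graph traversal with the inp/out managers (the PK and the link labels ‘structure’ and ‘output_parameters’ are placeholders):

from aiida.orm.utils import load_node

calc = load_node(1234)                    # assumed to be a calculation node
structure = calc.inp.structure            # input node linked with label 'structure'
params = calc.out.output_parameters       # output node linked with label 'output_parameters'
print(calc.get_inputs_dict().keys())      # all input link labels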

pk
Returns:the principal key (the ID) as an integer, or None if the node was not stored yet
classmethod query(*args, **kwargs)[source]

Map to the aiidaobjects manager of the DbNode, that returns Node objects (or their subclasses) instead of DbNode entities.

TODO: VERY IMPORTANT: the recognition of a subclass from the type does not work if the modules defining the subclasses are not put in subfolders. In the future, fix it either to make a cache and to store the full dependency tree, or save also the path.

querybuild(*args, **kwargs)[source]

Instantiates and returns a QueryBuilder instance.

The QueryBuilder’s path has one vertex so far, namely this class. Additional parameters (e.g. filters or a label) can be passed as keyword arguments.

Parameters:
  • label – Label to give
  • filters – filters to apply
  • project – projections

This class is a comboclass (see combomethod()) therefore the method can be called as class or instance method. If called as an instance method, adds a filter on the id.
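
A minimal sketch of the class-method usage described above (the filter is a placeholder):

from aiida.orm.node import Node

qb = Node.querybuild(filters={'id': {'>': 100}})   # QueryBuilder seeded with the Node class
for (node,) in qb.iterall():
    print(node.pk)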

remove_path(path)[source]

Remove a file or directory from the repository directory. Can be called only before storing.

Parameters:path (str) – relative path to file/directory.
reset_extras(new_extras)[source]

Deletes existing extras and creates new ones.

Parameters:new_extras – dictionary with new extras
Returns:nothing; an exception is raised in several circumstances

set(**kwargs)[source]

For each k=v pair passed as kwargs, call the corresponding set_k(v) method (e.g., calling self.set(property=5, mass=2) will call self.set_property(5) and self.set_mass(2)). Useful especially in the __init__.

Note:it uses the _set_incompatibilities list of the class to check that we are not setting methods that cannot be set at the same time. _set_incompatibilities must be a list of tuples, and each tuple specifies the elements that cannot be set at the same time. For instance, if _set_incompatibilities = [(‘property’, ‘mass’)], then the call self.set(property=5, mass=2) will raise a ValueError. If a tuple has more than two values, it raises ValueError if all keys are provided at the same time, but it does not give any error if at least one of the keys is not present.
Note:If one element of _set_incompatibilities is a tuple with only one element, this element will not be settable using this function (and in particular,
Raises:ValueError – if the corresponding set_k method does not exist in self, or if the methods cannot be set at the same time.
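
As an illustration of the dispatch described above (a sketch only: set_property and set_mass are the docstring’s hypothetical setters, not real Node methods):

# calling
node.set(property=5, mass=2)
# is equivalent to calling
node.set_property(5)
node.set_mass(2)
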
set_computer(computer)[source]

Set the computer to be used by the node.

Note that the computer makes sense only for some nodes: Calculation, RemoteData, ...

Parameters:computer – the computer object
set_extra(key, value, exclusive=False)[source]

Sets an extra on the node, immediately in the DB: no .store() needs to be called. It can be used only after the node has been stored.

Parameters:
  • key – key name
  • value – key value
  • exclusive – (default=False). If exclusive is True, it raises a UniquenessError if an Extra with the same name already exists in the DB (useful e.g. to “lock” a node and avoid running the same computation on it multiple times).
Raises:

UniquenessError – if extra already exists and exclusive is True.

set_extra_exclusive(key, value)[source]

Set an extra in exclusive mode (stops if the extra is already there). Deprecated: use set_extra() with exclusive=True.

Parameters:
  • key – key name
  • value – key value
set_extras(the_dict)[source]

Immediately sets several extras of a calculation, in the DB! No .store() to be called. Can be used only after saving.

Parameters:the_dict – a dictionary of key:value to be set as extras
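
A minimal sketch of setting and reading extras (the keys and values are placeholders; extras require a stored node):

from aiida.orm.node import Node

node = Node()
node.store()                                  # extras can only be set on stored nodes
node.set_extra('project', 'test_run')
node.set_extras({'temperature': 300, 'converged': True})
print(node.get_extra('project'))              # 'test_run'
print(node.get_extras())                      # all extras as a dictionary
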
store(with_transaction=True)[source]

Store a new node in the DB, also saving its repository directory and attributes.

After being called attributes cannot be changed anymore! Instead, extras can be changed only AFTER calling this store() function.

Note:After successful storage, those links that are in the cache, and for which also the parent node is already stored, will be automatically stored. The others will remain unstored.
Parameters:with_transaction – if False, no transaction is used. This is meant to be used ONLY if the outer calling function has already a transaction open!
store_all(with_transaction=True)[source]

Store the node, together with all input links, if cached, and also the linked nodes, if they were not stored yet.

Parameters:with_transaction – if False, no transaction is used. This is meant to be used ONLY if the outer calling function has already a transaction open!
uuid
Returns:a string with the uuid
class aiida.orm.implementation.general.node.AttributeManager(node)[source]

An object used internally to return the attributes as a dictionary.

Note:Important! It cannot be used to change variables, just to read them. To change values (of unstored nodes), use the proper Node methods.
__dir__()[source]

Allow to list the keys of the dictionary

__getattr__(name)[source]

Interface to get to dictionary values, using the key as an attribute.

Note:it works only for attributes that only contain letters, numbers and underscores, and do not start with a number.
Parameters:name – name of the key whose value is required.
__getitem__(name)[source]

Interface to get to dictionary values as a dictionary.

Parameters:name – name of the key whose value is required.
__init__(node)[source]
Parameters:node – the node object.
__iter__()[source]

Return the keys as an iterator

__weakref__

list of weak references to the object (if defined)

_get_dict()[source]

Return the internal dictionary

class aiida.orm.implementation.general.node.NodeInputManager(node)[source]

To document

__dir__()[source]

Allow to list all valid input links

__getattr__(name)[source]
Parameters:name – name of the attribute to be asked to the parser results.
__getitem__(name)[source]

interface to get to the parser results as a dictionary.

Parameters:name – name of the attribute to be asked to the parser results.
__init__(node)[source]
Parameters:node – the node object.
__weakref__

list of weak references to the object (if defined)

class aiida.orm.implementation.general.node.NodeOutputManager(node)[source]

To document

__dir__()[source]

Allow to list all valid output links

__getattr__(name)[source]
Parameters:name – name of the attribute to be asked to the parser results.
__getitem__(name)[source]

interface to get to the parser results as a dictionary.

Parameters:name – name of the attribute to be asked to the parser results.
__init__(node)[source]
Parameters:node – the node object.
__weakref__

list of weak references to the object (if defined)

aiida.orm.implementation.general.node.clean_value(value)[source]

Get value from input and (recursively) replace, if needed, all occurrences of BaseType AiiDA data nodes with their value, and List with a standard list.

It also makes a deep copy of everything.

Note however that there is no logic to avoid infinite loops when the user passes some perverse recursive dictionary or list. In any case, however, this would not be storable by AiiDA...

Parameters:value – A value to be set as an attribute or an extra
Returns:a “cleaned” value, potentially identical to value, but with values replaced where needed.
class aiida.orm.node.Node(**kwargs)[source]
Workflow
class aiida.orm.implementation.general.workflow.AbstractWorkflow(**kwargs)[source]

Base class to represent a workflow. This is the superclass of any workflow implementations, and provides all the methods necessary to interact with the database.

The typical use case is workflows stored in the aiida.workflows package, which are initiated either by the user in the shell or by some script, and which are monitored by the AiiDA daemon.

Workflows can have steps, and each step must contain some calculations to be executed. At the end of the step’s calculations the workflow is reloaded in memory and the next method is called.

add_attribute(_name, _value)[source]

Add one attribute to the Workflow. If another attribute is present with the same name it will be overwritten.

Parameters:
  • name – a string with the attribute name to store
  • value – a storable object to store

add_attributes(_params)[source]

Add a set of attributes to the Workflow. If another attribute is present with the same name it will be overwritten.

Parameters:
  • name – a string with the attribute name to store
  • value – a storable object to store

add_path(src_abs, dst_path)[source]

Copy a file or folder from a local path into the repository directory. If there is a subpath, folders will be created.

Copy to a cache directory if the entry has not been saved yet.

Parameters:
  • src_abs – the absolute path of the file to copy.
  • dst_filename – the (relative) path on which to copy.

add_result(_name, _value)[source]

Add one result to the Workflow. If another result is present with the same name it will be overwritten.

Parameters:
  • name – a string with the result name to store
  • value – a storable object to store

add_results(_params)[source]

Add a set of results to the Workflow. If another result is present with the same name it will be overwritten.

Parameters:
  • name – a string with the result name to store
  • value – a storable object to store

append_to_report(text)[source]

Adds text to the Workflow report.

Note:previously, when the workflow was a subworkflow of another Workflow, this method would call the parent append_to_report method; this is no longer the case.
attach_calculation(calc)[source]

Adds a calculation to the caller step in the database. This is a lazy call: no calculation will be launched until the next method gets called. For a step to be completed, all the linked calculations have to be in the RETRIEVED state, after which the next method gets called from the workflow manager.

Parameters:calc – a JobCalculation object
Raise:AiidaException: in case the input is not of JobCalculation type

attach_workflow(sub_wf)[source]

Adds a workflow to the caller step in the database. This is a lazy call: no workflow will be started until the next method gets called. For a step to be completed, all the linked workflows have to be in the FINISHED state, after which the next method gets called from the workflow manager.

Parameters:sub_wf – a Workflow object

clear_report()[source]

Wipe the Workflow report. In case the workflow is a subworkflow of any other Workflow, this method calls the parent clear_report method.

current_folder

Get the current repository folder, whether the temporary or the permanent.

Returns:the RepositoryFolder object.
dbworkflowinstance

Get the DbWorkflow object stored in the super class.

Returns:DbWorkflow object from the database
description

Get the description of the workflow.

Returns:a string
exit()[source]

This is the method to call as the next step to finish the Workflow. When exit is the next method and no errors are found, the Workflow is set to FINISHED and removed from the execution manager’s duties.

get_abs_path(path, section=None)[source]

TODO: For the moment works only for one kind of files, ‘path’ (internal files)

get_all_calcs(calc_class=<class 'aiida.orm.implementation.django.calculation.job.JobCalculation'>, calc_state=None, depth=15)[source]

Get all calculations connected with this workflow and all its subworkflows, up to a given depth. The list of calculations can be restricted to a given calculation type and state.

Parameters:
  • calc_class – the calculation class to which the calculations should belong (default: JobCalculation)
  • calc_state – a specific state to filter the calculations to retrieve
  • depth – the maximum depth level the recursion on sub-workflows will try to reach (0 means we stay at the step level and don’t go into sub-workflows, 1 means we go down to one step level of the sub-workflows, etc.)
Returns:

a list of JobCalculation objects

get_attribute(_name)[source]

Get one Workflow attribute.

Parameters:name – a string with the attribute name to retrieve
Returns:a dictionary of storable objects

get_attributes()[source]

Get the Workflow attributes.

Returns:a dictionary of storable objects

get_folder_list(subfolder='.')[source]

Get the list of files/directories in the repository of the object.

Parameters:subfolder – get the list of a subfolder
Returns:a list of strings.
get_parameter(_name)[source]

Get one Workflow parameter.

Parameters:name – a string with the parameter name to retrieve
Returns:a dictionary of storable objects

get_parameters()[source]

Get the Workflow parameters.

Returns:a dictionary of storable objects

get_report()[source]

Return the Workflow report.

Note:previously, when the workflow was a subworkflow of another Workflow, this method would call the parent get_report method; this is no longer the case.
Returns:a list of strings
get_result(_name)[source]

Get one Workflow result.

Parameters:name – a string with the result name to retrieve
Returns:a dictionary of storable objects

get_results()[source]

Get the Workflow results.

Returns:a dictionary of storable objects

get_state()[source]

Get the Workflow’s state.

Returns:a state from wf_states in aiida.common.datastructures

get_step(step_method)[source]

Retrieves by name a step from the Workflow.

Parameters:step_method – a string with the name of the step to retrieve, or a method
Raise:ObjectDoesNotExist: if there is no step with the specific name.
Returns:a DbWorkflowStep object.

get_step_calculations(step_method, calc_state=None)[source]

Retrieves all the calculations connected to a specific step in the database. If the step does not exist it returns None, which is useful for simpler syntax in the workflow definition.

Parameters:
  • step_method – a Workflow step (decorated) method
  • calc_state – a specific state to filter the calculations to retrieve
Returns:a list of JobCalculation objects

get_step_workflows(step_method)[source]

Retrieves all the workflows connected to a specific step in the database. If the step does not exist it returns None, which is useful for simpler syntax in the workflow definition.

Parameters:step_method – a Workflow step (decorated) method

get_steps(state=None)[source]

Retrieves all the steps of the Workflow, with the possibility to limit the list to steps in a specific state.

Parameters:state – a state from wf_states in aiida.common.datastructures
Returns:a list of DbWorkflowStep objects.

classmethod get_subclass_from_dbnode(wf_db)[source]

Loads the workflow object and reloads the python script in memory with the importlib library; the main class is searched and then loaded.

Parameters:wf_db – a specific DbWorkflowNode object representing the Workflow
Returns:a Workflow subclass from the specific source code

classmethod get_subclass_from_pk(pk)[source]

Calls get_subclass_from_dbnode, selecting the DbWorkflowNode from the input pk.

Parameters:pk – a primary key index for the DbWorkflowNode
Returns:a Workflow subclass from the specific source code

classmethod get_subclass_from_uuid(uuid)[source]

Calls get_subclass_from_dbnode, selecting the DbWorkflowNode from the input uuid.

Parameters:uuid – a uuid for the DbWorkflowNode
Returns:a Workflow subclass from the specific source code

get_temp_folder()[source]

Get the folder of the Node in the temporary repository.

Returns:a SandboxFolder object mapping the node in the repository.
has_failed()[source]

Returns True if the Workflow’s state is ERROR

has_finished_ok()[source]

Returns True if the Workflow’s state is FINISHED

has_step(step_method)[source]

Return whether the Workflow has a step with a specific name.

Parameters:step_method – a string with the name of the step to retrieve, or a method

info()[source]

Returns an array with all the information about the module, file and class needed to locate the workflow source code.

is_new()[source]

Returns True if the Workflow’s state is CREATED

is_running()[source]

Returns True if the Workflow’s state is RUNNING

is_subworkflow()[source]

Return True if this is a subworkflow (i.e., if it has a parent), False otherwise.

kill(verbose=False)[source]

Stop the Workflow execution and change its state to FINISHED.

This method calls the kill method for each Calculation and each subworkflow linked to each RUNNING step.

Parameters:verbose – True to print the pk of each subworkflow killed
Raises:InvalidOperation – if some calculations cannot be killed (the workflow will be also put to SLEEP so that it can be killed later on)
kill_step_calculations(step)[source]

Calls the kill method for each Calculation linked to the step method passed as argument.

Parameters:step – a Workflow step (decorated) method

label

Get the label of the workflow.

Returns:a string
logger

Get the logger of the Workflow object, so that it also logs to the DB.

Returns:LoggerAdapter object, that works like a logger, but also has the ‘extra’ embedded
next(next_method)[source]

Adds a new step to be called after the completion of the caller method’s calculations and subworkflows.

This method must be called inside a Workflow step, otherwise an error is thrown. The code finds the caller method and stores in the database the input next_method as the next method to be called. At this point no execution is made, only configuration updates in the database.

If during the execution of the caller method the user launched calculations or subworkflows, this method will add them to the database, making them available to the workflow manager to be launched. In fact, all the calculation and subworkflow submissions are lazy methods, actually executed by this call.

Parameters:next_method – a Workflow step method to execute after the caller method
Raise:AiidaException: in case the caller method cannot be found or validated
Returns:the wrapped method, decorated with the correct step name
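
A hedged sketch of a minimal legacy Workflow with two steps chained via next() (the class name and parameter key are placeholders; it assumes a ‘codename’ parameter was set via set_params and that the calculation inputs are prepared elsewhere):

from aiida.orm.workflow import Workflow

class ExampleWorkflow(Workflow):

    def __init__(self, **kwargs):
        super(ExampleWorkflow, self).__init__(**kwargs)

    @Workflow.step
    def start(self):
        from aiida.orm import Code
        params = self.get_parameters()
        code = Code.get_from_string(params['codename'])  # placeholder code string
        calc = code.new_calc()                           # inputs must still be set on the calculation
        self.attach_calculation(calc)                    # lazy: launched later by the workflow manager
        self.next(self.analyze)

    @Workflow.step
    def analyze(self):
        calcs = self.get_step_calculations(self.start)
        self.add_result('n_calculations', len(calcs))
        self.next(self.exit)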
pk

Returns the DbWorkflow pk

classmethod query(*args, **kwargs)[source]

Map to the aiidaobjects manager of the DbWorkflow, that returns Workflow objects instead of DbWorkflow entities.

remove_path(path)[source]

Remove a file or directory from the repository directory.

Can be called only before storing.

repo_folder

Get the permanent repository folder. Use preferentially the current_folder method.

Returns:the permanent RepositoryFolder object
set_params(params, force=False)[source]

Adds parameters to the Workflow that are both stored and used every time the workflow engine re-initializes the specific workflow to launch the new methods.

set_state(state)[source]

Set the Workflow’s state.

Parameters:state – a state from wf_states in aiida.common.datastructures

sleep()[source]

Changes the workflow state to SLEEP, only possible to call from a Workflow step decorated method.

classmethod step(fun)[source]

This method is used as a decorator for workflow steps, and handles the method’s execution, the state updates and the eventual errors.

The decorator generates a wrapper around the input function to execute, adding the correct step name and a utility variable to make it distinguishable from non-step methods.

When a step is launched, the wrapper tries to run the function; in case of error, the state of the workflow is moved to ERROR and the traceback is stored in the report. In general the input method is a step obtained from the Workflow object, and the decorator simply handles a controlled execution of the step, allowing the code not to break in case of errors in the step’s source code.

The wrapper also ensures that the same step is not run twice, unless the Workflow is in the ERROR state; in that case all the calculations and subworkflows of the step are killed and a new execution is allowed.

Parameters:fun – a method to wrap, making it a Workflow step
Raise:AiidaException: in case the workflow state doesn’t allow the execution
Returns:the wrapped method
store()[source]

Stores the DbWorkflow object data in the database

uuid

Returns the DbWorkflow uuid

exception aiida.orm.implementation.general.workflow.WorkflowKillError(*args, **kwargs)[source]

An exception raised when a workflow failed to be killed. The error_message_list attribute contains the error messages from all the subworkflows.

exception aiida.orm.implementation.general.workflow.WorkflowUnkillable[source]

Raised when a workflow cannot be killed because it is in the FINISHED or ERROR state.

aiida.orm.implementation.general.workflow.get_workflow_info(w, tab_size=2, short=False, pre_string='', depth=16)[source]

Return a string with all the information regarding the given workflow and all its calculations and subworkflows. This is a recursive function (to print all subworkflows info as well).

Parameters:
  • w – a DbWorkflow instance
  • tab_size – number of spaces to use for the indentation
  • short – if True, provide a shorter output (only total number of calculations, rather than the state of each calculation)
  • pre_string – string prepended at the beginning of each line
  • depth – the maximum depth level the recursion on sub-workflows will try to reach (0 means we stay at the step level and don’t go into sub-workflows, 1 means we go down to one step level of the sub-workflows, etc.)
Return lines:

list of lines to be output

aiida.orm.implementation.general.workflow.kill_all()[source]

Kills all the workflows that are not in the FINISHED state, by running the kill_from_uuid method in a loop.

Parameters:uuid – the UUID of the workflow to kill
aiida.orm.implementation.general.workflow.kill_from_pk()[source]

Kills a workflow from its pk.

Parameters:pk – the PK of the workflow to kill
Code
class aiida.orm.implementation.general.code.AbstractCode(**kwargs)[source]

A code entity. It can either be ‘local’, or ‘remote’.

  • Local code: it is a collection of files/dirs (added using the add_path() method), where one file is flagged as executable (using the set_local_executable() method).
  • Remote code: it is a pair (remotecomputer, remotepath_of_executable) set using the set_remote_computer_exec() method.

For both codes, one can set some code to be executed right before or right after the execution of the code, using the set_preexec_code() and set_postexec_code() methods (e.g., the set_preexec_code() can be used to load specific modules required for the code to be run).

can_run_on(computer)[source]

Return True if this code can run on the given computer, False otherwise.

Local codes can run on any machine; remote codes can run only on the machine on which they reside.

TODO: add filters to mask the remote machines on which a local code can run.

full_text_info

Return a (multiline) string with human-readable, detailed information on this code.

classmethod get(pk=None, label=None, machinename=None)[source]

Get a Code object with the given identifier string, which can either be the numeric ID (pk) or the label (and machine name), if unique.

Parameters:
  • pk – the numeric ID (pk) for code
  • label – the code label identifying the code to load
  • machinename – the machine name where code is setup
Raises:
get_append_text()[source]

Return the postexec_code, or an empty string if no post-exec code was defined.

classmethod get_code_helper(label, machinename=None)[source]
Parameters:
  • label – the code label identifying the code to load
  • machinename – the machine name where code is setup
Raises:
get_desc()[source]

Returns a string with information retrieved from the node’s properties.

get_execname()[source]

Return the executable string to be put in the script. For local codes, it is ./LOCAL_EXECUTABLE_NAME For remote codes, it is the absolute path to the executable.

classmethod get_from_string(code_string)[source]

Get a Code object with the given identifier string, in the format label@machinename. See the note below for details on the string detection algorithm.

Note

the (leftmost) ‘@’ symbol is always used to split code and computername. Therefore do not use ‘@’ in the code name if you want to use this function (‘@’ in the computer name are instead valid).

Parameters:

code_string – the code string identifying the code to load

Raises:
get_input_plugin_name()[source]

Return the name of the default input plugin (or None if no input plugin was set).

get_prepend_text()[source]

Return the code that will be put in the scheduler script before the execution, or an empty string if no pre-exec code was defined.

is_local()[source]

Return True if the code is ‘local’, False if it is ‘remote’ (see also documentation of the set_local and set_remote functions).

classmethod list_for_plugin(plugin, labels=True)[source]

Return a list of valid code strings for a given plugin.

Parameters:
  • plugin – The string of the plugin.
  • labels – if True, return a list of code names, otherwise return the code PKs (integers).
Returns:

a list of string, with the code names if labels is True, otherwise a list of integers with the code PKs.

new_calc(*args, **kwargs)[source]

Create and return a new Calculation object (unstored) with the correct plugin subclass, as obtained by the self.get_input_plugin_name() method.

Parameters are passed to the calculation __init__ method.

Note:

it also directly creates the link to this code (that will of course be cached, since the new node is not stored yet).

Raises:
  • MissingPluginError – if the specified plugin does not exist.
  • ValueError – if no plugin was specified.
set_append_text(code)[source]

Pass a string of code that will be put in the scheduler script after the execution of the code.

set_files(files)[source]

Given a list of filenames (or a single filename string), add them to the path (all at level zero, i.e. without folders). Therefore, be careful with files that have the same name!

Todo: decide whether to check if the Code must be a local executable to be able to call this function.
set_input_plugin_name(input_plugin)[source]

Set the name of the default input plugin, to be used for the automatic generation of a new calculation.

set_local_executable(exec_name)[source]

Set the filename of the local executable. Implicitly set the code as local.

set_prepend_text(code)[source]

Pass a string of code that will be put in the scheduler script before the execution of the code.

set_remote_computer_exec(remote_computer_exec)[source]

Set the code as remote, and pass the computer on which it resides and the absolute path on that computer.

Args:
remote_computer_exec: a tuple (computer, remote_exec_path), where computer is an aiida.orm.Computer or an aiida.backends.djsite.db.models.DbComputer object, and remote_exec_path is the absolute path of the main executable on the remote computer.
aiida.orm.implementation.general.code.delete_code(code)[source]

Delete a code from the DB. Check before that there are no output nodes.

NOTE! Not thread safe... Do not use with many users accessing the DB at the same time.

Implemented as a function on purpose, otherwise complicated logic would be needed to set the internal state of the object after calling computer.delete().

ORM documentation: Data

Note

This list only includes the classes included in AiiDA-core. For all the plugin subclasses, check the corresponding plugin repositories.

class aiida.orm.data.Data(**kwargs)[source]

This class is the base class for all data objects.

Specifications of the Data class: AiiDA Data objects are subclasses of Node and should satisfy the following requirement: multiple inheritance must be supported, i.e. Data should have methods for querying and be able to inherit other library objects such as ASE for structures.

Architecture note: the code plugin is responsible for converting a raw data object produced by a code into the AiiDA standard object format. The data object then validates itself according to its validation method. This is done independently in order to allow cross-validation of plugins.

_del_attr(key)[source]

Delete an attribute.

Parameters:

key – attribute to delete.

Raises:
_exportstring(fileformat, main_file_name='', **kwargs)[source]

Converts a Data object to another text format.

Parameters:
  • fileformat – a string (the extension) to describe the file format.
  • main_file_name – empty by default, contains the (full) path and filename of the main file, if any. This is not used directly, but is used to infer useful unique names for the additional files, if any. For instance, if the main file for gnuplot is ‘../myplot.gnu’, the plugin could decide to store the dat file under ‘../myplot_data.dat’. It is up to the plugin to properly deal with this filename (or ignore it, if not relevant, e.g. if no additional files need to be created)
  • kwargs – any other parameter is passed down to the specific plugin
Returns:

a tuple of length 2. The first element is the content of the output file. The second is a dictionary (possibly empty) in the format {filename: filecontent} for any additional file that should be produced.

_get_converters()[source]

Get all implemented converter formats. The convention is to find all _get_object_... methods. Returns a list of strings.

_get_exporters()[source]

Get all implemented export formats. The convention is to find all _prepare_... methods. Returns a dictionary of method_name: method_function

_get_importers()[source]

Get all implemented import formats. The convention is to find all _parse_... methods. Returns a list of strings.

_linking_as_output(dest, link_type)[source]

Raise a ValueError if a link from self to dest is not allowed.

An output of a data can only be a calculation

_set_attr(key, value)[source]

Set a new attribute to the Node (in the DbAttribute table).

Parameters:
  • key (str) – key name
  • value – its value
Raises:
  • ModificationNotAllowed – if such attribute cannot be added (e.g. because the node was already stored)
  • ValidationError – if the key is not valid (e.g. it contains the separator symbol).
_validate()[source]

Perform validation of the Data object.

Note

validation of data source checks license and requires attribution to be provided in field ‘description’ of source in the case of any CC-BY* license. If such requirement is too strict, one can remove/comment it out.

convert(object_format=None, *args)[source]

Convert the AiiDA Data object into another python object

Parameters:object_format – Specify the output format
export(path, fileformat=None, overwrite=False, **kwargs)[source]

Save a Data object to a file.

Parameters:
  • path – string with the file name. Can be an absolute or relative path.
  • fileformat – kind of format to use for the export. If not present, it will try to use the extension of the file name.
  • overwrite – if set to True, overwrites file found at path. Default=False
  • kwargs – additional parameters to be passed to the _exportstring method
Returns:

the list of files created
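
For instance, a sketch of exporting a structure node to a file ('xsf' is only an example format; the valid formats depend on the Data subclass and can be listed with get_export_formats(), documented just below):

from aiida.orm.data.structure import StructureData

print(StructureData.get_export_formats())          # list of valid format strings for this subclass
# `structure` is assumed to be an existing StructureData instance
files = structure.export('/tmp/mystructure.xsf', fileformat='xsf')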

classmethod get_export_formats()[source]

Get the list of valid export format strings

Returns:a list of valid formats
importfile(fname, fileformat=None)[source]

Populate a Data object from a file.

Parameters:
  • fname – string with file name. Can be an absolute or relative path.
  • fileformat – kind of format to use for the import. If not present, it will try to use the extension of the file name.
importstring(inputstring, fileformat, **kwargs)[source]

Populate a Data object from a string in the given format.

Parameters:
  • inputstring – the string with the data to import.
  • fileformat – a string (the extension) to describe the file format.
  • kwargs – any other parameter is passed down to the specific plugin.
set_source(source)[source]

Sets the dictionary describing the source of the Data object.

source

Gets the dictionary describing the source of the Data object. Possible fields:

  • db_name: name of the source database.
  • db_uri: URI of the source database.
  • uri: URI of the object’s source. Should be a permanent link.
  • id: object’s source identifier in the source database.
  • version: version of the object’s source.
  • extras: a dictionary with other fields for source description.
  • source_md5: MD5 checksum of object’s source.
  • description: human-readable free form description of the
    object’s source.
  • license: a string with a type of license.

Note

some limitations for setting the data source exist, see _validate method.

Returns:dictionary describing the source of the Data object.
Structure

This module defines the classes for structures and all related functions to operate on them.

class aiida.orm.data.structure.Kind(**kwargs)[source]

This class contains the information about the species (kinds) of the system.

It can be a single atom, or an alloy, or even contain vacancies.

__init__(**kwargs)[source]

Create a kind. One can either pass:

Parameters:
  • raw – the raw python dictionary that will be converted to a Kind object.
  • ase – an ase Atom object
  • kind – a Kind object (to get a copy)

Or alternatively the following parameters:

Parameters:
  • symbols – a single string for the symbol of this site, or a list of symbol strings
  • weights – (optional) the weights for each atomic species of this site. If only a single symbol is provided, then this value is optional and the weight is set to 1.
  • mass – (optional) the mass for this site in atomic mass units. If not provided, the mass is set by the self.reset_mass() function.
  • name – a string that uniquely identifies the kind, and that is used to identify the sites.
compare_with(other_kind)[source]

Compare with another Kind object to check if they are different.

Note

This does NOT check the ‘type’ attribute. Instead, it compares (with reasonable thresholds, where applicable): the mass, and the list of symbols and of weights. Moreover, it compares the _internal_tag, if defined (at the moment, defined automatically only when importing the Kind from ASE, if the atom has a non-zero tag). Note that the _internal_tag is only used while the class is loaded, but is not persisted on the database.

Returns:A tuple with two elements. The first one is True if the two sites are ‘equivalent’ (same mass, symbols and weights), False otherwise. The second element of the tuple is a string, which is either None (if the first element was True), or contains a ‘human-readable’ description of the first difference encountered between the two sites.
get_raw()[source]

Return the raw version of the site, mapped to a suitable dictionary. This is the format that is actually used to store each kind of the structure in the DB.

Returns:a python dictionary with the kind.
get_symbols_string()[source]

Return a string that tries to match as well as possible the symbols of this kind. If there is only one symbol (no alloy) with 100% occupancy, it just returns the symbol name. Otherwise, it groups the full string in curly brackets, and tries to write also the composition (with 2-digit precision only).

Note

If there is a vacancy (sum of weights<1), we indicate it with the X symbol followed by 1-sum(weights) (still with 2 digits precision, so it can be 0.00)

Note

Note the difference with respect to the symbols and the symbol properties!

has_vacancies()[source]

Returns True if the sum of the weights is less than one. It uses the internal variable _sum_threshold as a threshold.

Returns:a boolean
is_alloy()[source]

Return whether the kind is an alloy.

Returns:True if the kind has more than one element (i.e., len(self.symbols) != 1), False otherwise.
mass

The mass of this species kind.

Returns:a float
name

Return the name of this kind. The name of a kind is used to identify the species of a site.

Returns:a string
reset_mass()[source]

Reset the mass to the automatic calculated value.

The mass can be set manually; by default, if not provided, it is the mass of the constituent atoms, weighted with their weight (after the weight has been normalized to one to take correctly into account vacancies).

This function uses the internal _symbols and _weights values and thus assumes that the values are validated.

It sets the mass to None if the sum of weights is zero.

set_automatic_kind_name(tag=None)[source]

Set the type to a string obtained with the symbols appended one after the other, without spaces, in alphabetical order; if the site has a vacancy, an X is appended at the end too.

set_symbols_and_weights(symbols, weights)[source]

Set the chemical symbols and the weights for the site.

Note

Note that the kind name remains unchanged.

symbol

If the kind has only one symbol, return it; otherwise, raise a ValueError.

symbols

List of symbols for this site. If the site is a single atom, pass a list of one element only, or simply the string for that atom. For alloys, a list of elements.

Note

Note that if you change the list of symbols, the kind name remains unchanged.

weights

Weights for this species kind. Refer also to :func:validate_symbols_tuple for the validation rules on the weights.

class aiida.orm.data.structure.Site(**kwargs)[source]

This class contains the information about a given site of the system.

It can be a single atom, or an alloy, or even contain vacancies.

__init__(**kwargs)[source]

Create a site.

Parameters:
  • kind_name – a string that identifies the kind (species) of this site. This has to be found in the list of kinds of the StructureData object. Validation will be done at the StructureData level.
  • position – the absolute position (three floats) in angstrom
get_ase(kinds)[source]

Return an ase.Atom object for this site.

Parameters:kinds – the list of kinds from the StructureData object.

Note

If any site is an alloy or has vacancies, a ValueError is raised (from the site.get_ase() routine).

get_raw()[source]

Return the raw version of the site, mapped to a suitable dictionary. This is the format that is actually used to store each site of the structure in the DB.

Returns:a python dictionary with the site.
kind_name

Return the kind name of this site (a string).

The type of a site is used to decide whether two sites are identical (same mass, symbols, weights, ...) or not.

position

Return the position of this site in absolute coordinates, in angstrom.

class aiida.orm.data.structure.StructureData(**kwargs)[source]

This class contains the information about a given structure, i.e. a collection of sites together with a cell, the boundary conditions (whether they are periodic or not) and other related useful information.

append_atom(**kwargs)[source]

Append an atom to the Structure, taking care of creating the corresponding kind.

Parameters:
  • ase – the ase Atom object from which we want to create a new atom (if present, this must be the only parameter)
  • position – the position of the atom (three numbers in angstrom)
  • symbols – passed to the constructor of the Kind object.
  • weights – passed to the constructor of the Kind object.
  • name – passed to the constructor of the Kind object. See also the note below.

Note

Note on the ‘name’ parameter (that is, the name of the kind):

  • if specified, no checks are done on existing species. Simply, a new kind with that name is created. If there is a name clash, a check is done: if the kinds are identical, no error is issued; otherwise, an error is issued because you are trying to store two different kinds with the same name.
  • if not specified, the name is automatically generated. Before adding the kind, a check is done. If other species with the same properties already exist, no new kinds are created, but the site is added to the existing (identical) kind (actually, the first kind that is encountered). Otherwise, the name is first made unique, by appending to the string containing the list of chemical symbols an integer starting from 1, until a unique name is found

Note

checks of equality of species are done using the compare_with() method.
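
A short sketch of how a structure is typically built with append_atom (the cell parameter and positions are illustrative placeholders, not a refined BaTiO3 geometry):

from aiida.orm.data.structure import StructureData

alat = 4.0  # lattice parameter in angstrom (illustrative value)
structure = StructureData(cell=[[alat, 0., 0.], [0., alat, 0.], [0., 0., alat]])
structure.append_atom(position=(0., 0., 0.), symbols='Ba')
structure.append_atom(position=(alat / 2., alat / 2., alat / 2.), symbols='Ti')
structure.append_atom(position=(alat / 2., alat / 2., 0.), symbols='O')
structure.append_atom(position=(alat / 2., 0., alat / 2.), symbols='O')
structure.append_atom(position=(0., alat / 2., alat / 2.), symbols='O')

print(structure.get_formula(mode='count'))  # 'BaTiO3' (order of appended sites preserved)
print(structure.get_formula())              # 'BaO3Ti' (default hill mode, alphabetical)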

append_kind(kind)[source]

Append a kind to the StructureData. It makes a copy of the kind.

Parameters:kind – the site to append, must be a Kind object.
append_site(site)[source]

Append a site to the StructureData. It makes a copy of the site.

Parameters:site – the site to append. It must be a Site object.
cell

Returns the cell shape.

Returns:a 3x3 list of lists.
cell_angles

Get the angles between the cell lattice vectors in degrees.

cell_lengths

Get the lengths of cell lattice vectors in angstroms.

clear_kinds()[source]

Removes all kinds for the StructureData object.

Note

Also clear all sites!

clear_sites()[source]

Removes all sites for the StructureData object.

get_ase()[source]

Get the ASE object. Requires the ability to import ase.

Returns:an ASE object corresponding to this StructureData object.

Note

If any site is an alloy or has vacancies, a ValueError is raised (from the site.get_ase() routine).
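
A sketch of the round trip to ASE and back (assuming the ase package is importable and that `structure` is a StructureData instance without alloys or vacancies, such as the one built above):

atoms = structure.get_ase()        # ase.Atoms object
atoms.rattle(stdev=0.01)           # any ASE manipulation

from aiida.orm.data.structure import StructureData
perturbed = StructureData()
perturbed.set_ase(atoms)           # load the modified geometry back into a new node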

get_cell_volume()[source]

Returns the cell volume in Angstrom^3.

Returns:a float.
get_composition()[source]

Returns the chemical composition of this structure as a dictionary, where each key is the kind symbol (e.g. H, Li, Ba), and each value is the number of occurrences of that element in this structure. For BaZrO3 it would return {‘Ba’:1, ‘Zr’:1, ‘O’:3}. No reduction with smallest common divisor!

Returns:a dictionary with the composition
get_desc()[source]

Returns a string with information retrieved from the StructureData node’s properties (a short description string).

get_formula(mode='hill', separator='')[source]

Return a string with the chemical formula.

Parameters:
  • mode

    a string to specify how to generate the formula, can assume one of the following values:

    • ‘hill’ (default): count the number of atoms of each species, then use Hill notation, i.e. alphabetical order with C and H first if one or several C atom(s) is (are) present, e.g. ['C','H','H','H','O','C','H','H','H'] will return 'C2H6O' ['S','O','O','H','O','H','O'] will return 'H2O4S' From E. A. Hill, J. Am. Chem. Soc., 22 (8), pp 478–494 (1900)
    • ‘hill_compact’: same as hill but the number of atoms for each species is divided by the greatest common divisor of all of them, e.g. ['C','H','H','H','O','C','H','H','H','O','O','O'] will return 'CH3O2'
    • ‘reduce’: group repeated symbols e.g. ['Ba', 'Ti', 'O', 'O', 'O', 'Ba', 'Ti', 'O', 'O', 'O', 'Ba', 'Ti', 'Ti', 'O', 'O', 'O'] will return 'BaTiO3BaTiO3BaTi2O3'
    • ‘group’: will try to group as much as possible parts of the formula e.g. ['Ba', 'Ti', 'O', 'O', 'O', 'Ba', 'Ti', 'O', 'O', 'O', 'Ba', 'Ti', 'Ti', 'O', 'O', 'O'] will return '(BaTiO3)2BaTi2O3'
    • ‘count’: same as hill (i.e. one just counts the number of atoms of each species) without the re-ordering (take the order of the atomic sites), e.g. ['Ba', 'Ti', 'O', 'O', 'O','Ba', 'Ti', 'O', 'O', 'O'] will return 'Ba2Ti2O6'
    • ‘count_compact’: same as count but the number of atoms for each species is divided by the greatest common divisor of all of them, e.g. ['Ba', 'Ti', 'O', 'O', 'O','Ba', 'Ti', 'O', 'O', 'O'] will return 'BaTiO3'
  • separator – a string used to concatenate symbols. Default empty.
Returns:

a string with the formula

Note

in modes reduce, group, count and count_compact, the initial order in which the atoms were appended by the user is used to group and/or order the symbols in the formula

get_kind(kind_name)[source]

Return the kind object associated with the given kind name.

Parameters:kind_name – String, the name of the kind you want to get
Returns:The Kind object associated with the given kind_name, if a Kind with the given name is present in the structure.
Raise:ValueError if the kind_name is not present.
get_kind_names()[source]

Return a list of kind names (in the same order of the self.kinds property, but return the names rather than Kind objects)

Note

This is NOT necessarily a list of chemical symbols! Use get_symbols_set for chemical symbols

Returns:a list of strings.
get_pymatgen()[source]

Get pymatgen object. Returns Structure for structures with periodic boundary conditions (in three dimensions) and Molecule otherwise.

Note

Requires the pymatgen module (version >= 3.0.13, usage of earlier versions may cause errors).

get_pymatgen_molecule()[source]

Get the pymatgen Molecule object.

Note

Requires the pymatgen module (version >= 3.0.13, usage of earlier versions may cause errors).

Returns:a pymatgen Molecule object corresponding to this StructureData object.
get_pymatgen_structure()[source]

Get the pymatgen Structure object.

Note

Requires the pymatgen module (version >= 3.0.13, usage of earlier versions may cause errors).

Returns:a pymatgen Structure object corresponding to this StructureData object.
Raises:ValueError – if periodic boundary conditions do not hold in at least one dimension of real space.
get_site_kindnames()[source]

Return a list with length equal to the number of sites of this structure, where each element of the list is the kind name of the corresponding site.

Note

This is NOT necessarily a list of chemical symbols! Use [ self.get_kind(s.kind_name).get_symbols_string() for s in self.sites] for chemical symbols

Returns:a list of strings
get_symbols_set()[source]

Return a set containing the names of all elements involved in this structure (i.e., it joins the lists of symbols of all kinds in the structure).

Returns:a set of strings of element names.
has_vacancies()[source]

Return whether there are vacancies in the structure.

Returns:a boolean, True if at least one kind has a vacancy
is_alloy()[source]

Return whether there are alloys in the structure.

Returns:a boolean, True if at least one kind is an alloy
kinds

Returns a list of kinds.

pbc

Get the periodic boundary conditions.

Returns:a tuple of three booleans, each one tells if there are periodic boundary conditions for the i-th real-space direction (i=1,2,3)
reset_cell(new_cell)[source]

Reset the cell of a structure not yet stored to a new value.

Parameters:new_cell – list specifying the cell vectors
Raises:ModificationNotAllowed: if object is already stored
reset_sites_positions(new_positions, conserve_particle=True)[source]

Replace all the Site positions attached to the Structure

Parameters:
  • new_positions – list of (3D) positions for every site.
  • conserve_particle – if True, allows the possibility of removing a site. Currently not implemented.
Raises:

Note

it is assumed that the order of the new_positions is the same as the order of the positions they are substituting, i.e. the kind of the site will not be checked.

set_ase(aseatoms)[source]

Load the structure from an ASE object

set_pymatgen(obj, **kwargs)[source]

Load the structure from a pymatgen object.

Note

Requires the pymatgen module (version >= 3.0.13, usage of earlier versions may cause errors).

set_pymatgen_molecule(mol, margin=5)[source]

Load the structure from a pymatgen Molecule object.

Parameters:margin – the margin to be added in all directions of the bounding box of the molecule.

Note

Requires the pymatgen module (version >= 3.0.13, usage of earlier versions may cause errors).

set_pymatgen_structure(struct)[source]

Load the structure from a pymatgen Structure object.

Note

periodic boundary conditions are set to True in all three directions.

Note

Requires the pymatgen module (version >= 3.0.13, usage of earlier versions may cause errors).

sites

Returns a list of sites.

aiida.orm.data.structure.ase_refine_cell(aseatoms, **kwargs)[source]

Detect the symmetry of the structure, remove symmetric atoms and refine unit cell.

Parameters:
  • aseatoms – an ase.atoms.Atoms instance
  • symprec – symmetry precision, used by spglib
Return newase:

refined cell with reduced set of atoms

Return symmetry:
 

a dictionary describing the symmetry space group

aiida.orm.data.structure.calc_cell_volume(cell)[source]

Calculates the volume of a cell given the three lattice vectors.

It is calculated as cell[0] . (cell[1] x cell[2]), where . represents a dot product and x a cross product.

Parameters:cell – the cell vectors; it must be a 3x3 list of lists of floats, no other checks are done.
Returns:the cell volume.
aiida.orm.data.structure.get_formula(symbol_list, mode='hill', separator='')[source]

Return a string with the chemical formula.

Parameters:
  • symbol_list – a list of symbols, e.g. ['H','H','O']
  • mode

    a string to specify how to generate the formula, can assume one of the following values:

    • ‘hill’ (default): count the number of atoms of each species, then use Hill notation, i.e. alphabetical order with C and H first if one or several C atom(s) is (are) present, e.g. ['C','H','H','H','O','C','H','H','H'] will return 'C2H6O' ['S','O','O','H','O','H','O'] will return 'H2O4S' From E. A. Hill, J. Am. Chem. Soc., 22 (8), pp 478–494 (1900)
    • ‘hill_compact’: same as hill but the number of atoms for each species is divided by the greatest common divisor of all of them, e.g. ['C','H','H','H','O','C','H','H','H','O','O','O'] will return 'CH3O2'
    • ‘reduce’: group repeated symbols e.g. ['Ba', 'Ti', 'O', 'O', 'O', 'Ba', 'Ti', 'O', 'O', 'O', 'Ba', 'Ti', 'Ti', 'O', 'O', 'O'] will return 'BaTiO3BaTiO3BaTi2O3'
    • ‘group’: will try to group as much as possible parts of the formula e.g. ['Ba', 'Ti', 'O', 'O', 'O', 'Ba', 'Ti', 'O', 'O', 'O', 'Ba', 'Ti', 'Ti', 'O', 'O', 'O'] will return '(BaTiO3)2BaTi2O3'
    • ‘count’: same as hill (i.e. one just counts the number of atoms of each species) without the re-ordering (take the order of the atomic sites), e.g. ['Ba', 'Ti', 'O', 'O', 'O','Ba', 'Ti', 'O', 'O', 'O'] will return 'Ba2Ti2O6'
    • ‘count_compact’: same as count but the number of atoms for each species is divided by the greatest common divisor of all of them, e.g. ['Ba', 'Ti', 'O', 'O', 'O','Ba', 'Ti', 'O', 'O', 'O'] will return 'BaTiO3'
  • separator – a string used to concatenate symbols. Default empty.
Returns:

a string with the formula

Note

in modes reduce, group, count and count_compact, the initial order in which the atoms were appended by the user is used to group and/or order the symbols in the formula
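
A small sketch of the module-level helpers (the example lists and results follow directly from the documented modes):

from aiida.orm.data.structure import get_formula, group_symbols

print(group_symbols(['Ba', 'Ti', 'O', 'O', 'O']))              # [[1, 'Ba'], [1, 'Ti'], [3, 'O']]
print(get_formula(['H', 'H', 'O']))                            # 'H2O' (default hill mode)
print(get_formula(['Ba', 'Ti', 'O', 'O', 'O'], mode='count'))  # 'BaTiO3'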

aiida.orm.data.structure.get_formula_from_symbol_list(_list, separator='')[source]

Return a string with the formula obtained from the list of symbols. Examples: * [[1,'Ba'],[1,'Ti'],[3,'O']] will return 'BaTiO3' * [[2, [ [1, 'Ba'], [1, 'Ti'] ] ]] will return '(BaTi)2'

Parameters:
  • _list – a list of symbols and multiplicities as obtained from the function group_symbols
  • separator – a string used to concatenate symbols. Default empty.
Returns:

a string

aiida.orm.data.structure.get_formula_group(symbol_list, separator='')[source]

Return a string with the chemical formula from a list of chemical symbols. The formula is written in a “compact” way, i.e. trying to group as much as possible parts of the formula.

Note

it works for instance very well if structure was obtained from an ASE supercell.

Example of result: ['Ba', 'Ti', 'O', 'O', 'O', 'Ba', 'Ti', 'O', 'O', 'O', 'Ba', 'Ti', 'Ti', 'O', 'O', 'O'] will return '(BaTiO3)2BaTi2O3'.

Parameters:
  • symbol_list – list of symbols (e.g. [‘Ba’,’Ti’,’O’,’O’,’O’])
  • separator – a string used to concatenate symbols. Default empty.
Returns:

a string with the chemical formula for the given structure.

aiida.orm.data.structure.get_pymatgen_version()[source]
Returns:string with the pymatgen version, or None if it cannot be imported.
aiida.orm.data.structure.get_symbols_string(symbols, weights)[source]

Return a string that tries to match as well as possible the symbols and weights. If there is only one symbol (no alloy) with 100% occupancy, it just returns the symbol name. Otherwise, it groups the full string in curly brackets, and tries to write also the composition (with 2-digit precision only). If there is a vacancy (sum of weights < 1), we indicate it with the X symbol followed by 1-sum(weights) (still with 2-digit precision, so it can be 0.00).

Parameters:
  • symbols – the symbols as obtained from <kind>._symbols
  • weights – the weights as obtained from <kind>._weights

Note

Note the difference with respect to the symbols and the symbol properties!

aiida.orm.data.structure.get_valid_pbc(inputpbc)[source]

Return a list of three booleans for the periodic boundary conditions, in a valid format from a generic input.

Raises:ValueError – if the format is not valid.
aiida.orm.data.structure.group_symbols(_list)[source]

Group a list of symbols to a list containing the number of consecutive identical symbols, and the symbol itself.

Examples:

  • ['Ba','Ti','O','O','O','Ba'] will return [[1,'Ba'],[1,'Ti'],[3,'O'],[1,'Ba']]
  • [ [ [1,'Ba'],[1,'Ti'] ],[ [1,'Ba'],[1,'Ti'] ] ] will return [[2, [ [1, 'Ba'], [1, 'Ti'] ] ]]
Parameters:_list – a list of elements representing a chemical formula
Returns:a list of length-2 lists of the form [ multiplicity , element ]
aiida.orm.data.structure.has_ase()[source]
Returns:True if the ase module can be imported, False otherwise.
aiida.orm.data.structure.has_pymatgen()[source]
Returns:True if the pymatgen module can be imported, False otherwise.
aiida.orm.data.structure.has_spglib()[source]
Returns:True if the spglib module can be imported, False otherwise.
aiida.orm.data.structure.has_vacancies(weights)[source]

Returns True if the sum of the weights is less than one. It uses the internal variable _sum_threshold as a threshold.

Parameters:weights – the weights
Returns:a boolean

aiida.orm.data.structure.is_ase_atoms(ase_atoms)[source]

Check if the ase_atoms parameter is actually an ase.Atoms object.

Parameters:ase_atoms – an object, expected to be an ase.Atoms.
Returns:a boolean.

Requires the ability to import ase, by doing ‘import ase’.

aiida.orm.data.structure.is_valid_symbol(symbol)[source]

Validates the chemical symbol name.

Returns:True if the symbol is a valid chemical symbol (with correct capitalization), False otherwise.

Recognized symbols are for elements from hydrogen (Z=1) to lawrencium (Z=103).

aiida.orm.data.structure.symop_fract_from_ortho(cell)[source]

Creates a matrix for conversion from fractional to orthogonal coordinates.

Taken from svn://www.crystallography.net/cod-tools/trunk/lib/perl5/Fractional.pm, revision 850.

Parameters:cell – array of cell parameters (three lengths and three angles)
aiida.orm.data.structure.symop_ortho_from_fract(cell)[source]

Creates a matrix for conversion from orthogonal to fractional coordinates.

Taken from svn://www.crystallography.net/cod-tools/trunk/lib/perl5/Fractional.pm, revision 850.

Parameters:cell – array of cell parameters (three lengths and three angles)
aiida.orm.data.structure.validate_symbols_tuple(symbols_tuple)[source]

Used to validate whether the chemical species are valid.

Parameters:symbols_tuple – a tuple (or list) with the chemical symbols name.
Raises:ValueError if any symbol in the tuple is not a valid chemical symbol (with correct capitalization).

Refer also to the documentation of :func:is_valid_symbol

aiida.orm.data.structure.validate_weights_tuple(weights_tuple, threshold)[source]

Validates the weight of the atomic kinds.

Raise:

ValueError if the weights_tuple is not valid.

Parameters:
  • weights_tuple – the tuple to validate. It must be a tuple of floats (as created by :func:_create_weights_tuple).
  • threshold – a float number used as a threshold to check that the sum of the weights is <= 1.

If the sum is less than one, it means that there are vacancies. Each element of the list must be >= 0, and the sum must be <= 1.

Folder
class aiida.orm.data.folder.FolderData(**kwargs)[source]

Stores a folder with subfolders and files.

No special attributes are set.

get_file_content(path)[source]

Return the content of a path stored inside the folder as a string.

Raises:NotExistent – if the path does not exist.
replace_with_folder(folder, overwrite=True)[source]

Replace the data with another folder, always copying and not moving the original files.

Args:
folder: the folder to copy from.
overwrite: whether to overwrite the current content or not.
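
A minimal sketch (the local directory and file name are hypothetical; the content is copied into the node repository, never moved):

from aiida.orm.data.folder import FolderData

folder_node = FolderData()
folder_node.replace_with_folder('/home/user/calc_output', overwrite=True)
print(folder_node.get_file_content('output.log'))   # raises NotExistent if the file is not there
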
Singlefile

Implements a subclass for a single file in the permanent repository: files = [one_single_file], jsons = {}.

methods: * get_content * get_path * get_aiidaurl (?) * get_md5 * ...

To discuss: do we also need a simple directory class for full directories in the perm repo?

class aiida.orm.data.singlefile.SinglefileData(**kwargs)[source]

Pass as input a file parameter with the (absolute) path of a file on the hard drive. It will get copied inside the node.

Internally must have a single file, and stores as internal attribute the filename in the ‘filename’ attribute.

add_path(src_abs, dst_filename=None)[source]

Add a single file

del_file(filename)[source]

Remove a file from the SinglefileData.

Parameters:filename – name of the file stored in the DB

filename

Returns the name of the file stored

get_file_abs_path()[source]

Return the absolute path to the file in the repository

set_file(filename)[source]

Add a file to the SinglefileData.

Parameters:filename – absolute path to the file
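
A minimal sketch (the absolute path is hypothetical):

from aiida.orm.data.singlefile import SinglefileData

single_file = SinglefileData(file='/home/user/data/input.dat')
print(single_file.filename)              # 'input.dat'
print(single_file.get_file_abs_path())   # path of the copy inside the node repository
single_file.store()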

Upf

This module manages the UPF pseudopotentials in the local repository.

class aiida.orm.data.upf.UpfData(**kwargs)[source]

Function not yet documented.

classmethod from_md5(md5)[source]

Return a list of all UPF pseudopotentials that match a given MD5 hash.

Note that the hash has to be stored in a _md5 attribute, otherwise the pseudo will not be found.

classmethod get_or_create(filename, use_first=False, store_upf=True)[source]

Pass the same parameters as the init; if a file with the same md5 is found, that UpfData is returned.

Parameters:
  • filename – an absolute filename on disk
  • use_first – if False (default), raise an exception if more than one potential is found. If it is True, instead, use the first available pseudopotential.
  • store_upf (bool) – If false, the UpfData objects are not stored in the database. default=True.
Return (upf, created):
 

where upf is the UpfData object, and created is either True if the object was created, or False if the object was retrieved from the DB.

get_upf_family_names()[source]

Get the list of all upf family names to which the pseudo belongs

classmethod get_upf_group(group_name)[source]

Return the UpfFamily group with the given name.

classmethod get_upf_groups(filter_elements=None, user=None)[source]

Return all names of groups of type UpfFamily, possibly with some filters.

Parameters:
  • filter_elements – A string or a list of strings. If present, returns only the groups that contain one UPF for every element present in the list. Default=None, meaning that all families are returned.
  • user – if None (default), return the groups for all users. If defined, it should be either a DbUser instance, or a string for the username (that is, the user email).
set_file(filename)[source]

Pre-parse the file to store the attributes.

store(*args, **kwargs)[source]

Store the node, reparsing the file so that the md5 and the element are correctly reset.

aiida.orm.data.upf.get_pseudos_dict(structure, family_name)[source]

Get a dictionary of {kind: pseudo} for all the elements within the given structure, using the given pseudo family name.

Parameters:
  • structure – The structure that will be used.
  • family_name – the name of the group containing the pseudos
aiida.orm.data.upf.get_pseudos_from_structure(structure, family_name)[source]

Given a family name (a UpfFamily group in the DB) and an AiiDA structure, return a dictionary associating each kind name with its UpfData object.

Raises:
  • MultipleObjectsError – if more than one UPF for the same element is found in the group.
  • NotExistent – if no UPF for an element in the group is found in the group.
aiida.orm.data.upf.parse_upf(fname, check_filename=True)[source]

Try to get relevant information from the UPF. For the moment, only the element name. Note that even UPF v.2 cannot be parsed with the XML minidom! (e.g. due to the & characters in the human-readable section).

If check_filename is True, raise a ParsingError exception if the filename does not start with the element name.

aiida.orm.data.upf.upload_upf_family(folder, group_name, group_description, stop_if_existing=True)[source]

Upload a set of UPF files in a given group.

Parameters:
  • folder – a path containing all UPF files to be added. Only files ending in .UPF (case-insensitive) are considered.
  • group_name – the name of the group to create. If it exists and is non-empty, a UniquenessError is raised.
  • group_description – a string to be set as the group description. Overwrites previous descriptions, if the group was existing.
  • stop_if_existing – if True, check for the md5 of the files and, if the file already exists in the DB, raises a MultipleObjectsError. If False, simply adds the existing UPFData node to the group.
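
A sketch of the typical workflow (the folder, family name and description are hypothetical; `structure` is assumed to be a StructureData instance whose elements are all covered by the family):

from aiida.orm.data.upf import upload_upf_family, get_pseudos_from_structure

upload_upf_family('/home/user/upf/my_family_dir', 'my_upf_family',
                  'example UPF family', stop_if_existing=True)
pseudos = get_pseudos_from_structure(structure, 'my_upf_family')
# pseudos is a dictionary {kind_name: UpfData}, ready to be linked to a calculation
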
Cif
class aiida.orm.data.cif.CifData(**kwargs)[source]

Wrapper for Crystallographic Interchange File (CIF)

Note

the file (physical) is held as the authoritative source of information, so all conversions are done through the physical file: when setting ase or values, a physical CIF file is generated first, and the values are then updated from that physical CIF file.

ase

ASE object, representing the CIF.

Note

requires ASE module.

classmethod from_md5(md5)[source]

Return a list of all CIF files that match a given MD5 hash.

Note

the hash has to be stored in a _md5 attribute, otherwise the CIF file will not be found.

generate_md5()[source]

Generate MD5 hash of the file’s contents on-the-fly.

get_ase(**kwargs)[source]

Returns the ASE object representing the CIF. This function differs from the property ase by the possibility to pass keyword arguments (kwargs) to ase.io.cif.read_cif().

Note

requires ASE module.

get_formulae(mode='sum')[source]

Get the formula.

classmethod get_or_create(filename, use_first=False, store_cif=True)[source]

Pass the same parameters as the init; if a file with the same md5 is found, that CifData is returned.

Parameters:
  • filename – an absolute filename on disk
  • use_first – if False (default), raise an exception if more than one CIF file is found. If it is True, instead, use the first available CIF file.
  • store_cif (bool) – If false, the CifData objects are not stored in the database. default=True.
Return (cif, created):
 

where cif is the CifData object, and created is either True if the object was created, or False if the object was retrieved from the DB.
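
A minimal sketch (the CIF path is hypothetical; reading values requires PyCifRW, and get_ase() requires ASE):

from aiida.orm.data.cif import CifData

cif, created = CifData.get_or_create('/home/user/structures/mystructure.cif')
print(created)                 # False if a CifData with the same md5 was already in the DB
print(cif.get_formulae())      # list of formulae, one per CIF datablock
atoms = cif.get_ase()          # ASE Atoms built from the file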

get_spacegroup_numbers()[source]

Get the spacegroup international number.

has_attached_hydrogens()[source]

Check if there are hydrogens without coordinates, specified as attached to the atoms of the structure.

Returns:True if there are attached hydrogens, False otherwise.

has_partial_occupancies()[source]

Check if there are float values in the atom occupancies.

Returns:True if there are partial occupancies, False otherwise.

static read_cif(fileobj, index=-1, **kwargs)[source]

A wrapper method that simulates the behavior of the old function ase.io.cif.read_cif by using the new generic ase.io.read function.

set_file(filename)[source]

Set the file. If the source is set and the MD5 checksum of the new file differs from that of the source, the source has to be deleted.

store(*args, **kwargs)[source]

Store the node.

values

PyCifRW structure, representing the CIF datablocks.

Note

requires PyCifRW module.

aiida.orm.data.cif.cif_from_ase(ase, full_occupancies=False, add_fake_biso=False)[source]

Construct a CIF datablock from the ASE structure. The code is taken from https://wiki.fysik.dtu.dk/ase/epydoc/ase.io.cif-pysrc.html#write_cif, as the original ASE code contains a bug in printing the Hermann-Mauguin symmetry space group symbol.

Parameters:ase – ASE “images”
Returns:array of CIF datablocks
aiida.orm.data.cif.has_pycifrw()[source]
Returns:True if the PyCifRW module can be imported, False otherwise.
aiida.orm.data.cif.parse_formula(formula)[source]

Parses a Hill formula, written with spaces as separators.

aiida.orm.data.cif.pycifrw_from_cif(datablocks, loops={}, names=None)[source]

Constructs PyCifRW’s CifFile from an array of CIF datablocks.

Parameters:
  • datablocks – an array of CIF datablocks
  • loops – optional list of lists of CIF tag loops.
  • names – optional list of datablock names
Returns:

CifFile

aiida.orm.data.cif.symop_string_from_symop_matrix_tr(matrix, tr=(0, 0, 0), eps=0)[source]

Construct a CIF representation of symmetry operator plus translation. See International Tables for Crystallography Vol. A. (2002) for definition.

Parameters:
  • matrix – 3x3 matrix, representing the symmetry operator
  • tr – translation vector of length 3 (default 0)
  • eps – epsilon parameter for fuzzy comparison x == 0
Returns:

CIF representation of symmetry operator

Parameter
class aiida.orm.data.parameter.ParameterData(**kwargs)[source]

Pass a dictionary as input to the init, and it will get stored as internal attributes.

Usual rules for attribute names apply (in particular, keys cannot start with an underscore). If this is the case, a ValueError will be raised.

You can then change/delete/add more attributes before storing with the usual methods of aiida.orm.Node

dict

To be used to get direct access to the underlying dictionary with the syntax node.dict.key or node.dict[‘key’].

Returns:an instance of the AttributeResultManager.
get_dict()[source]

Return a dict with the parameters

keys()[source]

Iterator of valid keys stored in the ParameterData object

set_dict(dict)[source]

Replace the current dictionary with another one.

Parameters:dict – The dictionary to set.
update_dict(dict)[source]

Update the current dictionary with the keys provided in the dictionary.

Parameters:dict – a dictionary with the keys to substitute. It works like dict.update(), adding new keys and overwriting existing keys.
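
A minimal sketch (the dictionary keys are arbitrary examples):

from aiida.orm.data.parameter import ParameterData

parameters = ParameterData(dict={'ecutwfc': 30., 'conv_thr': 1.e-6})
print(parameters.get_dict()['ecutwfc'])     # 30.0
print(parameters.dict.conv_thr)             # attribute-style access to the same value
parameters.update_dict({'ecutwfc': 40.})    # allowed only before the node is stored
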
Remote
class aiida.orm.data.remote.RemoteData(**kwargs)[source]

Store a link to a file or folder on a remote machine.

Remember to pass a computer!

add_path(src_abs, dst_filename=None)[source]

Disable adding files or directories to a RemoteData

getfile(relpath, destpath)[source]

Connects to the remote folder and gets a string with the (full) content of the file.

Parameters:
  • relpath – The relative path of the file to show.
  • destpath – A path on the local computer to get the file
Returns:

a string with the file content

is_empty()[source]

Check if remote folder is empty

listdir(relpath='.')[source]

Connects to the remote folder and lists the directory content.

Parameters:relpath – If ‘relpath’ is specified, lists the content of the given subfolder.
Returns:a flat list of file/directory names (as strings).
listdir_withattributes(path='.')[source]

Connects to the remote folder and lists the directory content.

Parameters:path – If ‘path’ is specified, lists the content of the given subfolder.
Returns:a list of dictionaries, where the documentation is in :py:class:Transport.listdir_withattributes.
OrbitalData
class aiida.orm.data.orbital.OrbitalData(**kwargs)[source]

Used for storing collections of orbitals, as well as providing methods for accessing them internally.

clear_orbitals()[source]

Remove all orbitals that were added to the class. Cannot work if the OrbitalData has already been stored.

get_orbitals(with_tags=False, **kwargs)[source]

Returns all orbitals by default. If a site is provided, returns all orbitals corresponding to the location of that site; additional arguments may be provided, which act as filters on the retrieved orbitals.

Parameters:
  • site – if provided, returns all orbitals with position of site
  • with_tags – if provided returns all tags stored
Kwargs:

attributes than can filter the set of returned orbitals

Return list_of_outputs:
 

a list of orbitals and also tags if with_tags was set to True

set_orbitals(orbital, tag=None)[source]

Sets the orbitals into the database. Uses the orbital’s inherent set_orbital_dict method to generate an orbital dict string that is stored along with the tags, if provided.

Parameters:
  • orbital – an orbital or list of orbitals to be set
  • tag – a list of strings; it must have the same length as the list of orbitals
ArrayData
class aiida.orm.data.array.ArrayData(*args, **kwargs)[source]

Store a set of arrays on disk (rather than on the database) in an efficient way using numpy.save() (therefore, this class requires numpy to be installed).

Each array is stored within the Node folder as a different .npy file.

Note:Before storing, no caching is done: if you perform a get_array() call, the array will be re-read from disk. If instead the ArrayData node has already been stored, the array is cached in memory after the first read, and the cached array is used thereafter. If too much RAM is used, you can clear the cache with the clear_internal_cache() method.
arraynames()[source]

Return a list of all arrays stored in the node, listing the files (and not relying on the properties).

Deprecated since version 0.7: Use get_arraynames() instead.

clear_internal_cache()[source]

Clear the internal memory cache where the arrays are stored after being read from disk (used in order to reduce at minimum the readings from disk). This function is useful if you want to keep the node in memory, but you do not want to waste memory to cache the arrays in RAM.

delete_array(name)[source]

Delete an array from the node. Can only be called before storing.

Parameters:name – The name of the array to delete from the node.
get_array(name)[source]

Return an array stored in the node

Parameters:name – The name of the array to return.
get_arraynames()[source]

Return a list of all arrays stored in the node, listing the files (and not relying on the properties).

New in version 0.7: Renamed from arraynames

get_shape(name)[source]

Return the shape of an array (read from the value cached in the properties for efficiency reasons).

Parameters:name – The name of the array.
iterarrays()[source]

Iterator that returns tuples (name, array) for each array stored in the node.

set_array(name, array)[source]

Store a new numpy array inside the node. Possibly overwrite the array if it already existed.

Internally, it stores a name.npy file in numpy format.

Parameters:
  • name – The name of the array.
  • array – The numpy array to store.
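
A minimal sketch (the array name and shape are arbitrary; numpy is required):

import numpy
from aiida.orm.data.array import ArrayData

array_node = ArrayData()
array_node.set_array('forces', numpy.zeros((4, 3)))   # written as forces.npy in the node folder
print(array_node.get_shape('forces'))                 # (4, 3)
for name, array in array_node.iterarrays():
    print(name)                                       # 'forces'
array_node.delete_array('forces')                     # only possible before storing
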
ArrayData subclasses

The following are Data classes inheriting from ArrayData.

KpointsData

This module defines the classes related to band structures or dispersions in a Brillouin zone, and how to operate on them.

class aiida.orm.data.array.kpoints.KpointsData(*args, **kwargs)[source]

Class to handle arrays of kpoints in the Brillouin zone. Provides methods to generate either user-defined k-points or paths of k-points along symmetry lines. Internally, all k-points are defined in terms of crystal (fractional) coordinates. Cell and lattice vector coordinates are in Angstroms, reciprocal lattice vectors in Angstrom^-1.

Note

The methods setting and using the Bravais lattice info assume the PRIMITIVE unit cell is provided in input to the set_cell or set_cell_from_structure methods.

bravais_lattice

The dictionary containing information about the cell symmetry

cell

The crystal unit cell. Rows are the crystal vectors in Angstroms.

Returns:a 3x3 numpy.array

find_bravais_lattice(epsilon_length=1e-05, epsilon_angle=1e-05)[source]

Analyze the symmetry of the cell. Allows one to relax or tighten the thresholds used to compare angles and lengths of the cell, and saves the cell information for later use (such as getting special points). It has to be used if the user wants to be sure the right symmetries are recognized. Otherwise, this function is automatically called with the default values.

If the right symmetry is not found, be sure also you are providing cells with enough digits.

If node is already stored, just returns the symmetry found before storing (if any).

Return (str) lattice_name:
 the name of the bravais lattice and its variation, if any
get_desc()[source]

Returns a string with information retrieved from the kpoints node’s properties.

get_kpoints(also_weights=False, cartesian=False)[source]

Return the list of kpoints

Parameters:
  • also_weights – if True, returns also the list of weights. Default = False
  • cartesian – if True, returns points in cartesian coordinates, otherwise, returns in crystal coordinates. Default = False.
get_kpoints_mesh(print_list=False)[source]

Get the mesh of kpoints.

Parameters:print_list – default=False. If True, prints the mesh of kpoints as a list
Raises:AttributeError – if no mesh has been set
Return mesh,offset:
 (if print_list=False) a list of 3 integers and a list of three floats 0<x<1, representing the mesh and the offset of kpoints
Return kpoints:(if print_list = True) an explicit list of kpoints coordinates, similar to what returned by get_kpoints()
get_special_points(cartesian=False, epsilon_length=1e-05, epsilon_angle=1e-05)[source]

Get the special point and path of a given structure.

In 2D, coordinates are based on the paper: R. Ramirez and M. C. Bohm, Int. J. Quant. Chem., XXX, pp. 391-411 (1986)

In 3D, coordinates are based on the paper: arXiv:1004.2974, W. Setyawan, S. Curtarolo

Parameters:
  • cartesian – If true, returns points in cartesian coordinates. Crystal coordinates otherwise. Default=False
  • epsilon_length – threshold on lengths comparison, used to get the bravais lattice info
  • epsilon_angle – threshold on angles comparison, used to get the bravais lattice info
Return special_points,path:
 

special_points: a dictionary of point_name:point_coords key,values.

path: the suggested path which goes through all high symmetry lines. A list of lists for all path segments. e.g. [(‘G’,’X’),(‘X’,’M’),...] It’s not necessarily a continuous line.

Note:

We assume that the cell given by the cell property is the primitive unit cell

labels

Labels associated with the list of kpoints. List of tuples with kpoint index and kpoint name: [(0,’G’),(13,’M’),...]

pbc

The periodic boundary conditions along the vectors a1,a2,a3.

Returns:a tuple of three booleans, each one tells if there are periodic boundary conditions for the i-th real-space direction (i=1,2,3)
set_cell(cell, pbc=None)[source]

Set a cell to be used for symmetry analysis. To set a cell from an AiiDA structure, use “set_cell_from_structure”.

Parameters:
  • cell – 3x3 matrix of cell vectors. Orientation: each row represent a lattice vector. Units are Angstroms.
  • pbc – list of 3 booleans, True if in the nth crystal direction the structure is periodic. Default = [True,True,True]
set_cell_from_structure(structuredata)[source]

Set a cell to be used for symmetry analysis from an AiiDA structure. Inherits both the cell and the pbc’s. To set manually a cell, use “set_cell”

Parameters:structuredata – an instance of StructureData
set_kpoints(kpoints, cartesian=False, labels=None, weights=None, fill_values=0)[source]

Set the list of kpoints. If a mesh has already been stored, raise a ModificationNotAllowed

Parameters:
  • kpoints

    a list of kpoints, each kpoint being a list of one, two or three coordinates, depending on self.pbc: if structure is 1D (only one True in self.pbc) one allows singletons or scalars for each k-point, if it’s 2D it can be a length-2 list, and in all cases it can be a length-3 list. Examples:

    • [[0.,0.,0.],[0.1,0.1,0.1],...] for 1D, 2D or 3D
    • [[0.,0.],[0.1,0.1,],...] for 1D or 2D
    • [[0.],[0.1],...] for 1D
    • [0., 0.1, ...] for 1D (list of scalars)

    For 0D (all pbc are False), the list can be any of the above or empty - then only Gamma point is set. The value of k for the non-periodic dimension(s) is set by fill_values

  • cartesian – if True, the coordinates given in input are treated as in cartesian units. If False, the coordinates are crystal, i.e. in units of b1,b2,b3. Default = False
  • labels – optional, the list of labels to be set for some of the kpoints. See labels for more info
  • weights – optional, a list of floats with the weight associated to the kpoint list
  • fill_values – scalar to be set to all non-periodic dimensions (indicated by False in self.pbc), or list of values for each of the non-periodic dimensions.
set_kpoints_mesh(mesh, offset=[0.0, 0.0, 0.0])[source]

Set KpointsData to represent a uniformly spaced mesh of kpoints in the Brillouin zone. This excludes the possibility of setting/getting explicit lists of kpoints.

Parameters:
  • mesh – a list of three integers, representing the size of the kpoint mesh along b1,b2,b3.
  • offset – (optional) a list of three floats between 0 and 1. [0.,0.,0.] is Gamma centered mesh [0.5,0.5,0.5] is half shifted [1.,1.,1.] by periodicity should be equivalent to [0.,0.,0.] Default = [0.,0.,0.].
set_kpoints_mesh_from_density(distance, offset=[0.0, 0.0, 0.0], force_parity=False)[source]

Set a kpoints mesh using a kpoints density, expressed as the maximum distance between adjacent points along a reciprocal axis

Parameters:
  • distance – distance (in 1/Angstrom) between adjacent kpoints, i.e. the number of kpoints along each reciprocal axis i is |b_i|/distance where |b_i| is the norm of the reciprocal cell vector.
  • offset – (optional) a list of three floats between 0 and 1. [0.,0.,0.] is Gamma centered mesh [0.5,0.5,0.5] is half shifted Default = [0.,0.,0.].
  • force_parity – (optional) if True, force each integer in the mesh to be even (except for the non-periodic directions).
Note:

a cell should be defined first.

Note:

the number of kpoints along non-periodic axes is always 1.
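
A sketch of the two ways of defining a mesh (assuming `structure` is an existing StructureData holding the primitive cell; mesh size, offset and density are illustrative):

from aiida.orm.data.array.kpoints import KpointsData

kpoints = KpointsData()
kpoints.set_cell_from_structure(structure)          # inherits cell and pbc from the structure
kpoints.set_kpoints_mesh([4, 4, 4], offset=[0.5, 0.5, 0.5])
print(kpoints.get_kpoints_mesh())                   # ([4, 4, 4], [0.5, 0.5, 0.5])

# alternatively, derive the mesh from a target linear density (in 1/Angstrom):
# kpoints.set_kpoints_mesh_from_density(0.2, force_parity=True)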

set_kpoints_path(value=None, kpoint_distance=None, cartesian=False, epsilon_length=1e-05, epsilon_angle=1e-05)[source]

Set a path of kpoints in the Brillouin zone.

Parameters:
  • value

    description of the path, in various possible formats.

    None: automatically sets all irreducible high symmetry paths. Requires that a cell was set

    or

    [(‘G’,’M’), (...), ...] [(‘G’,’M’,30), (...), ...] [(‘G’,(0,0,0),’M’,(1,1,1)), (...), ...] [(‘G’,(0,0,0),’M’,(1,1,1),30), (...), ...]

  • cartesian (bool) – if set to True, any coordinates passed in value are read as cartesian coordinates. Default: False.
  • kpoint_distance (float) – parameter controlling the distance between kpoints. Distance is given in crystal coordinates, i.e. the distance is computed in the space of b1,b2,b3. The distance set will be the closest possible to this value, compatible with the requirement of putting equispaced points between two special points (since extrema are included).
  • epsilon_length (float) – threshold on lengths comparison, used to get the bravais lattice info. It has to be used if the user wants to be sure the right symmetries are recognized.
  • epsilon_angle (float) – threshold on angles comparison, used to get the bravais lattice info. It has to be used if the user wants to be sure the right symmetries are recognized.
BandsData

This module defines the classes related to band structures or dispersions in a Brillouin zone, and how to operate on them.

class aiida.orm.data.array.bands.BandsData(*args, **kwargs)[source]

Class to handle bands data

array_labels

Get the labels associated with the band arrays

get_bands(also_occupations=False, also_labels=False)[source]

Returns an array (nkpoints x num_bands or nspins x nkpoints x num_bands) of energies.

Parameters:also_occupations – if True, returns also the occupations array. Default = False

set_bands(bands, units=None, occupations=None, labels=None)[source]

Set an array of band energies of dimension (nkpoints x nbands). Kpoints must be set in advance. Can contain floats or None.

Parameters:
  • bands – a list of nkpoints lists of nbands bands, or a 2D array of shape (nkpoints x nbands), with band energies for each kpoint
  • units – optional, energy units
  • occupations – optional, a 2D list or array of floats of same shape as bands, with the occupation associated to each band

set_kpointsdata(kpointsdata)[source]

Load the kpoints from a kpoints object.

Parameters:kpointsdata – an instance of the KpointsData class

show_mpl(**kwargs)[source]

Call a show() command for the band structure using matplotlib. This uses internally the ‘mpl_singlefile’ format, with empty main_file_name.

Other kwargs are passed to self._exportstring.

units

Units in which the data in bands were stored. A string

aiida.orm.data.array.bands.find_bandgap(bandsdata, number_electrons=None, fermi_energy=None)[source]

Tries to guess whether the bandsdata represent an insulator. This method is meant to be used only for electronic bands (not phonons). By default, it will try to use the occupations to guess the number of electrons and find the Fermi energy; otherwise, it can be provided explicitly. Also, there is an implicit assumption that the kpoints grid is “sufficiently” dense, so that the bandsdata are not missing the intersection between valence and conduction band, if present. Use this function with care!

Parameters:
  • number_electrons – (optional, float) number of electrons in the unit cell
  • fermi_energy – (optional, float) value of the fermi energy.
Note:

By default, the algorithm uses the occupations array to guess the number of electrons and the occupied bands. This is to be used with care, because the occupations could be smeared, e.g. at a non-zero temperature, with the unwanted effect that the conduction bands might appear occupied in an insulator. Prefer to pass number_electrons explicitly

Note:

Only one between number_electrons and fermi_energy can be specified at the same time.

Returns:

(is_insulator, gap), where is_insulator is a boolean, and gap a float. The gap is None in case of a metal, zero when the homo is equal to the lumo (e.g. in semi-metals).
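
A minimal sketch with dummy numbers (two k-points, three bands; the energies are made up, so the returned gap only illustrates the call):

import numpy
from aiida.orm.data.array.kpoints import KpointsData
from aiida.orm.data.array.bands import BandsData, find_bandgap

kpts = KpointsData()
kpts.set_kpoints([[0., 0., 0.], [0.5, 0., 0.]])      # crystal coordinates
bands = BandsData()
bands.set_kpointsdata(kpts)                          # k-points must be set before the bands
bands.set_bands(numpy.array([[-5.0, 1.0, 2.0],
                             [-4.5, 1.5, 2.5]]), units='eV')
is_insulator, gap = find_bandgap(bands, fermi_energy=0.0)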

ProjectionData
class aiida.orm.data.array.projection.ProjectionData(*args, **kwargs)[source]

A class to handle arrays of projected wavefunction data, i.e. projections of orbitals, usually atomic-hydrogen-like orbitals, onto given Bloch wavefunctions, the Bloch wavefunctions being indexed by s, n, and k. E.g. the elements are the projections described as < orbital | Bloch wavefunction (s,n,k) >

get_pdos(**kwargs)[source]

Retrieves all the pdos arrays corresponding to the input kwargs

Parameters:kwargs – inputs describing the orbitals associated with the pdos arrays
Returns:a list of tuples containing the orbital, energy array and pdos array associated with all orbitals that correspond to kwargs
get_projections(**kwargs)[source]

Retrieves all the pdos arrays corresponding to the input kwargs

Parameters:kwargs – inputs describing the orbitals associated with the pdos arrays
Returns:a list of tuples containing the orbital, and projection arrays associated with all orbitals that correspond to kwargs
get_reference_bandsdata()[source]

Returns the reference BandsData, using the set uuid via set_reference_bandsdata

Returns:

a BandsData instance

Raises:
  • AttributeError – if the bandsdata has not been set yet
  • NotExistent – if the bandsdata uuid did not retrieve bandsdata
set_orbitals(**kwargs)[source]

This method is inherited from OrbitalData, but is blocked here. If used will raise a NotImplementedError

set_projectiondata(list_of_orbitals, list_of_projections=None, list_of_energy=None, list_of_pdos=None, tags=None, bands_check=True)[source]

Stores the projwfc_array using the projwfc_label, after validating both.

Parameters:
  • list_of_orbitals – list of orbitals, of class orbital data. They should be the orbitals to which the projection arrays correspond.
  • list_of_projections – list of arrays of projections of atomic wavefunctions onto Bloch wavefunctions. Since there is a projection for every Bloch wavefunction, which can be specified by its spin (if used), band and kpoint, the dimensions of the projection array must be nspin x nbands x nkpoints, or nbands x nkpoints if spin is not used.
  • list_of_energy – list of energy axes for the list_of_pdos
  • list_of_pdos – a list of projected densities of states for the atomic wavefunctions, units in states/eV
  • tags – a list of tags; currently not supported.
  • bands_check – if False, skips the checks of whether the bands have already been set and whether the sizes match. For use in parsers, where the BandsData has not yet been stored and therefore get_reference_bandsdata cannot be called
set_reference_bandsdata(value)[source]

Sets a reference BandsData: creates a uuid link between this data object and a BandsData object. Must be set before any projection arrays.

Parameters:value – a BandsData instance, a uuid or a pk
Raise:NotExistent if there was no BandsData associated with uuid or pk
TrajectoryData
class aiida.orm.data.array.trajectory.TrajectoryData(*args, **kwargs)[source]

Stores a trajectory (a sequence of crystal structures with timestamps, and possibly with velocities).

get_cells()[source]

Return the array of cells, if it has already been set.

Raises:KeyError – if the trajectory has not been set yet.
get_index_from_stepid(stepid)[source]

Given a value for the stepid (i.e., a value among those of the steps array), return the array index of that stepid, that can be used in other methods such as get_step_data() or get_step_structure().

New in version 0.7: Renamed from get_step_index

Note

Note that this function returns the first index found (i.e. if multiple steps are present with the same value, only the index of the first one is returned).

Raises:ValueError – if no step with the given value is found.
get_positions()[source]

Return the array of positions, if it has already been set.

Raises:KeyError – if the trajectory has not been set yet.
get_step_data(index)[source]

Return a tuple with all information concerning the stepid with given index (0 is the first step, 1 the second step and so on). If you know only the step value, use the get_index_from_stepid() method to get the corresponding index.

If no velocities were specified, None is returned as the last element.

Returns:

A tuple in the format (stepid, time, cell, symbols, positions, velocities), where stepid is an integer, time is a float, cell is a 3 x 3 matrix, symbols is an array of length n, positions is an n x 3 array, and velocities is either None or an n x 3 array

Parameters:

index – The index of the step that you want to retrieve, from 0 to self.numsteps - 1.

Raises:
  • IndexError – if you require an index beyond the limits.
  • KeyError – if you did not store the trajectory yet.
get_step_index(step)[source]

Deprecated since version 0.7: Use get_index_from_stepid() instead.

get_step_structure(index, custom_kinds=None)[source]

Return an AiiDA aiida.orm.data.structure.StructureData node (not stored yet!) with the coordinates of the given step, identified by its index. If you know only the step value, use the get_index_from_stepid() method to get the corresponding index.

Note

The periodic boundary conditions are always set to True.

New in version 0.7: Renamed from step_to_structure

Parameters:
  • index – The index of the step that you want to retrieve, from 0 to self.numsteps - 1.
  • custom_kinds – (Optional) If passed must be a list of aiida.orm.data.structure.Kind objects. There must be one kind object for each different string in the symbols array, with kind.name set to this string. If this parameter is omitted, the automatic kind generation of AiiDA aiida.orm.data.structure.StructureData nodes is used, meaning that the strings in the symbols array must be valid chemical symbols.
get_stepids()[source]

Return the array of steps, if it has already been set.

New in version 0.7: Renamed from get_steps

Raises:KeyError – if the trajectory has not been set yet.
get_steps()[source]

Deprecated since version 0.7: Use get_stepids() instead.

get_symbols()[source]

Return the array of symbols, if it has already been set.

Raises:KeyError – if the trajectory has not been set yet.
get_times()[source]

Return the array of times (in ps), if it has already been set.

Raises:KeyError – if the trajectory has not been set yet.
get_velocities()[source]

Return the array of velocities, if it has already been set.

Note

This function, differently from all other get_* functions, will not raise an exception if the velocities are not set, but rather return None (both if no trajectory has been set yet, and if the trajectory has been set but no velocities were specified).

numsites

Return the number of stored sites, or zero if nothing has been stored yet.

numsteps

Return the number of stored steps, or zero if nothing has been stored yet.

set_structurelist(structurelist)[source]

Create trajectory from the list of aiida.orm.data.structure.StructureData instances.

Parameters:structurelist – a list of aiida.orm.data.structure.StructureData instances.
Raises:ValueError – if symbol lists of supplied structures are different
set_trajectory(stepids, cells, symbols, positions, times=None, velocities=None)[source]

Store the whole trajectory, after checking that types and dimensions are correct. Velocities are optional; if they are not passed, nothing is stored for them. A minimal usage sketch is given below.

Parameters:
  • stepids – integer array with dimension s, where s is the number of steps. Typically represents an internal counter within the code. For instance, if you want to store a trajectory with one step every 10, starting from step 65, the array will be [65,75,85,...]. No checks are done on duplicate elements or on the ordering, but this array should anyway be sorted in ascending order, without duplicate elements. If your code does not provide an internal counter, just provide for instance arange(s). It is internally stored as an array named ‘steps’.
  • cells – float array with dimension s x 3 x 3, where s is the length of the stepids array. Units are angstrom. In particular, cells[i,j,k] is the k-th component of the j-th cell vector at the time step with index i (identified by step number stepids[i] and with timestamp times[i]).
  • symbols – string array with dimension n, where n is the number of atoms (i.e., sites) in the structure. The same array is used for each step. Normally, the string should be a valid chemical symbol, but actually any unique string works and can be used as the name of the atomic kind (see also the get_step_structure() method).
  • positions – float array with dimension s x n x 3, where s is the length of the stepids array and n is the length of the symbols array. Units are angstrom. In particular, positions[i,j,k] is the k-th component of the j-th atom (or site) in the structure at the time step with index i (identified by step number stepids[i] and with timestamp times[i]).
  • times – if specified, float array with dimension s, where s is the length of the stepids array. Contains the timestamp of each step in picoseconds (ps).
  • velocities – if specified, must be a float array with the same dimensions as the positions array. The array contains the velocities of the atoms.

Todo

Choose suitable units for velocities
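
A minimal sketch, using made-up numpy arrays that follow the shapes documented above (s steps, n sites):

import numpy as np
from aiida.orm.data.array.trajectory import TrajectoryData

s, n = 3, 2                                    # number of steps and of sites
traj = TrajectoryData()
traj.set_trajectory(
    stepids=np.arange(s),                      # internal step counter
    cells=np.array([np.eye(3) * 4.] * s),      # s x 3 x 3, in angstrom
    symbols=np.array(['Ba', 'Ti']),            # one entry per site
    positions=np.zeros((s, n, 3)),             # s x n x 3, in angstrom
    times=np.linspace(0., 0.2, s),             # optional, in ps
    velocities=np.zeros((s, n, 3)),            # optional, same shape as positions
)
structure = traj.get_step_structure(traj.numsteps - 1)   # StructureData, not stored yet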

show_mpl_pos(**kwargs)[source]

Shows the positions as a function of time, separately for the X, Y and Z coordinates.

Parameters:
  • stepsize (int) – The stepsize for the trajectory, set higher than 1 to reduce number of points
  • mintime (int) – Time to start from
  • maxtime (int) – Maximum time
  • elements (list) – A list of atomic symbols that should be displayed. If not specified, all atoms are displayed.
  • indices (list) – A list of indices of the atoms to be displayed. If not specified, all atoms of the correct species are displayed.
  • dont_block (bool) – If True, interpreter is not blocked when figure is displayed.
Todo:

save to file?

step_to_structure(index, custom_kinds=None)[source]

Deprecated since version 0.7: Use get_step_structure() instead.

XyData

This module defines the classes related to Xy data. That is data that contains collections of y-arrays bound to a single x-array, and the methods to operate on them.

class aiida.orm.data.array.xy.XyData(*args, **kwargs)[source]

A subclass designed to handle arrays that have an “XY” relationship to each other. That is there is one array, the X array, and there are several Y arrays, which can be considered functions of X.

get_x()[source]

Tries to retrieve the x array and x name; raises a NotExistent exception if no x array has been set yet. :return x_name: the name set for the x_array :return x_array: the x array set earlier :return x_units: the x units set earlier

get_y()[source]

Tries to retrieve the y arrays and the y names; raises a NotExistent exception if they have not been set yet or cannot be retrieved. :return y_names: list of strings naming the y_arrays :return y_arrays: list of y_arrays :return y_units: list of strings giving the units for the y_arrays

set_x(x_array, x_name, x_units)[source]

Sets the array and the name for the x values.

Parameters:
  • x_array – A numpy.ndarray, containing only floats
  • x_name – a string for the x array name
  • x_units – the units of x
set_y(y_arrays, y_names, y_units)[source]

Set array(s) for the y part of the dataset. Also checks that the x_array has already been set and that the shapes of the y_arrays agree with that of the x_array. :param y_arrays: A list of y_arrays, numpy.ndarray :param y_names: A list of strings giving the names of the y_arrays :param y_units: A list of strings giving the units of the y_arrays
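
A minimal sketch with made-up arrays; note that the x array must be set before the y arrays:

import numpy as np
from aiida.orm.data.array.xy import XyData

xy = XyData()
x = np.linspace(0., 10., 50)
xy.set_x(x, 'time', 'ps')
xy.set_y([np.sin(x), np.cos(x)], ['sin', 'cos'], ['arb. units', 'arb. units'])

x_name, x_array, x_units = xy.get_x()
y_data = xy.get_y()      # the y names, arrays and units set above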

aiida.orm.data.array.xy.check_convert_single_to_tuple(item)[source]

Checks if the item is a list or tuple; if it is not, wraps it in a list.

Parameters:item – an object which may or may not be a list or tuple
Returns:item_list: the input item unchanged if list or tuple and [item] otherwise
Base data types
class aiida.orm.data.base.BaseType(*args, **kwargs)[source]

Store a base python type as an AiiDA node in the DB.

Provide the .value property to get the actual value.

class aiida.orm.data.base.Bool(*args, **kwargs)[source]

Class to store booleans as AiiDA nodes

class aiida.orm.data.base.Float(*args, **kwargs)[source]

Class to store float numbers as AiiDA nodes

class aiida.orm.data.base.Int(*args, **kwargs)[source]

Class to store integer numbers as AiiDA nodes

class aiida.orm.data.base.List(**kwargs)[source]

Class to store python lists as AiiDA nodes

class aiida.orm.data.base.NumericType(*args, **kwargs)[source]

Specific subclass of BaseType to store numbers, overloading common operators (+, *, ...)

class aiida.orm.data.base.Str(*args, **kwargs)[source]

Class to store strings as AiiDA nodes

aiida.orm.data.base.get_false_node()[source]

Return a Bool Data node, with value False

Cannot be done as a singleton in the module, because it would be generated at import time, with the risk that (e.g. in the tests, or at the very first use of AiiDA) a user is not yet defined in the DB (but a user is mandatory in the DB before you can create new Nodes in AiiDA).

aiida.orm.data.base.get_true_node()[source]

Return a Bool Data node, with value True

Cannot be done as a singleton in the module, because it would be generated at import time, with the risk that (e.g. in the tests, or at the very first use of AiiDA) a user is not yet defined in the DB (but a user is mandatory in the DB before you can create new Nodes in AiiDA).

aiida.orm.data.base.to_aiida_type(*args, **kw)[source]

Turns basic Python types (str, int, float, bool) into the corresponding AiiDA types.
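
A minimal sketch of how the base types behave: each wraps a plain python value, exposed through the .value property, and can be persisted with .store():

from aiida.orm.data.base import Bool, Float, Int, Str, to_aiida_type

n = Int(3)
print(n.value + 2)           # -> 5; NumericType also overloads +, *, ...

pi = to_aiida_type(3.14)     # returns a Float node
flag = Bool(True)
label = Str('fcc')
label.store()                # persist the node in the database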

ORM documentation: Calculations

Note

This list only includes the classes included in AiiDA-core. For all the plugin subclasses, check the corresponding plugin repositories.

class aiida.orm.implementation.general.calculation.AbstractCalculation[source]

This class provides the definition of an “abstract” AiiDA calculation. A calculation in this sense is any computation that converts data into data.

You will typically use one of its subclasses, often a JobCalculation for calculations run via a scheduler.

Add a link with a code as destination.

You can use the parameters of the base Node class, in particular the label parameter to label the link.

Parameters:
  • src – a node of the database. It cannot be a Calculation object.
  • label (str) – Name of the link. Default=None
  • link_type – The type of link, must be one of the enum values from LinkType
get_code()[source]

Return the code for this calculation, or None if the code was not set.

get_linkname(link, *args, **kwargs)[source]

Return the linkname used for a given input link

Pass as parameter “NAME” if you would call the use_NAME method. If the use_NAME method requires a further parameter, pass that parameter as the second parameter.

logger

Get the logger of the Calculation object, so that it also logs to the DB.

Returns:LoggerAdapter object, that works like a logger, but also has the ‘extra’ embedded
class aiida.orm.calculation.inline.InlineCalculation(**kwargs)[source]

Here I put all the attributes/methods that are common to all backends

get_desc()[source]

Returns a string with information retrieved from an InlineCalculation node’s properties. :return: description string

aiida.orm.calculation.inline.optional_inline(func)[source]

The optional_inline wrapper/decorator takes a function, which can be called either wrapped in an InlineCalculation or as a simple function, depending on the ‘store’ keyword argument (True stands for InlineCalculation, False for a simple function). The wrapped function has to adhere to the requirements of the make_inline wrapper/decorator.

Usage example:

@optional_inline
def copy_inline(source=None):
  return {'copy': source.copy()}

Function copy_inline will be wrapped in an InlineCalculation when invoked in the following way:

copy_inline(source=node,store=True)

while it will be called as a simple function when invoked:

copy_inline(source=node)

In either case, copy_inline will return the same results.

class aiida.orm.implementation.general.calculation.job.AbstractJobCalculation[source]

This class provides the definition of an AiiDA calculation that is run remotely on a job scheduler.

Add a link with a code as destination. Add the additional constraint that this is only possible if the calculation is in state NEW.

You can use the parameters of the base Node class, in particular the label parameter to label the link.

Parameters:
  • src – a node of the database. It cannot be a Calculation object.
  • label (str) – Name of the link. Default=None
  • link_type – The type of link, must be one of the enum values from LinkType
get_append_text()[source]

Get the calculation-specific append text, which is going to be appended in the scheduler-job script, just after the code execution.

get_custom_scheduler_commands()[source]

Return a (possibly multiline) string with the commands that the user wants to manually set for the scheduler. See also the documentation of the corresponding set_ method.

Returns:the custom scheduler command, or an empty string if no custom command was defined.
get_desc()[source]

Returns a string with information retrieved from a JobCalculation node’s properties.

get_environment_variables()[source]

Return a dictionary of the environment variables that are set for this calculation.

Return an empty dictionary if no special environment variables have to be set for this calculation.

get_import_sys_environment()[source]

Check whether the system environment will be loaded in the submission script.

Returns:a boolean. If True, the system environment will be loaded.
get_job_id()[source]

Get the scheduler job id of the calculation.

Returns:a string
get_max_memory_kb()[source]

Get the memory (in kilobytes) requested from the scheduler.

Returns:an integer
get_max_wallclock_seconds()[source]

Get the maximum wallclock time in seconds requested from the scheduler.

Returns:an integer
Return type:int
get_mpirun_extra_params()[source]

Return a list of strings, that are the extra params to pass to the mpirun (or equivalent) command after the one provided in computer.mpirun_command. Example: mpirun -np 8 extra_params[0] extra_params[1] ... exec.x

Return an empty list if no parameters have been defined.

get_parser_name()[source]

Return a string locating the module that contains the output parser of this calculation, which will be searched for in the ‘aiida/parsers/plugins’ directory. None if no parser is needed/set.

Returns:a string.
get_parserclass()[source]

Return the output parser object for this calculation, or None if no parser is set.

Returns:a Parser class.
Raise:MissingPluginError from ParserFactory if no plugin is found.
get_prepend_text()[source]

Get the calculation-specific prepend text, which is going to be prepended in the scheduler-job script, just before the code execution.

get_priority()[source]

Get the priority, if set, of the job on the cluster.

Returns:a string or None
get_queue_name()[source]

Get the name of the queue on cluster.

Returns:a string or None.
get_resources(full=False)[source]

Returns the dictionary of the job resources set.

Parameters:full – if True, also add the default values, e.g. default_mpiprocs_per_machine
Returns:a dictionary
get_retrieved_node()[source]

Return the retrieved data folder, if present.

Returns:the retrieved data folder object, or None if no such output node is found.
Raises:MultipleObjectsError – if more than one output node is found.
get_scheduler_error()[source]

Return the error output of the scheduler (a string) if the calculation has finished, an output node is present, and the scheduler output was retrieved.

Return None otherwise.

get_scheduler_output()[source]

Return the standard output of the scheduler (a string) if the calculation has finished, an output node is present, and the scheduler output was retrieved.

Return None otherwise.

get_scheduler_state()[source]

Return the status of the calculation according to the cluster scheduler.

Returns:a string.
get_state(from_attribute=False)[source]

Get the state of the calculation.

Note

the ‘most recent’ state is obtained using the logic in the aiida.common.datastructures.sort_states function.

Todo

Understand if the state returned when no state entry is found in the DB is the best choice.

Parameters:from_attribute – if set to True, read it from the attributes (the attribute is also set with set_state, unless the state is set to IMPORTED; in this way we can also see the state before storing).
Returns:a string. If from_attribute is True and no attribute is found, return None. If from_attribute is False and no entry is found in the DB, also return None.
get_withmpi()[source]

Get whether the job is set with mpi execution.

Returns:a boolean. Default=True.
has_failed()[source]

Get whether the calculation is in a failed status, i.e. SUBMISSIONFAILED, RETRIEVALFAILED, PARSINGFAILED or FAILED.

Returns:a boolean
has_finished()[source]

Determine if the calculation is finished for whatever reason. This may be because it finished successfully or because of a failure.

Returns:True if the job has finished running, False otherwise.
Return type:bool
has_finished_ok()[source]

Get whether the calculation is in the FINISHED status.

Returns:a boolean
kill()[source]

Kill a calculation on the cluster.

Can only be called if the calculation is in status WITHSCHEDULER.

The command tries to run the kill command as provided by the scheduler, and raises an exception if something goes wrong. No changes of calculation status are done (they will be done later by the calculation manager).

res

To be used to get direct access to the parsed parameters.

Returns:an instance of the CalculationResultManager.
Note:a practical example on how it is meant to be used: let’s say that there is a key ‘energy’ in the dictionary of the parsed results which contains a list of floats. The command calc.res.energy will return such a list.
set_append_text(val)[source]

Set the calculation-specific append text, which is going to be appended in the scheduler-job script, just after the code execution.

Parameters:val – a (possibly multiline) string
set_custom_scheduler_commands(val)[source]

Set a (possibly multiline) string with the commands that the user wants to manually set for the scheduler.

The difference of this method with respect to the set_prepend_text is the position in the scheduler submission file where such text is inserted: with this method, the string is inserted before any non-scheduler command.

set_environment_variables(env_vars_dict)[source]

Set a dictionary of custom environment variables for this calculation.

Both keys and values must be strings.

In the remote-computer submission script, it’s going to export variables as export 'keys'='values'

set_import_sys_environment(val)[source]

If set to true, the submission script will load the system environment variables.

Parameters:val (bool) – load the environment if True
set_max_memory_kb(val)[source]

Set the maximum memory (in kilobytes) to be requested from the scheduler.

Parameters:val – an integer. Default=None
set_max_wallclock_seconds(val)[source]

Set the maximum wallclock time in seconds to be requested from the scheduler.

Parameters:val – An integer. Default=None
set_mpirun_extra_params(extra_params)[source]

Set the extra params to pass to the mpirun (or equivalent) command after the one provided in computer.mpirun_command. Example: mpirun -np 8 extra_params[0] extra_params[1] ... exec.x

Parameters:extra_params – must be a list of strings, one for each extra parameter
set_parser_name(parser)[source]

Set a string for the output parser. Can be None if no output plugin is available or needed.

Parameters:parser – a string identifying the module of the parser. Such module must be located within the folder ‘aiida/parsers/plugins’
set_prepend_text(val)[source]

Set the calculation-specific prepend text, which is going to be prepended in the scheduler-job script, just before the code execution.

See also set_custom_scheduler_commands

Parameters:val – a (possibly multiline) string
set_priority(val)[source]

Set the priority of the job to be queued.

Parameters:val – the values of priority as accepted by the cluster scheduler.
set_queue_name(val)[source]

Set the name of the queue on the remote computer.

Parameters:val (str) – the queue name
set_resources(resources_dict)[source]

Set the dictionary of resources to be used by the scheduler plugin, like the number of nodes, cpus, ... This dictionary is scheduler-plugin dependent. Look at the documentation of the scheduler. (scheduler type can be found with calc.get_computer().get_scheduler_type() )

set_withmpi(val)[source]

Set the calculation to use mpi.

Parameters:val – A boolean. Default=True
store(*args, **kwargs)[source]

Override the store() method to store also the calculation in the NEW state as soon as this is stored for the first time.

submit()[source]

Puts the calculation in the TOSUBMIT status.

Actual submission is performed by the daemon.

submit_test(folder=None, subfolder_name=None)[source]

Test submission, creating the files in a local folder.

Note:

this submit_test function does not require any node (neither the calculation nor the input links) to be stored yet.

Parameters:
  • folder – A Folder object, within which the calculation files are created; if not passed, a subfolder ‘submit_test’ of the current folder is used.
  • subfolder_name – the name of the subfolder to use for this calculation (within Folder). If not passed, a unique string starting with the date and time in the format yymmdd-HHMMSS- is used.
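
A hedged sketch of how the setters above are typically combined when preparing a job calculation. It assumes code is an existing Code node (e.g. obtained with Code.get_from_string('mycode@mycluster')); the queue name, scheduler directive and resource keys are illustrative, since the valid resource keys depend on the scheduler plugin:

calc = code.new_calc()
calc.set_resources({'num_machines': 1, 'num_mpiprocs_per_machine': 8})
calc.set_max_wallclock_seconds(30 * 60)                 # 30 minutes
calc.set_queue_name('debug')                            # illustrative queue name
calc.set_withmpi(True)
calc.set_custom_scheduler_commands('#SBATCH --account=my_account')   # illustrative

calc.store_all()
calc.submit()    # puts the calculation in the TOSUBMIT state; the daemon performs the submission
# once the job has finished and been parsed, results are available through calc.res, e.g. calc.res.energy
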
class aiida.orm.implementation.general.calculation.job.CalculationResultManager(calc)[source]

An object used internally to interface the calculation object with the Parser and consequently with the ParameterData object result. It shouldn’t be used explicitly by a user.

__init__(calc)[source]
Parameters:calc – the calculation object.
class aiida.orm.implementation.general.calculation.inline.InlineCalculation(**kwargs)[source]

Subclass used for calculations that are automatically generated using the make_inline wrapper/decorator.

This is used to automatically create a calculation node for a simple calculation

get_function_name()[source]

Get the function name.

Returns:a string
aiida.orm.implementation.general.calculation.inline.make_inline(func)[source]

This make_inline wrapper/decorator takes a function with specific requirements, runs it and stores the result as an InlineCalculation node. It will also store all other nodes, including any possibly unstored input node! The return value of the wrapped calculation will also be slightly changed, see below.

The wrapper:

  • checks that the function name ends with the string '_inline'
  • checks that each input parameter is a valid Data node (can be stored or unstored)
  • runs the actual function
  • gets the result values
  • checks that the result value is a dictionary, where the keys are all strings and the values are all unstored data nodes
  • creates an InlineCalculation node, links all the kwargs as inputs and the returned nodes as outputs, using the keys as link labels
  • stores all the nodes (including, possibly, unstored input nodes given as kwargs)
  • returns a length-two tuple, where the first element is the InlineCalculation node, and the second is the dictionary returned by the wrapped function

To use this function, you can use it as a decorator of a wrapped function:

@make_inline
def copy_inline(source):
    return {'copy': source.copy()}

In this way, every time you call copy_inline, the wrapped version is actually called, and the return value will be a tuple with the InlineCalculation instance, and the returned dictionary. For instance, if s is a valid Data node, with the following lines:

c, s_copy_dict = copy_inline(source=s)
s_copy = s_copy_dict['copy']

c will contain the new InlineCalculation instance, s_copy the (stored) copy of s (with the side effect that, if s was not stored, after the function call it will be automatically stored).

Note:If you use a wrapper, make sure to write explicitly in the docstrings that the function is going to store the nodes.

The second possibility, if you want the function not to store anything by default, but to be wrappable when necessary, is the following. You simply define the function you want to wrap (copy_inline in the example above) without the decorator:

def copy_inline(source):
   return {'copy': source.copy()}

This is a normal function, so to call it you will normally do:

s_copy_dict = copy_inline(s)

while if you want to wrap it, so that an InlineCalculation is created, and everything is stored, you will run:

c, s_copy_dict = make_inline(copy_inline)(source=s)

Note that, with the wrapper, all the parameters to copy_inline() have to be passed as keyword arguments. Moreover, the return value is different, i.e. (c, s_copy_dict) instead of simply s_copy_dict.

Note

EXTREMELY IMPORTANT! The wrapped function MUST have the following requirements in order to be reproducible. These requirements cannot be enforced, but must be followed when writing the wrapped function.

  • The function MUST NOT USE information that is not passed in the kwargs. In particular, it cannot read files from the hard-drive (that will not be present in another user’s computer), it cannot connect to external databases and retrieve the current entries in that database (that could change over time), etc.
  • The only exception to the above rule is the access to the AiiDA database for the parents of the input nodes. That is, you can take the input nodes passed as kwargs, and use also the data given in their inputs, the inputs of their inputs, ... but you CANNOT use any output of any of the above-mentioned nodes (that could change over time).
  • The function MUST NOT have side effects (creating files on the disk, adding entries to an external database, ...).

Note

The function will also store:

  • the source of the function in an attribute “source_code”, and the first line at which the function appears (attribute “first_line_source_code”), as returned by inspect.getsourcelines;
  • the full source file in “source_file”, if it is possible to retrieve it (this will be set to None otherwise, e.g. if the function was defined in the interactive shell).

For this reason, try to keep, if possible, all the code to be run within the same file, so that it is possible to keep the provenance of the functions that were run (if you instead call a function in a different file, you will never know in the future what that function did). If you call external modules and you care about provenance, it would be good to also return in a suitable dictionary the versions of these modules (e.g., after importing a module XXX, you can check whether the module defines a variable XXX.__version__ or XXX.VERSION or something similar, and store it in an output node).

Todo:

For the time being, I am storing the function source code and the full source code file in the attributes of the calculation. To be moved to an input Code node!

Note:

All nodes will be stored, including unstored input nodes!!

Parameters:

kwargs – all kwargs are passed to the wrapped function

Returns:

a length-two tuple, where the first element is the InlineCalculation node, and the second is the dictionary returned by the wrapped function. All nodes are stored.

Raises:
  • TypeError – if the return value is not a dictionary, the keys are not strings, or the values are not data nodes. Raise also if the input values are not data nodes.
  • ModificationNotAllowed – if the returned Data nodes are already stored.
  • Exception – All other exceptions from the wrapped function are not caught.
TemplateReplacer
class aiida.orm.calculation.job.simpleplugins.templatereplacer.TemplatereplacerCalculation(**kwargs)[source]

Simple stub of a plugin that can be used to replace some text in a given template. Can be used for many different codes, or as a starting point to develop a new plugin.

This simple plugin takes two node inputs, both of type ParameterData, with the labels ‘parameters’ and ‘template’

You can also add other SinglefileData nodes as input, that will be copied according to what is written in ‘template’ (see below).

  • parameters: a set of parameters that will be used for substitution.

  • template: can contain the following parameters:

    • input_file_template: a string with substitutions to be managed with the format() function of python, i.e. if you want to substitute a variable called ‘varname’, you write {varname} in the text. See http://www.python.org/dev/peps/pep-3101/ for more details. The replaced file will be the input file.
    • input_file_name: a string with the file name for the input. If it is not provided, no file will be created.
    • output_file_name: a string with the file name for the output. If it is not provided, no redirection will be done and the output will go in the scheduler output file.
    • cmdline_params: a list of strings, to be passed as command line parameters. Each one is substituted with the same rule of input_file_template. Optional
    • input_through_stdin: if True, the input file name is passed via stdin. Default is False if missing.
    • files_to_copy: if defined, a list of tuple pairs, with format (‘link_name’, ‘dest_rel_path’);
      for each tuple, an input link to this calculation is looked for, with link labeled ‘link_name’ and with file type ‘Singlefile’, and its content is copied to a remote file named ‘dest_rel_path’. Errors are raised if the input links are non-existent, of the wrong type, or if there are unused input files.
    • retrieve_temporary_files: a list of relative filepaths that, if defined, will be retrieved and temporarily stored in an unstored FolderData node, available during the Parser.parse_with_retrieved call under the key specified by the Parser.retrieved_temporary_folder key

TODO: probably use Python’s Template strings instead?? TODO: catch exceptions TODO: write a ‘input_type_checker’ routine to automatically check the existence and type of inputs + default values etc.
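
A hedged sketch of how the ‘template’ and ‘parameters’ inputs described above could be prepared; the code label ('bash@localhost') and the template content are purely illustrative:

from aiida.orm import Code, DataFactory

ParameterData = DataFactory('parameter')

code = Code.get_from_string('bash@localhost')    # a Code configured for this plugin
calc = code.new_calc()
calc.set_resources({'num_machines': 1, 'num_mpiprocs_per_machine': 1})

calc.use_template(ParameterData(dict={
    'input_file_template': 'echo {greeting}',    # {greeting} is replaced via format()
    'input_file_name': 'input.sh',
    'cmdline_params': ['input.sh'],
    'output_file_name': 'output.txt',
}))
calc.use_parameters(ParameterData(dict={'greeting': 'hello world'}))

calc.store_all()
calc.submit()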

Calculation parsers

This section describes the different parsers classes for calculations.

Note

This list only includes the classes included in AiiDA-core. For all the plugin subclasses, check the corresponding plugin repositories.

Automodules Example

This module defines the main data structures used by the Calculation.

class aiida.common.datastructures.CalcInfo(init=None)[source]

This object will store the data returned by the calculation plugin and to be passed to the ExecManager.

In the following descriptions all paths have to be considered relative

  • retrieve_list: a list of strings or tuples that indicate files that are to be retrieved from the remote

    after the calculation has finished and stored in the repository in a FolderData. If the entry in the list is just a string, it is assumed to be the filepath on the remote and it will be copied to ‘.’ of the repository with name os.path.split(item)[1]. If the entry is a tuple, it is expected to have the following format

    (‘remotepath’, ‘localpath’, depth)

    If the ‘remotepath’ is a file or folder, it will be copied in the repository to ‘localpath’. However, if the ‘remotepath’ contains file patterns with wildcards, the ‘localpath’ should be set to ‘.’ and the depth parameter should be an integer that decides the localname. The ‘remotepath’ will be split on file separators and the local filename will be determined by joining the N last elements, where N is given by the depth variable.

    Example: (‘some/remote/path/files/pattern*[0-9].xml’, ‘.’, 2)

    Will result in all files that match the pattern being copied to the local repository with path

    ‘files/pattern*[0-9].xml’

  • retrieve_temporary_list: a list of strings or tuples that indicate files that will be retrieved

    and stored temporarily in a FolderData, that will be available only during the parsing call. The format of the list is the same as that of ‘retrieve_list’

  • retrieve_singlefile_list: a list of tuples with format

    (‘linkname_from calc to singlefile’, ‘subclass of singlefile’, ‘filename’) Each tuple represents a file that will be retrieved from cluster and saved in SinglefileData nodes

  • local_copy_list: a list of tuples with format (‘localabspath’, ‘relativedestpath’)

  • remote_copy_list: a list of tuples with format (‘remotemachinename’, ‘remoteabspath’, ‘relativedestpath’)

  • remote_symlink_list: a list of tuples with format (‘remotemachinename’, ‘remoteabspath’, ‘relativedestpath’)

  • codes_info: a list of dictionaries used to pass the info of the execution of a code

  • codes_run_mode: a string used to specify the order in which multi codes can be executed

class aiida.common.datastructures.CodeInfo(init=None)[source]

This attribute-dictionary contains the information needed to execute a code. Possible attributes are:

  • cmdline_params: a list of strings, containing parameters to be written on the command line right after the call to the code, as for example:

    code.x cmdline_params[0] cmdline_params[1] ... < stdin > stdout
    
  • stdin_name: (optional) the name of the standard input file. Note, it is only possible to use the stdin with the syntax:

    code.x < stdin_name
    

    If no stdin_name is specified, the string “< stdin_name” will not be passed to the code. Note: it is not possible to substitute/remove the ‘<’ if stdin_name is specified; if that is needed, avoid stdin_name and use instead the cmdline_params to specify a suitable syntax.

  • stdout_name: (optional) the name of the standard output file. Note, it is only possible to pass output to stdout_name with the syntax:

    code.x ... > stdout_name
    

    If no stdout_name is specified, the string “> stdout_name” will not be passed to the code. Note: it is not possible to substitute/remove the ‘>’ if stdout_name is specified; if that is needed, avoid stdout_name and use instead the cmdline_params to specify a suitable syntax.

  • stderr_name: (optional) a string, the name of the error file of the code.

  • join_files: (optional) if True, redirects the error to the output file. If join_files=True, the code will be called as:

    code.x ... > stdout_name 2>&1
    

    otherwise, if join_files=False and stderr is passed:

    code.x ... > stdout_name 2> stderr_name
    
  • withmpi: if True, executes the code with mpirun (or another MPI installed on the remote computer)

  • code_uuid: the uuid of the code associated to the CodeInfo

aiida.common.datastructures.sort_states(list_states, use_key=False)[source]

Given a list of state names, return a sorted list of states (the first is the most recent) sorted according to their logical appearance in the DB (i.e., NEW before SUBMITTING before FINISHED).

Note

The order of the internal variable _sorted_datastates is used.

Parameters:
  • list_states – a list (or tuple) of state strings.
  • use_key – if True, expects that each element is not just a string, but a pair (someobject, string). Only string is used to sort (should be the state string), and only someobject is returned in the final list.
Returns:

a sorted list of the given data states.

Raises:

ValueError – if any of the given states is not a valid state.

Note

A :noindex: directive was added to avoid duplicate object description for this example. Do not put the keyword in a real documentation.

How To Format Docstrings

Much of the work will be done automatically by Sphinx, just format the docstrings with the same syntax used here, a few extra examples of use would include:

:param parameters: some notes on input parameters

:return returned: some note on what is returned

:raise Errors: Notes on warnings raised
Changing The Docs

If you are creating a new .rst file, make sure to add it in the relevant index.rst tree. This can be done by:

  • Modifying relevant doc strings or .rst files (be sure to modify them in the /doc/source/ folder and not /doc/build)
  • Making sure that all relevant .rst files are added to the relevant index.rst file
  • Running make html in /aiida/docs/ folder
  • Be sure to check for any warnings and correct if possible
This Page
Sphinx cheatsheet
+++++++++++++++++

A brief overview of some of the main functions of Sphinx
as used in the aiida documentation. View :ref:`this-page` to see
how this page was formatted. This is only a brief outline for more
please see `the Sphinx documentation <http://sphinx-doc.org/contents.html>`_


Main Titles and Subtitles
-------------------------

This is an example of a main title.

subtitles are made like this
============================

This is an example of a subtitle.

Formatting
----------

Basic Paragraph Formatting
==========================

Words can be written in *italics* or in **bold**. Text describing a specific
``computer_thing`` can be formatted as well.


Paragraph and Indentation
=========================

Much like in regular python, the indentation plays a strong role in the formatting.

For example all of this sentence will
appear on the same line.

While this sentence will appear
 differently because there is an indent.

Terminal and Code Formatting
============================

Something to be run in command line can be formatted like this::

 >> Some command

As can be seen above, while snippets of python on code can be done like this::

 import module
 print('hello world')

Notes
=====
.. note:: Notes can be added like this.

Bullet Points and Lists
=======================

* Bullet points can be added
* Just like this
  * With sub-bullets like this

#. While numerical bullets
#. Can be added
#. Like this


Links, Code Display, Cross References
-------------------------------------
External Links
==============
Can be done like here for `AiiDA <www.aiida.net/>`_

Code Download
=============

Code can be downloaded like this.

Download: :download:`this example script <devel_tutorial/sum_executable.py>`

Code Display
============

Can be done like this. This entire document can be seen unformated below using this method.

.. literalinclude:: devel_tutorial/sum_executable.py

.. _self-reference:

Math
====

Math formulas can be added as follows :math:`<g_i|`, see
`the Sphinx documentation on math <http://sphinx-doc.org/latest/ext/math.html#module-sphinx.ext.mathbase>`_


Cross Reference Docs
====================

Here is an example of a reference to the :ref:`structure_tutorial` which is on *another page*

Here is an example of a reference to something on the same page, :ref:`self-reference`

.. note:: References within the same document need a reference label, see `.. _self-reference:`
          used in this section for an example. *Hidden in formatted page, can only be seen in the
          input text.*

Cross Reference Classes and Methods
===================================

Any class can be referenced for example :py:class:`~aiida.orm.data.structure.StructureData` references the
StructureData data class.

Similarily any method can be referenced for example :py:meth:`~aiida.orm.data.structure.StructureData.append_atom`
shows the StructureData class' append atom method.


Table of Contents Docs and Code
-------------------------------

Table of Contents for Docs
==========================
An example of the table of contents syntax for the :ref:`git-cheatsheet` can be seen here
note that these are especially important in the global structure of the
document, as found in index.rst files.

.. toctree::
   :maxdepth: 2

   git_cheatsheet

.. note:: The `maxdepth` parameter can be used to change how deep the title indexing goes. See :ref:`this-page`.


Table of Contents for Code
==========================

Table of contents, that cross reference code, can be done very similarly to how
it is done for documents. For example the parser docs can be indexed like this

.. toctree::
   :maxdepth: 1

   aiida.orm <../orm/dev>
   ../parsers/dev


Automodules Example
====================

.. toctree::
   :maxdepth: 2

.. automodule:: aiida.common.datastructures
   :members:
   :noindex:

.. note:: A `:noindex:` directive was added to avoid duplicate object
          description for this example. Do not put the keyword in a real
          documentation.

How To Format Docstrings
------------------------

Much of the work will be done automatically by Sphinx, just format the docstrings with the same syntax used here,
a few extra examples of use would include::

    :param parameters: some notes on input parameters

    :return returned: some note on what is returned

    :raise Errors: Notes on warnings raised



Changing The Docs
-----------------

If you are creating a new .rst file, make sure to add it in
the relevant index.rst tree. This can be done by:

* Modifying relevant doc strings or .rst files (be sure to modify them in the /doc/source/ folder and not /doc/build)

* Making sure that all relevant .rst files are added
  to the relevant index.rst file

* Running `make html` in /aiida/docs/ folder

* Be sure to check for any warnings and correct if possible

.. _this-page:

This Page
=========

.. literalinclude:: sphinx_cheatsheet.rst

Properties

Properties are configuration options that are stored in the config.json file (within the .aiida directory). They can be accessed and modified using the verdi devel commands:

  • delproperty: delete a given property.
  • describeproperty: describe the content of a given property.
  • getproperty: get the value of a given property.
  • listproperties: list all user defined properties. With -a option, list all of them including those still at the default values.
  • setproperty: set a given property (usage: verdi devel setproperty PROPERTYNAME PROPERTYVALUE).

For instance, modules to be loaded automatically in the verdi shell can be added by putting their paths (separated by colons :) in the property verdishell.modules, e.g. by typing something like:

verdi devel setproperty verdishell.modules aiida.common.exceptions.NotExistent:aiida.orm.autogroup.Autogroup

More information can be found in the source code: see setup.py.

Database schema

In this section we will give an overview of the database schema and more specifically, we will focus on the SQLAlchemy specific schema.

In the following image you can see the tables of a fairly empty and new AiiDA database.

_images/relationships.sqla.png

The database table names are marked at the top of every table (in deep green), and the names of the columns are below. The primary key of every table is highlighted in light green, and the columns that participate in an index (apart from the index of the primary key) are highlighted in yellow.

We will now briefly describe some of the database tables.

Note

The kombu_queue and kombu_message tables are generated by celery and not by AiiDA. In AiiDA, celery is used by the daemon to handle the distributed task queue and real-time messaging. To learn more about celery, please refer to the celery documentation.

db_dbnode

This table contains the information about all the Node instances stored in AiiDA. Since the Node class has various sub-classes, there is a column (named type) at the db_dbnode table that indicates the type of the node.

Other columns are ctime and mtime, which correspond to the creation time and the modification time of the node. There are foreign keys that link this table with the db_dbcomputer and the db_dbuser tables.

It is also worth noting the attributes and extras columns, which are of type JSONB. This type allows JSON snippets to be stored directly in these columns, while still allowing them to be indexed for fast searching.

db_dbuser

The db_dbuser table contains user-related information. It is worth noting that the password column contains hashed information.

db_dbgroup & db_dbgroup_dbnodes

Nodes can be grouped into groups. The db_dbgroup table contains information about the group, such as its name, type and description. The nodes contained in the group are stored in the db_dbgroup_dbnodes table.

db_dbcalcstate

The calculations have a specific state which is needed for the user to verify their status but also for the daemon to submit them and/or fetch the results. This table contains the calculation id, the state that the calculation had and the timestamp of that state.

db_dbcomment

In the db_dbcomment table comments made by users to specific nodes are stored.

db_dbworkflow

The db_dbworkflow table stores various information about the workflow, among others its name and id, as well as various metadata.

db_dbworkflowdata

The data related to each workflow is stored in the db_dbworkflowdata table. This data could be the input or output of the workflow and can be stored as a JSON value in the database or in a text file.

db_dbworkflowstep

A workflow is divided into multiple steps. Each step can be a new calculation or a sub-workflow. General information such as the parent workflow id, status and name is stored in the db_dbworkflowstep table.

db_dbworkflowstep_calculations

If a workflow step contains calculation(s), the ids of the workflow step and of the calculation node are stored in the db_dbworkflowstep_calculations table.

db_dbworkflowstep_sub_workflows

If a workflow step contains sub-workflow(s), the ids of the workflow step and of the sub-workflow are stored in the db_dbworkflowstep_sub_workflows table.

Warning

This is work in progress and only available on the development branch

Scripting API

User facing APIs to control AiiDA from the verdi cli, scripts or plugins

Warning

This is work in progress and only available on the development branch

Postgres Database Manipulation API

This is the API for creating and dropping postgres users and databases used by the verdi quicksetup commandline tool. It allows convenient access to this functionality from within python without knowing details about how postgres is installed by default on various systems. If the postgres setup is not the default installation, additional information will have to be provided.

The Postgres Class
class aiida.control.postgres.Postgres(port=None, interactive=False, quiet=True)[source]

Provides postgres database manipulation assuming no prior setup

  • Can be used to create the initial aiida db user and database.
  • Works in every reasonable environment, provided the user can sudo

Tries to use psycopg2 with a fallback to psql subcommands (using sudo su to run as the postgres user).

Parameters:
  • port – (str) Assume the database server runs on this port
  • interactive – (bool) Allow prompting the user for information. Will also be passed to sudo (if using psycopg2 fails) and to the callback that can be set to be called when automatic setup detection fails
  • quiet – (bool) Suppress messages

Simple Example:

postgres = Postgres()
postgres.determine_setup()
postgres.create_dbuser('username', 'password')
if not postgres.db_exists('dbname'):
    postgres.create_db('username', 'dbname')

Complex Example:

postgres = Postgres(port=5433, interactive=True)
postgres.setup_fail_callback = prompt_db_info
postgres.determine_setup()
if postgres.pg_execute:
    print('setup successful!')
create_db(dbuser, dbname)[source]

Create a database in postgres

Parameters:
  • dbuser – (str), Name of the user which should own the db.
  • dbname – (str), Name of the database.
create_dbuser(dbuser, dbpass)[source]

Create a database user in postgres

Parameters:
  • dbuser – (str), Name of the user to be created.
  • dbpass – (str), Password the user should be given.
db_exists(dbname)[source]

Check whether a postgres database with name dbname exists

Parameters:dbname – Name of the database to check for
Returns:(bool), True if database exists, False otherwise
dbuser_exists(dbuser)[source]

Find out if postgres user with name dbuser exists

Parameters:dbuser – (str) database user to check for
Returns:(bool) True if user exists, False otherwise
determine_setup()[source]

Find out how postgres can be accessed.

Depending on how postgres is set up, psycopg2 can be used to create dbs and db users, otherwise a subprocess has to be used that executes psql as an os user with the right permissions.

drop_db(dbname)[source]

Drop a database in postgres

Parameters:dbname – (str), Name of the database.
drop_dbuser(dbuser)[source]

Drop a database user in postgres

Parameters:dbuser – (str), Name of the user to be dropped.
set_port(port)[source]

Set the port manually

set_setup_fail_callback(callback)[source]

Set a callback to be called when setup cannot be determined automatically

Parameters:callback – a callable with signature callback(interactive, dbinfo)
Further utilities
aiida.control.postgres.manual_setup_instructions(dbuser, dbname)[source]

Create a message with instructions for manually creating a database

aiida.control.postgres.prompt_db_info(*args)[source]

Prompt interactively for postgres database connecting details

Can be used as a setup fail callback for aiida.control.postgres.Postgres

Returns:dictionary with the following keys: host, port, database, user

Modules provided with aiida (API reference)

Modules (API documentation)

aiida.common documentation

Internal functionality that is needed by multiple modules of AiiDA

Calculation datastructures

This module defines the main data structures used by the Calculation.

class aiida.common.datastructures.CalcInfo(init=None)[source]

This object will store the data returned by the calculation plugin and to be passed to the ExecManager.

In the following descriptions all paths have to be considered relative

  • retrieve_list: a list of strings or tuples that indicate files that are to be retrieved from the remote

    after the calculation has finished and stored in the repository in a FolderData. If the entry in the list is just a string, it is assumed to be the filepath on the remote and it will be copied to ‘.’ of the repository with name os.path.split(item)[1]. If the entry is a tuple, it is expected to have the following format

    (‘remotepath’, ‘localpath’, depth)

    If the ‘remotepath’ is a file or folder, it will be copied in the repository to ‘localpath’. However, if the ‘remotepath’ contains file patterns with wildcards, the ‘localpath’ should be set to ‘.’ and the depth parameter should be an integer that decides the localname. The ‘remotepath’ will be split on file separators and the local filename will be determined by joining the N last elements, where N is given by the depth variable.

    Example: (‘some/remote/path/files/pattern*[0-9].xml’, ‘.’, 2)

    Will result in all files that match the pattern being copied to the local repository with path

    ‘files/pattern*[0-9].xml’

  • retrieve_temporary_list: a list of strings or tuples that indicate files that will be retrieved

    and stored temporarily in a FolderData, that will be available only during the parsing call. The format of the list is the same as that of ‘retrieve_list’

  • retrieve_singlefile_list: a list of tuples with format

    (‘linkname_from calc to singlefile’, ‘subclass of singlefile’, ‘filename’) Each tuple represents a file that will be retrieved from cluster and saved in SinglefileData nodes

  • local_copy_list: a list of tuples with format (‘localabspath’, ‘relativedestpath’)

  • remote_copy_list: a list of tuples with format (‘remotemachinename’, ‘remoteabspath’, ‘relativedestpath’)

  • remote_symlink_list: a list of tuples with format (‘remotemachinename’, ‘remoteabspath’, ‘relativedestpath’)

  • codes_info: a list of dictionaries used to pass the info of the execution of a code

  • codes_run_mode: a string used to specify the order in which multi codes can be executed

class aiida.common.datastructures.CodeInfo(init=None)[source]

This attribute-dictionary contains the information needed to execute a code. Possible attributes are:

  • cmdline_params: a list of strings, containing parameters to be written on the command line right after the call to the code, as for example:

    code.x cmdline_params[0] cmdline_params[1] ... < stdin > stdout
    
  • stdin_name: (optional) the name of the standard input file. Note, it is only possible to use the stdin with the syntax:

    code.x < stdin_name
    

    If no stdin_name is specified, the string “< stdin_name” will not be passed to the code. Note: it is not possible to substitute/remove the ‘<’ if stdin_name is specified; if that is needed, avoid stdin_name and use instead the cmdline_params to specify a suitable syntax.

  • stdout_name: (optional) the name of the standard output file. Note, it is only possible to pass output to stdout_name with the syntax:

    code.x ... > stdout_name
    

    If no stdout_name is specified, the string “> stdout_name” will not be passed to the code. Note: it is not possible to substitute/remove the ‘>’ if stdout_name is specified; if that is needed, avoid stdout_name and use instead the cmdline_params to specify a suitable syntax.

  • stderr_name: (optional) a string, the name of the error file of the code.

  • join_files: (optional) if True, redirects the error to the output file. If join_files=True, the code will be called as:

    code.x ... > stdout_name 2>&1
    

    otherwise, if join_files=False and stderr is passed:

    code.x ... > stdout_name 2> stderr_name
    
  • withmpi: if True, executes the code with mpirun (or another MPI installed on the remote computer)

  • code_uuid: the uuid of the code associated to the CodeInfo
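
A hedged sketch of how a calculation plugin might fill these structures in its input-preparation routine; the file names are illustrative, and code is assumed to be the input Code node of the calculation:

from aiida.common.datastructures import CalcInfo, CodeInfo

codeinfo = CodeInfo()
codeinfo.cmdline_params = ['-in', 'aiida.in']     # illustrative command-line flags
codeinfo.stdout_name = 'aiida.out'
codeinfo.withmpi = True
codeinfo.code_uuid = code.uuid                    # uuid of the input Code node

calcinfo = CalcInfo()
calcinfo.codes_info = [codeinfo]
calcinfo.retrieve_list = [
    'aiida.out',                                             # plain remote file name
    ('some/remote/path/files/pattern*[0-9].xml', '.', 2),    # wildcard tuple, as described above
]
calcinfo.local_copy_list = []
calcinfo.remote_copy_list = []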

aiida.common.datastructures.sort_states(list_states, use_key=False)[source]

Given a list of state names, return a sorted list of states (the first is the most recent) sorted according to their logical appearance in the DB (i.e., NEW before SUBMITTING before FINISHED).

Note

The order of the internal variable _sorted_datastates is used.

Parameters:
  • list_states – a list (or tuple) of state strings.
  • use_key – if True, expects that each element is not just a string, but a pair (someobject, string). Only string is used to sort (should be the state string), and only someobject is returned in the final list.
Returns:

a sorted list of the given data states.

Raises:

ValueError – if any of the given states is not a valid state.
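
For example, assuming NEW, SUBMITTING and FINISHED are valid state names, a small sketch of the expected behaviour:

from aiida.common.datastructures import sort_states

sort_states(['FINISHED', 'NEW', 'SUBMITTING'])
# -> ['FINISHED', 'SUBMITTING', 'NEW'] (most recent state first)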

Constants

This module defines the (physical) constants that are used throughout the code.

Exceptions
exception aiida.common.exceptions.AiidaException[source]

Base class for all AiiDA exceptions.

Each module will have its own subclass, inherited from this (e.g. ExecManagerException, TransportException, ...)

exception aiida.common.exceptions.AuthenticationError[source]

Raised when a user tries to access a resource for which they are not authenticated, e.g. an aiidauser tries to access a computer for which there is no entry in the AuthInfo table.

exception aiida.common.exceptions.ConfigurationError[source]

Error raised when there is a configuration error in AiiDA.

exception aiida.common.exceptions.ContentNotExistent[source]

Raised when trying to access an attribute, a key or a file in the result nodes that is not present

exception aiida.common.exceptions.DbContentError[source]

Raised when the content of the DB is not valid. This should never happen if the user does not play directly with the DB.

exception aiida.common.exceptions.FailedError[source]

Raised when accessing a calculation that is in the FAILED status

exception aiida.common.exceptions.FeatureDisabled[source]

Raised when a feature is requested, but the user has chosen to disable it (e.g., for submissions on disabled computers).

exception aiida.common.exceptions.FeatureNotAvailable[source]

Raised when a feature is requested from a plugin, that is not available.

exception aiida.common.exceptions.InputValidationError[source]

The input data for a calculation did not validate (e.g., missing required input data, wrong data, ...)

exception aiida.common.exceptions.InternalError[source]

Error raised when there is an internal error of AiiDA.

exception aiida.common.exceptions.InvalidOperation[source]

The requested operation is not valid (e.g., when trying to add a non-internal attribute before saving the entry), or deleting an entry that is protected (e.g., because it is referenced by foreign keys)

exception aiida.common.exceptions.LicensingException[source]

Raised when requirements for data licensing are not met.

exception aiida.common.exceptions.LoadingPluginFailed[source]

Raised when loading a plugin through the plugin loader fails

exception aiida.common.exceptions.LockPresent[source]

Raised when a lock is requested, but cannot be acquired.

exception aiida.common.exceptions.MissingPluginError[source]

Raised when the user tries to use a plugin that is not available or does not exist.

exception aiida.common.exceptions.ModificationNotAllowed[source]

Raised when the user tries to modify a field, object, property, ... that should not be modified.

exception aiida.common.exceptions.MultipleObjectsError[source]

Raised when more than one entity is found in the DB, but only one was expected.

exception aiida.common.exceptions.NotExistent[source]

Raised when the required entity does not exist.

exception aiida.common.exceptions.ParsingError[source]

Generic error raised when there is a parsing error

exception aiida.common.exceptions.PluginInternalError[source]

Error raised when there is an internal error which is due to a plugin and not to the AiiDA infrastructure.

exception aiida.common.exceptions.ProfileConfigurationError[source]

Configuration error raised when a wrong/inexistent profile is requested.

exception aiida.common.exceptions.RemoteOperationError[source]

Raised when an error in a remote operation occurs, as in a failed kill() of a scheduler job.

exception aiida.common.exceptions.TestsNotAllowedError[source]

Raised when tests are required to be run/loaded, but we are not in a testing environment.

This is to prevent data loss.

exception aiida.common.exceptions.UniquenessError[source]

Raised when the user tries to violate a uniqueness constraint (on the DB, for instance).

exception aiida.common.exceptions.ValidationError[source]

Error raised when there is an error during the validation phase of a property.

exception aiida.common.exceptions.WorkflowInputValidationError[source]

The input data for a workflow did not validate (e.g., missing required input data, wrong data, ...)

Extended dictionaries
class aiida.common.extendeddicts.AttributeDict(init=None)[source]

This class internally stores values in a dictionary, but exposes the keys also as attributes, i.e. asking for attrdict.key will return the value of attrdict[‘key’] and so on.

Raises an AttributeError if the key does not exist, when called as an attribute, while the usual KeyError if the key does not exist and the dictionary syntax is used.
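
A minimal sketch of the attribute/key equivalence (the keys used are arbitrary):

from aiida.common.extendeddicts import AttributeDict

d = AttributeDict({'energy': -13.6})
d.units = 'eV'                 # equivalent to d['units'] = 'eV'
assert d.energy == -13.6
assert d['units'] == 'eV'
# d.missing    -> raises AttributeError
# d['missing'] -> raises KeyError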

copy()[source]

Shallow copy.

class aiida.common.extendeddicts.DefaultFieldsAttributeDict(init=None)[source]

A dictionary with access to the keys as attributes, and with an internal value storing the ‘default’ keys to be distinguished from extra fields.

Extra methods defaultkeys() and extrakeys() divide the set returned by keys() in default keys (i.e. those defined at definition time) and other keys. There is also a method get_default_fields() to return the internal list.

Moreover, for undefined default keys, it returns None instead of raising a KeyError/AttributeError exception.

Remember to define the _default_fields in a subclass! E.g.:

class TestExample(DefaultFieldsAttributeDict):
    _default_fields = ('a','b','c')

When the validate() method is called, it calls in turn all validate_KEY methods, where KEY is one of the default keys. If the method is not present, the field is considered to be always valid. Each validate_KEY method should accept a single argument ‘value’ that will contain the value to be checked.

It raises a ValidationError if any of the validate_KEY functions raises an exception, otherwise it simply returns. NOTE: the validate_ functions are also called for unset fields, so if the field can be empty on validation, you have to start your validation function with something similar to:

if value is None:
    return
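
For instance, a minimal hypothetical subclass combining a default-field definition with a validation method (class and field names are made up):

from aiida.common.extendeddicts import DefaultFieldsAttributeDict

class BandsInfo(DefaultFieldsAttributeDict):
    _default_fields = ('smearing', 'nbands')

    def validate_nbands(self, value):
        if value is None:   # unset fields are validated too
            return
        if not isinstance(value, int) or value <= 0:
            raise ValueError('nbands must be a positive integer')

b = BandsInfo()
b.nbands = 8
b.validate()            # passes; would raise ValidationError for e.g. b.nbands = -1
b.extra = 'anything'    # ends up in extrakeys(), not in defaultkeys()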

Todo

Decide the behavior when a field is set to None. Current behavior: if a is an instance and ‘def_field’ is one of the default fields that is undefined, we get:

  • a.get('def_field'): None
  • a.get('def_field','whatever'): ‘whatever’
  • Note that a.defaultkeys() does NOT contain ‘def_field’

if we do a.def_field = None, then the behavior becomes

  • a.get('def_field'): None
  • a.get('def_field','whatever'): None
  • Note that a.defaultkeys() DOES contain ‘def_field’

See if we want that setting a default field to None means deleting it.

defaultkeys()[source]

Return the default keys defined in the instance.

extrakeys()[source]

Return the extra keys defined in the instance.

classmethod get_default_fields()[source]

Return the list of default fields, either defined in the instance or not.

validate()[source]

Validate the keys, if any validate_* method is available.

class aiida.common.extendeddicts.FixedFieldsAttributeDict(init=None)[source]

A dictionary with access to the keys as attributes, and with filtering of valid attributes. This is only the base class, without valid attributes; use a derived class to do the actual work. E.g.:

class TestExample(FixedFieldsAttributeDict):
    _valid_fields = ('a','b','c')
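
A brief usage sketch of the TestExample class defined above:

t = TestExample()
t.a = 1                                  # accepted: 'a' is a valid field
# t.d = 4                                -> rejected: 'd' is not among the valid fields
valid = TestExample.get_valid_fields()   # the fields 'a', 'b', 'c'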
classmethod get_valid_fields()[source]

Return the list of valid fields.

Folders
class aiida.common.folders.Folder(abspath, folder_limit=None)[source]

A class to manage generic folders, preventing access outside the boundaries of the given folder.

Todo

fix this, os.path.commonprefix of /a/b/c and /a/b2/c will give a/b, check if this is wanted or if we want to put trailing slashes. (or if we want to use os.path.relpath and check for a string starting with os.pardir?)

Todo

rethink whether the folder_limit option is still useful. If not, remove it altogether (it was a nice feature, but unfortunately all the calls to os.path.abspath or normpath are quite slow).

abspath

The absolute path of the folder.

create()[source]

Creates the folder, if it does not exist on the disk yet.

It will also create top directories, if absent.

It is always safe to call it, it will do nothing if the folder already exists.

create_file_from_filelike(src_filelike, dest_name)[source]

Create a file from a file-like object.

Note:

if the current file position in src_filelike is not 0, only the contents from the current file position to the end of the file will be copied in the new file.

Parameters:
  • src_filelike – the file-like object (e.g., if you have a string called s, you can pass StringIO.StringIO(s))
  • dest_name – the destination filename will have this file name.

Create a symlink inside the folder to the location ‘src’.

Parameters:
  • src – the location to which the symlink must point. Can be either a relative or an absolute path. Should, however, be relative to work properly also when the repository is moved!
  • name – the filename of the symlink to be created.
erase(create_empty_folder=False)[source]

Erases the folder. Should be called only in very specific cases, in general folder should not be erased!

Doesn’t complain if the folder does not exist.

Parameters:create_empty_folder – if True, after erasing, creates an empty dir.
exists()[source]

Return True if the folder exists, False otherwise.

folder_limit

The folder limit that cannot be crossed when creating files and folders.

get_abs_path(relpath, check_existence=False)[source]

Return an absolute path for a file or folder in this folder.

The advantage of using this method is that it checks that the given relative path is a valid filename within this folder, and not something e.g. containing slashes.

Parameters:
  • relpath – The file or directory.
  • check_existence – if False, just return the file path. Otherwise, also check if the file or directory actually exists. Raise OSError if it does not.
get_content_list(pattern='*', only_paths=True)[source]

Return a list of files (and subfolders) in the folder, matching a given pattern.

Example: If you want to exclude files starting with a dot, you can call this method with pattern='[!.]*'

Parameters:
  • pattern – a pattern for the file/folder names, using Unix filename pattern matching (see Python standard module fnmatch). By default, pattern is ‘*’, matching all files and folders.
  • only_paths – if True (the default), return only a flat list of names; if False, return pairs (name, is_file).
Returns:

a list of tuples of two elements, the first is the file name and the second is True if the element is a file, False if it is a directory.

get_subfolder(subfolder, create=False, reset_limit=False)[source]

Return a Folder object pointing to a subfolder.

Parameters:
  • subfolder – a string with the relative path of the subfolder, relative to the absolute path of this object. Note that this may also contain ‘..’ parts, as far as this does not go beyond the folder_limit.
  • create – if True, the new subfolder is created, if it does not exist.
  • reset_limit – when doing b = a.get_subfolder('xxx', reset_limit=False), the limit of b will be the same limit of a. if True, the limit will be set to the boundaries of folder b.
Returns:

a Folder object pointing to the subfolder.

insert_path(src, dest_name=None, overwrite=True)[source]

Copy a file to the folder.

Parameters:
  • src – the source filename to copy
  • dest_name – if None, the same basename of src is used. Otherwise, the destination filename will have this file name.
  • overwrite – if False, raises an error on existing destination; otherwise, delete it first.
isdir(relpath)[source]

Return True if ‘relpath’ exists inside the folder and is a directory, False otherwise.

isfile(relpath)[source]

Return True if ‘relpath’ exists inside the folder and is a file, False otherwise.

mode_dir

Return the mode with which the folders should be created

mode_file

Return the mode with which the files should be created

open(name, mode='r')[source]

Open a file in the current folder and return the corresponding file object.

remove_path(filename)[source]

Remove a file or folder from the folder.

Parameters:filename – the relative path name to remove
replace_with_folder(srcdir, move=False, overwrite=False)[source]

This routine copies or moves the source folder ‘srcdir’ to the local folder pointed by this Folder object.

Parameters:
  • srcdir – the source folder on the disk; this must be a string with an absolute path
  • move – if True, the srcdir is moved to the repository. Otherwise, it is only copied.
  • overwrite – if True, the folder will be erased first. if False, a IOError is raised if the folder already exists. Whatever the value of this flag, parent directories will be created, if needed.
Raises:

OSError or IOError: in case of problems accessing or writing the files.

Raises:

ValueError: if the section is not recognized.
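
A minimal usage sketch combining the Folder methods described above (all paths and file names are purely illustrative):

from aiida.common.folders import Folder

folder = Folder('/tmp/aiida_folder_example')   # illustrative absolute path
folder.create()                                # no-op if it already exists
folder.insert_path('/etc/hostname', 'hostname_copy')
sub = folder.get_subfolder('inputs', create=True)
with sub.open('notes.txt', 'w') as handle:
    handle.write('some content')
names = folder.get_content_list()              # flat list of entry names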

class aiida.common.folders.RepositoryFolder(section, uuid, subfolder='.')[source]

A class to manage the local AiiDA repository folders.

__init__(section, uuid, subfolder='.')[source]

Initializes the object by pointing it to a folder in the repository.

Pass the uuid as a string.

get_topdir()[source]

Returns the top directory, i.e., the section/uuid folder object.

section

The section to which this folder belongs.

subfolder

The subfolder within the section/uuid folder.

uuid

The uuid to which this folder belongs.

class aiida.common.folders.SandboxFolder(sandbox_in_repo=True)[source]

A class to manage the creation and management of a sandbox folder.

Note: this class must be used within a context manager, i.e.:

with SandboxFolder() as f:
    # do something with f

In this way, the sandbox folder is removed from disk (if it wasn’t removed already) when exiting the ‘with’ block.
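
For instance, a minimal sketch of staging a temporary file inside a sandbox (the file name is illustrative):

from aiida.common.folders import SandboxFolder

with SandboxFolder() as folder:
    with folder.open('input.txt', 'w') as handle:
        handle.write('temporary content')
    # folder.abspath points to the temporary directory, removed on exit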

Todo

Implement check of whether the folder has been removed.

__enter__()[source]

Called when entering in the with statement

__exit__(exc_type, exc_value, traceback)[source]

In exit, I remove the sandbox folder from disk, if it still exists

__init__(sandbox_in_repo=True)[source]

Initializes the object by creating a new temporary folder in the sandbox.

Parameters:sandbox_in_repo (bool) – If True (default), creates the folder in the repository. If False, relies on the defaults of tempfile.mkdtemp
Orbitals
class aiida.common.orbital.Orbital[source]

Base class for Orbitals. Can handle certain basic fields, their setting and validation. More complex Orbital objects should then inherit from this class

Parameters:
  • position – the absolute position (three floats), units in angstrom
  • x_orientation – x,y,z unit vector defining the polar angle theta in spherical coordinates (unitless)
  • z_orientation – x,y,z unit vector defining the azimuthal angle phi in spherical coordinates (unitless)
  • orientation_spin – x,y,z unit vector defining the spin orientation (unitless)
  • diffusivity – Float controlling the radial term in the orbital equation; units are reciprocal Angstrom
  • module_name – internal parameter, stores orbital type
get_orbital_dict()[source]

returns the internal keys as a dictionary

set_orbital_dict(init_dict)[source]

Sets the orbital_dict, which can vary depending on the particular implementation of this base class.

Parameters:init_dict – the initialization dictionary
aiida.common.orbital.OrbitalFactory(module)[source]

Factory method that returns a suitable Orbital subclass.

Parameters:module – a valid string recognized as a Orbital subclass
Orbital subclasses

The following are Orbital classes inheriting from Orbitals.

RealHydrogen
class aiida.common.orbital.realhydrogen.RealhydrogenOrbital[source]

Orbitals for hydrogen, largely following the conventions used by wannier90. This object handles the generation of real hydrogen orbitals and their hybrids, with methods for producing s, p, d, f orbitals and sp, sp2, sp3, sp3d, sp3d2 hybrids. It does not deal with the canonical hydrogen orbitals, which contain imaginary components.

The orbitals described here are chiefly concerned with the symmetry aspects of the orbitals, without the context of space. Therefore diffusivity, position and atomic labels should be handled in the OrbitalData class.

Following the notation of tables 3.1 and 3.2 of the Wannier90 user guide (http://www.wannier.org/doc/user_guide.pdf), a brief description of what is meant by each of these labels:

Parameters:
  • radial_nodes – the number of radial nodes (or inflections); if no radial nodes are supplied, defaults to 0
  • angular_momentum – Angular quantum number, using real orbitals
  • magnetic_number – Magnetic quantum number, using real orbitals
  • spin – spin, up (1) down (-1) or unspecified (0)

The conventions regarding L and M correspond to those used in wannier90: for all L greater than 0 the orbital is not hybridized (see table 3.1), and for L less than 0 the orbital is hybridized (see table 3.2). M then indexes all the possible orbitals, from 0 to 2L for L > 0 and from 0 to (-L) for L < 0.

classmethod get_name_from_quantum_numbers(angular_momentum, magnetic_number=None)[source]

Returns the orbital_name corresponding to the angular_momentum alone, or to both the angular_momentum and the magnetic_number. For example, using angular_momentum=1 and magnetic_number=1 will return “Px”.

classmethod get_quantum_numbers_from_name(name)[source]

Returns all the angular and magnetic numbers corresponding to name. For example, using “P” as name will return all quantum numbers associated with a “P” orbital, while “Px” will return only one set of quantum numbers, the ones associated with “Px”
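
A short sketch of the two class methods, based on the example values quoted above:

from aiida.common.orbital.realhydrogen import RealhydrogenOrbital

name = RealhydrogenOrbital.get_name_from_quantum_numbers(
    angular_momentum=1, magnetic_number=1)               # -> 'Px'
numbers = RealhydrogenOrbital.get_quantum_numbers_from_name('P')
# all sets of quantum numbers associated with 'P' orbitals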

Plugin loaders

Extension of, and eventual replacement for, the aiida.common.old_pluginloader module.

Allows loading plugins registered as setuptools entry points by separate pip-installed packages. Defines drop-in replacement functionality to use the old filesystem-based and the new setuptools-based plugin systems in parallel.

aiida.common.pluginloader.BaseFactory(module, base_class, base_modname, suffix=None)[source]

Return a plugin class; also finds external plugins.

This is a front end to aiida.common.old_pluginloader.BaseFactory with a fallback to aiida.common.pluginloader.get_plugin; check their respective docs for more info.

Note not all possible notations work for new plugins. Example:

BaseFactory('quantumespresso.cp', JobCalculation,
            'aiida.orm.calculation.job',
            suffix='Calculation') # <- finds cp also if new style

BaseFactory('job.quantumespresso.cp', Calculation,
            'aiida.orm.calculation')  # <- will find cp only if old style
aiida.common.pluginloader.all_plugins(category)[source]

find old and new plugins

If both are available for a given name, the old style plugin takes precedence.

aiida.common.pluginloader.entry_point_tpstr_from(plugin_class)[source]

gives group and entry point name for a given module if module is registered

Returns:tuple (group, entrypoint-name) if one entry point is found
aiida.common.pluginloader.get_class_to_entry_point_map(short_group_name=False)[source]

create a mapping of modules to entry point groups / names

Parameters:short_group_name – bool, if True the leading ‘aiida.’ is cut off the group names
Returns:dictionary, keys are modules, values are (group, entrypoint-name)
aiida.common.pluginloader.get_plugin(category, name)[source]

Return an instance of the class registered under the given name and for the specified plugin category.

Parameters:
  • category – the plugin category to load the plugin from, e.g. ‘transports’.
  • name – the name of the plugin
aiida.common.pluginloader.load_plugin(base_class, plugins_module, plugin_type)[source]

load file or extension point plugins using the file plugin interface.

Prefer file plugins and if not found, translate the arguments to extension point notation

Params:Look at the docstring of aiida.common.old_pluginloader.load_plugin for more Info
Returns:The plugin class
aiida.common.pluginloader.plugin_list(category)[source]

Get a list of plugins for the given category.

Passing ‘example’ as the category will list all plugins registered under the entry point aiida.example.

Utilities
class aiida.common.utils.ArrayCounter[source]

A counter & a method that increments it and returns its value. It is used in various tests.

class aiida.common.utils.Prettifier(format)[source]

Class to manage prettifiers (typically for labels of kpoints in band plots)

classmethod get_prettifiers()[source]

Return a list of valid prettifier strings

Returns:a list of strings
prettify(label)[source]

Prettify a label using the format passed in the initializer

Parameters:label – the string to prettify
Returns:a prettified string
class aiida.common.utils.abstractclassmethod(callable)[source]

A decorator indicating abstract classmethods.

Backported from python3.

class aiida.common.utils.abstractstaticmethod(callable)[source]

A decorator indicating abstract staticmethods.

Similar to abstractmethod. Backported from python3.

aiida.common.utils.are_dir_trees_equal(dir1, dir2)[source]

Compare two directories recursively. Files in each directory are assumed to be equal if their names and contents are equal.

Parameters:
  • dir1 – first directory path
  • dir2 – second directory path
Returns:

True if the directory trees are the same and there were no errors while accessing the directories or files, False otherwise.
aiida.common.utils.ask_question(question, reply_type, allow_none_as_answer=True)[source]

This method asks a specific question, tries to parse the given reply and then it verifies the parsed answer.

Parameters:
  • question – The question to be asked.
  • reply_type – The type of the expected answer (int, datetime etc.). It is needed for the parsing of the answer.
  • allow_none_as_answer – Allow empty answers?
Returns:

The parsed reply.

class aiida.common.utils.classproperty(getter)[source]

A class that, when used as a decorator, works as if the two decorators @property and @classmethod were applied together (i.e., the object works as a property, both for the Class and for any of its instances; and is called with the class cls rather than with the instance as its first argument).

class aiida.common.utils.combomethod(method)[source]

A decorator that wraps a function that can be both a classmethod or instancemethod and behaves accordingly:

class A():

    @combomethod
    def do(self, **kwargs):
        isclass = kwargs.get('isclass')
        if isclass:
            print "I am a class", self
        else:
            print "I am an instance", self

A.do()
A().do()

>>> I am a class __main__.A
>>> I am an instance <__main__.A instance at 0x7f2efb116e60>

Attention: For ease of handling, pass keyword isclass equal to True if this was called as a classmethod and False if this was called as an instance. The argument self is therefore ambiguous!

aiida.common.utils.conv_to_fortran(val, quote_strings=True)[source]
Parameters:val – the value to be read and converted to a Fortran-friendly string.
aiida.common.utils.conv_to_fortran_withlists(val, quote_strings=True)[source]

Same as conv_to_fortran but with extra logic to handle lists.

Parameters:val – the value to be read and converted to a Fortran-friendly string.

aiida.common.utils.create_display_name(field)[source]

Given a string, creates the suitable “default” display name: replace underscores with spaces, and capitalize each word.

Returns:the converted string
aiida.common.utils.escape_for_bash(str_to_escape)[source]

This function takes any string and escapes it in a way that bash will interpret it as a single string.

Explanation:

At the end, in the return statement, the string is put within single quotes. Therefore, the only thing that has to be escaped in bash is the single quote character. To do this, every single quote ' is substituted with the sequence '"'"' which means:

First single quote: exit from the enclosing single quotes

Second, third and fourth characters: "'" is a single quote character, escaped by double quotes

Last single quote: reopen the single quote to continue the string

Finally, note that in Python the string '"'"' has to be enclosed within triple quotes to make it work, resulting in the complicated replacement string used in the implementation.
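
A minimal sketch (the input string is illustrative); following the explanation above, the single quote in the input is replaced by the '"'"' sequence:

from aiida.common.utils import escape_for_bash

escaped = escape_for_bash("it's done")
# expected form: 'it'"'"'s done' (a single bash word that expands to: it's done)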

aiida.common.utils.export_shard_uuid(uuid)[source]

Sharding of the UUID for the import/export

aiida.common.utils.flatten_list(value)[source]

Flattens a list or a tuple. Example:

In [2]: flatten_list([[[[[4],3]],[3],['a',[3]]]])
Out[2]: [4, 3, 3, 'a', 3]

Parameters:value – A value, whether iterable or not
Returns:a list of nesting level 1
aiida.common.utils.get_class_string(obj)[source]

Return the string identifying the class of the object (module + object name, joined by dots).

It works both for classes and for class instances.

aiida.common.utils.get_configured_user_email()[source]

Return the email (that is used as the username) configured during the first verdi install.

aiida.common.utils.get_extremas_from_positions(positions)[source]

returns the minimum and maximum value for each dimension in the positions given

aiida.common.utils.get_fortfloat(key, txt, be_case_sensitive=True)[source]

Matches a Fortran-compatible specification of a float behind a defined key in a string.

Parameters:
  • key – The key to look for
  • txt – The string where to search for the key
  • be_case_sensitive – An optional boolean whether to search case-sensitively, defaults to True

If abc is the key and f is a float number, then this regex will match and return f in the following cases:

  • charsbefore, abc = f, charsafter
  • charsbefore abc = f charsafter
  • charsbefore, abc = f charsafter

and vice-versa. If no float is matched, returns None

Examples of matchable floats are:

  • 0.1d2
  • 0.D-3
  • .2e1
  • -0.23
  • 232
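
A minimal sketch (the key and the text are illustrative):

from aiida.common.utils import get_fortfloat

txt = 'ecutwfc = 30.0d0'
value = get_fortfloat('ecutwfc', txt)   # expected: 30.0 if matched, None otherwise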
aiida.common.utils.get_mode_string(mode)[source]

Convert a file’s mode to a string of the form ‘-rwxrwxrwx’. Taken (simplified) from cpython 3.3 stat module: https://hg.python.org/cpython/file/3.3/Lib/stat.py

aiida.common.utils.get_new_uuid()[source]

Return a new UUID (typically to be used for new nodes). It uses the UUID version specified in aiida.backends.settings.AIIDANODES_UUID_VERSION

aiida.common.utils.get_object_from_string(string)[source]

Given a string identifying an object (as returned by the get_class_string method) load and return the actual object.

aiida.common.utils.get_object_string(obj)[source]

Get a string that identifies this object which can be used to retrieve it via get_object_from_string().

Parameters:obj – The object to get the string for
Returns:The string that identifies the object
aiida.common.utils.get_repository_folder(subfolder=None)[source]

Return the top folder of the local repository.

aiida.common.utils.get_suggestion(provided_string, allowed_strings)[source]

Given a string and a list of allowed_strings, it returns a string to print on screen, with sensible text depending on whether no suggestion is found, or one or more than one suggestions are found.

Args:
provided_string: the string to compare allowed_strings: a list of valid strings
Returns:
A string to print on output, to suggest to the user a possible valid value.
aiida.common.utils.get_unique_filename(filename, list_of_filenames)[source]

Return a unique filename that can be added to the list_of_filenames.

If filename is not in list_of_filenames, it simply returns the filename string itself. Otherwise, it appends an integer number to the filename (before the extension) until it finds a unique filename.

Parameters:
  • filename – the filename to add
  • list_of_filenames – the list of filenames to which filename should be added, without name duplicates
Returns:

Either filename or its modification, with a number appended between the name and the extension.

aiida.common.utils.grouper(n, iterable)[source]

Given an iterable, returns an iterable that returns tuples of groups of elements from the iterable, of length n, except the last one, which has the required length to exhaust the iterable (i.e., there is no filling applied).

Parameters:
  • n – length of each tuple (except the last one, that will have length <= n)
  • iterable – the iterable to divide in groups
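
A minimal sketch of the expected behaviour:

from aiida.common.utils import grouper

groups = list(grouper(2, [1, 2, 3, 4, 5]))
# expected: [(1, 2), (3, 4), (5,)] -- the last group is shorter, no filling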
aiida.common.utils.gunzip_string(string)[source]

Gunzip string contents.

Parameters:string – a gzipped string
Returns:a string
aiida.common.utils.gzip_string(string)[source]

Gzip string contents.

Parameters:string – a string
Returns:a gzipped string
aiida.common.utils.issingular(singularForm)[source]

Checks whether a noun is singular.

Parameters:singularForm – a string defining an English noun
Returns:the tuple (singular, pluralform): singular is True/False depending on whether the noun is singular/plural, and pluralform is a string with the noun in the plural form.

aiida.common.utils.join_labels(labels, join_symbol='|', threshold=1e-06)[source]

Join labels with a joining symbol when they are very close

Parameters:
  • labels – a list of length-2 tuples, in the format (position, label)
  • join_symbol – the string to use to join different paths. By default, a pipe
  • threshold – the threshold to decide if two float values are the same and should be joined
Returns:

the same list as labels, but with the second value possibly replaced with strings joined when close enough

aiida.common.utils.md5_file(filename, block_size_factor=128)[source]

Open a file and return its md5sum (hexdigested).

Parameters:
  • filename – the filename of the file for which we want the md5sum
  • block_size_factor – the file is read at chunks of size block_size_factor * md5.block_size, where md5.block_size is the block_size used internally by the hashlib module.
Returns:

a string with the hexdigest md5.

Raises:

No checks are done on the file, so if it doesn’t exist it may raise IOError.

aiida.common.utils.prettify_labels(labels, format=None)[source]

Prettify label for typesetting in various formats

Parameters:
  • labels – a list of length-2 tuples, in the format (position, label)
  • format – a string with the format for the prettifier (e.g. ‘agr’, ‘matplotlib’, ...)
Returns:

the same list as labels, but with the second value possibly replaced with a prettified version that typesets nicely in the selected format

aiida.common.utils.query_string(question, default)[source]

Asks a question, with the option to have a default, predefined answer. Depending on the default answer and the answer of the user, the following options are available:
  • If the user replies with a non-empty answer, then that answer is returned.
  • If the default answer is None, then the user has to reply with a non-empty answer.
  • If the default answer is not None, then it is returned if the user gives an empty answer. In the case of an empty default answer and an empty reply from the user, None is returned.

Parameters:
  • question – The question that we want to ask the user.
  • default – The default answer (if there is any) to the question asked.
Returns:

The returned reply.

aiida.common.utils.query_yes_no(question, default='yes')[source]

Ask a yes/no question via raw_input() and return their answer.

“question” is a string that is presented to the user. “default” is the presumed answer if the user just hits <Enter>. It must be “yes” (the default), “no” or None (meaning an answer is required of the user).

The “answer” return value is True for “yes” or False for “no”.

aiida.common.utils.sha1_file(filename, block_size_factor=128)[source]

Open a file and return its sha1sum (hexdigested).

Parameters:
  • filename – the filename of the file for which we want the sha1sum
  • block_size_factor – the file is read at chunks of size block_size_factor * sha1.block_size, where sha1.block_size is the block_size used internally by the hashlib module.
Returns:

a string with the hexdigest sha1.

Raises:

No checks are done on the file, so if it doesn’t exist it may raise IOError.

aiida.common.utils.str_timedelta(dt, max_num_fields=3, short=False, negative_to_zero=False)[source]

Given a dt in seconds, return it in a HH:MM:SS format.

Parameters:
  • dt – a TimeDelta object
  • max_num_fields – maximum number of non-zero fields to show (for instance if the number of days is non-zero, shows only days, hours and minutes, but not seconds)
  • short – if False, print always max_num_fields fields, even if they are zero. If True, do not print the first fields, if they are zero.
  • negative_to_zero – if True, set dt = 0 if dt < 0.
aiida.common.utils.validate_list_of_string_tuples(val, tuple_length)[source]

Check that:

  1. val is a list or tuple
  2. each element of the list:
     1. is a list or tuple
     2. is of length equal to the parameter tuple_length
     3. each of the two elements is a string

Return if valid, raise ValidationError if invalid

aiida.common.utils.xyz_parser_iterator(string)[source]

Yields a tuple (natoms, comment, atomiter) for each frame in an XYZ file, where atomiter is an iterator yielding a nested tuple (symbol, (x, y, z)) for each entry.

Parameters:string – a string containing XYZ-structured text
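
A minimal sketch, assuming a small XYZ-formatted string (the coordinates are illustrative):

from aiida.common.utils import xyz_parser_iterator

xyz = (
    '2\n'
    'water fragment\n'
    'O 0.0 0.0 0.0\n'
    'H 0.0 0.0 0.96\n'
)
for natoms, comment, atomiter in xyz_parser_iterator(xyz):
    for symbol, (x, y, z) in atomiter:
        pass   # e.g. collect the positions here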

aiida.plugins documentation

Highlevel Registry Querying

High level interface to read information from the cached registry, find plugins, find details about plugins.

utilities to provide information about available plugins

The plugin registry (in cache) is expected to be a dict where the keys are base entry point names of plugins (unique for registered plugins)

example registry:

registry = {
    'quantumespresso': {
        'name': 'aiida-quantumespresso',
        'package_name': 'aiida_quantumespresso',
        'pip_url': 'git+https://...',
        'other_key': 'other_value'
    },
    'vasp': {
        'name': 'aiida-vasp',
        'package_name': 'aiida_vasp',
        '...': '...'
    }
}
aiida.plugins.info.find_by_name(plugin_key)[source]

returns the pickled RegistryEntry object for a given plugin_key

aiida.plugins.info.find_by_pattern(pattern, ranking=False)[source]

returns a list of RegistryEntry objects for all matches

aiida.plugins.info.find_for_typestring(typestring)[source]

find the plugin with the base entry point name of the given typestring

Returns:dict with plugin keys if found, None if not found
aiida.plugins.info.plugin_ep_iterator()[source]

return an iterator over the plugin entrypoint base strings

Lowlevel Registry Interface

This is the low level interface for operations that have to connect to the registry online and that operate on the cache folder.

functionality to access / cache the plugin registry

aiida.plugins.registry.cleanup_info(registry=None)[source]

delete any plugin info files that do not correspond to a registry entry

aiida.plugins.registry.load_cached()[source]

Load the registry from the local cache; if the local cache is not readable, create or update it.

aiida.plugins.registry.load_online(errorhandler=None)[source]

loads the registry file and returns the list of plugins

aiida.plugins.registry.registry_cache_exists()[source]

check if the registry cache exists

Return bool:True if exists, False if not
aiida.plugins.registry.registry_cache_file_name()[source]

return the name of the registry cache file

aiida.plugins.registry.registry_cache_file_path()[source]

return the path to the cache file

aiida.plugins.registry.registry_cache_openable()[source]

return true if the registry cache file can be opened

aiida.plugins.registry.registry_file_url()[source]

return the url for the plugins.json file

aiida.plugins.registry.update(with_info=True, registry_err_handler=None, info_err_handler=None)[source]

Load the registry from its online location and pickle it.

Creates the cache file if necessary. By default updates the entry details cache for each entry as well.

Parameters:
  • with_info – default: True, update info cache for each entry as well
  • registry_err_handler – callable(exception) -> dict. Must either raise or return a registry dict
  • info_err_handler – callable(exception, plugin, data) -> None. Can raise or just print an error / warning.

If none of the error handlers are given, the function will stop execution if any broken links are encountered.

aiida.plugins.registry.update_info(registry=None, errorhandler=None)[source]

iterate through plugins, download setup info and return as dict

Cached Info for each Plugin

This module and the RegistryEntry class should be the sole location for the (implicit) definition of the registry format.

class aiida.plugins.entry.RegistryEntry(**kwargs)[source]

Can be created from an entry in the online aiida plugin registry. An instance will be created and cached for each entry on update.

cli_apps

A list of cli apps installed by this plugin

entry_point_categories

A list of categories for which this plugin exposes entry points

entry_points

A dict of entry point names by category

entry_points_raw

The full entry point spec in setuptools.setup() format

format_info(**kwargs)[source]

format and return a datastructure containing all known information about the plugin

Parameters:
  • format – str, one of [tabulate | dict]. With ‘tabulate’: use tabulate to create and return a table of properties as a string. With ‘dict’: create a dict of properties.
  • as_str – bool. With format=’dict’ and as_str=True: return a pretty-printed string version of the dict; with format=’dict’ and as_str=False: return a dictionary; with format=’tabulate’: as_str is ignored.
gui_apps

A list of GUI apps installed by this plugin

install(**opts)[source]

Call on pip to install the package if not yet installed

load_setup_info(info_url)[source]

Load setup kwargs from the link in the registry

package_name

The name used to import the package

test_installed()[source]

Return whether the plugin is installed.

First, this checks whether the package_name can be imported. If not, we know that at least no new style plugin with that name is installed.

Secondly, it tests whether all the entry points are currently found by the plugin loader. If not, the plugin is considered not installed.

Potential failures:
  • loading of the entry points is not tested
  • not properly uninstalled plugins might show up as installed if the entry points are still around
  • it does not distinguish between not installed and an old version being installed
version

The version of the plugin package

aiida.transport documentation

This chapter describes the generic implementation of a transport plugin. The currently implemented plugins are the local and the ssh plugins. The local plugin makes use only of some standard python modules like os and shutil. The ssh plugin is a wrapper around the paramiko library, which is installed together with AiiDA.

A generic set of tests is contained in plugin_test.py, while plugin-specific tests are written separately.

Generic transport class
class aiida.transport.FileAttribute(init=None)[source]

A class, resembling a dictionary, to describe the attributes of a file, that is returned by get_attribute(). Possible keys: st_size, st_uid, st_gid, st_mode, st_atime, st_mtime

class aiida.transport.Transport(*args, **kwargs)[source]

Abstract class for a generic transport (ssh, local, ...). Contains the set of minimal methods.

__enter__()[source]

For transports that require opening a connection, opens all required channels (used in ‘with’ statements)

__exit__(type, value, traceback)[source]

Closes connections, if needed (used in ‘with’ statements).

chdir(path)[source]

Change directory to ‘path’

Parameters:path (str) – path to change working directory into.
Raises:IOError, if the requested path does not exist
Return type:str
chmod(path, mode)[source]

Change permissions of a path.

Parameters:
  • path (str) – path to file
  • mode (int) – new permissions
chown(path, uid, gid)[source]

Change the owner (uid) and group (gid) of a file. As with python’s os.chown function, you must pass both arguments, so if you only want to change one, use stat first to retrieve the current owner and group.

Parameters:
  • path (str) – path to the file to change the owner and group of
  • uid (int) – new owner’s uid
  • gid (int) – new group id
close()[source]

Closes the local transport channel

copy(remotesource, remotedestination, *args, **kwargs)[source]

Copy a file or a directory from remote source to remote destination (On the same remote machine)

Parameters:
  • remotesource (str) – path of the remote source directory / file
  • remotedestination (str) – path of the remote destination directory / file
Raises:

IOError, if one of src or dst does not exist

copy_from_remote_to_remote(transportdestination, remotesource, remotedestination, **kwargs)[source]

Copy files or folders from a remote computer to another remote computer.

Parameters:
  • transportdestination – transport to be used for the destination computer
  • remotesource (str) – path to the remote source directory / file
  • remotedestination (str) – path to the remote destination directory / file
  • kwargs – keyword parameters passed to the call to transportdestination.put, except for ‘dereference’ that is passed to self.get

Note

the keyword ‘dereference’ SHOULD be set to False for the final put (onto the destination), while it can be set to the value given in kwargs for the get from the source. In that way, a symbolic link would never be followed in the final copy to the remote destination. That way we could avoid getting unknown (potentially malicious) files into the destination computer. HOWEVER, since dereference=False is currently NOT supported by all plugins, we still force it to True for the final put.

Note

the supported keys in kwargs are callback, dereference, overwrite and ignore_nonexisting.

copyfile(remotesource, remotedestination, *args, **kwargs)[source]

Copy a file from remote source to remote destination (On the same remote machine)

Parameters:
  • remotesource (str) – path of the remote source directory / file
  • remotedestination (str) – path of the remote destination directory / file
Raises:

IOError – if one of src or dst does not exist

copytree(remotesource, remotedestination, *args, **kwargs)[source]

Copy a folder from remote source to remote destination (On the same remote machine)

Parameters:
  • remotesource (str) – path of the remote source directory / file
  • remotedestination (str) – path of the remote destination directory / file
Raises:

IOError – if one of src or dst does not exist

exec_command_wait(command, **kwargs)[source]

Execute the command on the shell, wait for it to finish, and return the retcode, the stdout and the stderr.

Enforce the execution to be run from the pwd (as given by self.getcwd), if this is not None.

Parameters:command (str) – execute the command given as a string
Returns:a list: the retcode (int), stdout (str) and stderr (str).
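
A minimal usage sketch, assuming transport is an already-configured Transport instance (for example one obtained for a configured computer); the remote path and command are illustrative:

# 'transport' is assumed to be a configured Transport instance
with transport as t:
    t.chdir('/scratch/some_user')          # illustrative remote directory
    retcode, stdout, stderr = t.exec_command_wait('ls -l')
    # retcode is an int, stdout and stderr are strings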
get(remotepath, localpath, *args, **kwargs)[source]

Retrieve a file or folder from remote source to local destination. dst must be an absolute path (src not necessarily).

Parameters:
  • remotepath – (str) remote_folder_path
  • localpath – (str) local_folder_path
get_attribute(path)[source]

Return a FixedFieldsAttributeDict object for the file at the given path, as defined in aiida.common.extendeddicts. Each attribute object consists of a dictionary with the following keys:

  • st_size: size of files, in bytes
  • st_uid: user id of owner
  • st_gid: group id of owner
  • st_mode: protection bits
  • st_atime: time of most recent access
  • st_mtime: time of most recent modification
Parameters:path (str) – path to file
Returns:object FixedFieldsAttributeDict
get_mode(path)[source]

Return the portion of the file’s mode that can be set by chmod().

Parameters:path (str) – path to file
Returns:the portion of the file’s mode that can be set by chmod()
classmethod get_short_doc()[source]

Return the first non-empty line of the class docstring, if available

classmethod get_valid_auth_params()[source]

Return the internal list of valid auth_params

classmethod get_valid_transports()[source]
Returns:a list of existing plugin names
getcwd()[source]

Get working directory

Returns:a string identifying the current working directory
getfile(remotepath, localpath, *args, **kwargs)[source]

Retrieve a file from remote source to local destination. dst must be an absolute path (src not necessarily).

Parameters:
  • remotepath (str) – remote_folder_path
  • localpath (str) – local_folder_path
gettree(remotepath, localpath, *args, **kwargs)[source]

Retrieve a folder recursively from remote source to local destination. dst must be an absolute path (src not necessarily).

Parameters:
  • remotepath (str) – remote_folder_path
  • localpath (str) – local_folder_path
glob(pathname)[source]

Return a list of paths matching a pathname pattern.

The pattern may contain simple shell-style wildcards a la fnmatch.

gotocomputer_command(remotedir)[source]

Return a string to be run using os.system in order to connect via the transport to the remote directory.

Expected behaviors:

  • A new bash session is opened
  • A reasonable error message is produced if the folder does not exist
Parameters:remotedir (str) – the full path of the remote directory
iglob(pathname)[source]

Return an iterator which yields the paths matching a pathname pattern.

The pattern may contain simple shell-style wildcards a la fnmatch.

isdir(path)[source]

True if path is an existing directory.

Parameters:path (str) – path to directory
Returns:boolean
isfile(path)[source]

Return True if path is an existing file.

Parameters:path (str) – path to file
Returns:boolean
listdir(path='.', pattern=None)[source]

Return a list of the names of the entries in the given path. The list is in arbitrary order. It does not include the special entries ‘.’ and ‘..’ even if they are present in the directory.

Parameters:
  • path (str) – path to list (default to ‘.’)
  • pattern (str) – if used, listdir returns a list of files matching filters in Unix style. Unix only.
Returns:

a list of strings with the file/directory names

listdir_withattributes(path='.', pattern=None)[source]

Return a list of the names of the entries in the given path. The list is in arbitrary order. It does not include the special entries ‘.’ and ‘..’ even if they are present in the directory.

Parameters:
  • path (str) – path to list (default to ‘.’)
  • pattern (str) – if used, listdir returns a list of files matching filters in Unix style. Unix only.
Returns:

a list of dictionaries, one per entry. The schema of the dictionary is the following:

{
   'name': String,
   'attributes': FileAttributeObject,
   'isdir': Bool
}

where ‘name’ is the file or folder name, ‘attributes’ behaves as the output of transport.get_attribute(), and ‘isdir’ is a boolean indicating whether the entry is a directory or not.

logger

Return the internal logger. If you have set extra parameters using _set_logger_extra(), a suitable LoggerAdapter instance is created, bringing with itself also the extras.

makedirs(path, ignore_existing=False)[source]

Super-mkdir; create a leaf directory and all intermediate ones. Works like mkdir, except that any intermediate path segment (not just the rightmost) will be created if it does not exist.

Parameters:
  • path (str) – directory to create
  • ignore_existing (bool) – if set to true, it doesn’t give any error if the leaf directory does already exist
Raises:

OSError, if directory at path already exists

mkdir(path, ignore_existing=False)[source]

Create a folder (directory) named path.

Parameters:
  • path (str) – name of the folder to create
  • ignore_existing (bool) – if True, does not give any error if the directory already exists
Raises:

OSError, if directory at path already exists

normalize(path='.')[source]

Return the normalized path (on the server) of a given path. This can be used to quickly resolve symbolic links or determine what the server is considering to be the “current folder”.

Parameters:path (str) – path to be normalized
Raises:IOError – if the path can’t be resolved on the server
open()[source]

Opens a local transport channel

path_exists(path)[source]

Returns True if path exists, False otherwise.

put(localpath, remotepath, *args, **kwargs)[source]

Put a file or a directory from local src to remote dst. src must be an absolute path (dst not necessarily). Redirects to putfile and puttree.

Parameters:
  • localpath (str) – absolute path to local source
  • remotepath (str) – path to remote destination
putfile(localpath, remotepath, *args, **kwargs)[source]

Put a file from local src to remote dst. src must be an absolute path (dst not necessarily).

Parameters:
  • localpath (str) – absolute path to local file
  • remotepath (str) – path to remote file
puttree(localpath, remotepath, *args, **kwargs)[source]

Put a folder recursively from local src to remote dst. src must be an absolute path (dst not necessarily).

Parameters:
  • localpath (str) – absolute path to local folder
  • remotepath (str) – path to remote folder
remove(path)[source]

Remove the file at the given path. This only works on files; for removing folders (directories), use rmdir.

Parameters:path (str) – path to file to remove
Raises:IOError – if the path is a directory
rename(oldpath, newpath)[source]

Rename a file or folder from oldpath to newpath.

Parameters:
  • oldpath (str) – existing name of the file or folder
  • newpath (str) – new name for the file or folder
Raises:
  • IOError – if oldpath/newpath is not found
  • ValueError – if oldpath/newpath is not a valid string
rmdir(path)[source]

Remove the folder named path. This works only for empty folders. For recursive remove, use rmtree.

Parameters:path (str) – absolute path to the folder to remove
rmtree(path)[source]

Remove recursively the content at path

Parameters:path (str) – absolute path to remove

Create a symbolic link between the remote source and the remote destination.

Parameters:
  • remotesource – remote source
  • remotedestination – remote destination
whoami()[source]

Get the remote username

Returns:list of username (str), retval (int), stderr (str)
aiida.transport.TransportFactory(module)[source]

Used to return a suitable Transport subclass.

Parameters:module (str) – name of the module containing the Transport subclass
Returns:the transport subclass located in module ‘module’
exception aiida.transport.TransportInternalError[source]

Raised when there is a transport error due to an internal problem (e.g. a transport method called without opening the channel first).

aiida.transport.copy_from_remote_to_remote(transportsource, transportdestination, remotesource, remotedestination, **kwargs)[source]

Copy files or folders from a remote computer to another remote computer.

Parameters:
  • transportsource – transport to be used for the source computer
  • transportdestination – transport to be used for the destination computer
  • remotesource (str) – path to the remote source directory / file
  • remotedestination (str) – path to the remote destination directory / file
  • kwargs – keyword parameters passed to the final put, except for ‘dereference’ that is passed to the initial get

Note

it uses the method transportsource.copy_from_remote_to_remote

Existing plugins
class aiida.transport.plugins.ssh.SshTransport(machine, **kwargs)[source]

Support connection, command execution and data transfer to remote computers via SSH+SFTP.

chdir(path)[source]

Change directory of the SFTP session. Emulated internally by paramiko.

Differently from paramiko, if you pass None to chdir, nothing happens and the cwd is unchanged.

chmod(path, mode)[source]

Change permissions to path

Parameters:
  • path – path to file
  • mode – new permission bits (integer)
close()[source]

Close the SFTP channel, and the SSHClient.

Todo:correctly manage exceptions
Raises:InvalidOperation – if the channel is already open
copy(remotesource, remotedestination, dereference=False)[source]

Copy a file or a directory from remote source to remote destination. Flags used: -r: recursive copy; -f: force, makes the command non interactive; -L follows symbolic links

Parameters:
  • remotesource – file to copy from
  • remotedestination – file to copy to
  • dereference – if True, copy the content instead of copying only the symlinks. Default = False.
Raises:

IOError – if the cp execution failed.

Note

setting dereference equal to True could cause infinite loops.

copyfile(remotesource, remotedestination, dereference=False, pattern=None)[source]

Copy a file from remote source to remote destination Redirects to copy().

Parameters:
  • remotesource
  • remotedestination
  • dereference
  • pattern
copytree(remotesource, remotedestination, dereference=False, pattern=None)[source]

copy a folder recursively from remote source to remote destination Redirects to copy()

Parameters:
  • remotesource
  • remotedestination
  • dereference
  • pattern
exec_command_wait(command, stdin=None, combine_stderr=False, bufsize=-1)[source]

Executes the specified command and waits for it to finish.

Parameters:
  • command – the command to execute
  • stdin – (optional,default=None) can be a string or a file-like object.
  • combine_stderr – (optional, default=False) see docstring of self._exec_command_internal()
  • bufsize – same meaning of paramiko.
Returns:

a tuple with (return_value, stdout, stderr) where stdout and stderr are strings.

get(remotepath, localpath, callback=None, dereference=True, overwrite=True, ignore_nonexisting=False)[source]

Get a file or folder from remote to local. Redirects to getfile or gettree.

Parameters:
  • remotepath – a remote path
  • localpath – an (absolute) local path
  • dereference – follow symbolic links. Default = True (default behaviour in paramiko). False is not implemented.
  • overwrite – if True overwrites files and folders. Default = False
Raises:
  • ValueError – if local path is invalid
  • IOError – if the remotepath is not found
get_attribute(path)[source]

Returns the FileAttribute object, as specified in aiida.transport. Receives in input the path of a given file.

getcwd()[source]

Return the current working directory for this SFTP session, as emulated by paramiko. If no directory has been set with chdir, this method will return None. But in __enter__ this is set explicitly, so this should never happen within this class.

getfile(remotepath, localpath, callback=None, dereference=True, overwrite=True)[source]

Get a file from remote to local.

Parameters:
  • remotepath – a remote path
  • localpath – an (absolute) local path
  • overwrite – if True overwrites files and folders. Default = False
Raises:
  • ValueError – if local path is invalid
  • OSError – if unintentionally overwriting
gettree(remotepath, localpath, callback=None, dereference=True, overwrite=True)[source]

Get a folder recursively from remote to local.

Parameters:
  • remotepath – a remote path
  • localpath – an (absolute) local path
  • dereference – follow symbolic links. Default = True (default behaviour in paramiko). False is not implemented.
  • overwrite – if True overwrites files and folders. Default = False
Raises:
  • ValueError – if local path is invalid
  • IOError – if the remotepath is not found
  • OSError – if unintentionally overwriting
gotocomputer_command(remotedir)[source]

Specific gotocomputer string to connect to a given remote computer via ssh and directly go to the calculation folder.

isdir(path)[source]

Return True if the given path is a directory, False otherwise. Return False also if the path does not exist.

isfile(path)[source]

Return True if the given path is a file, False otherwise. Return False also if the path does not exist.

listdir(path='.', pattern=None)[source]

Get the list of files at path.

Parameters:
  • path – default = ‘.’
  • pattern – returns the list of files matching pattern. Unix only. (Use to emulate ls * for example)
makedirs(path, ignore_existing=False)[source]

Super-mkdir; create a leaf directory and all intermediate ones. Works like mkdir, except that any intermediate path segment (not just the rightmost) will be created if it does not exist.

NOTE: since os.path.split uses the separators of the host system (which could be Windows), we assume the remote computer is Linux-based and use ‘/’ as the separator!

Parameters:
  • path – directory to create (string)
  • ignore_existing – if set to true, it doesn’t give any error if the leaf directory does already exist (bool)
Raises:

OSError – If the directory already exists.

mkdir(path, ignore_existing=False)[source]

Create a folder (directory) named path.

Parameters:
  • path – name of the folder to create
  • ignore_existing – if True, does not give any error if the directory already exists
Raises:

OSError – If the directory already exists.

normalize(path)[source]

Returns the normalized path (removing double slashes, etc...)

open()[source]

Open an SSHClient to the machine, possibly using the parameters given in the __init__.

Also opens an sftp channel, ready to be used. The current working directory is set explicitly, so it is not None.

Raises:InvalidOperation – if the channel is already open
path_exists(path)[source]

Check if path exists

put(localpath, remotepath, callback=None, dereference=True, overwrite=True, ignore_nonexisting=False)[source]

Put a file or a folder from local to remote. Redirects to putfile or puttree.

Parameters:
  • localpath – an (absolute) local path
  • remotepath – a remote path
  • dereference – follow symbolic links (boolean). Default = True (default behaviour in paramiko). False is not implemented.
  • overwrite – if True overwrites files and folders (boolean). Default = True.
Raises:
  • ValueError – if local path is invalid
  • OSError – if the localpath does not exist
putfile(localpath, remotepath, callback=None, dereference=True, overwrite=True)[source]

Put a file from local to remote.

Parameters:
  • localpath – an (absolute) local path
  • remotepath – a remote path
  • overwrite – if True overwrites files and folders (boolean). Default = True.
Raises:
  • ValueError – if local path is invalid
  • OSError – if the localpath does not exist, or unintentionally overwriting
puttree(localpath, remotepath, callback=None, dereference=True, overwrite=True)[source]

Put a folder recursively from local to remote.

Parameters:
  • localpath – an (absolute) local path
  • remotepath – a remote path
  • dereference – follow symbolic links (boolean) Default = True (default behaviour in paramiko). False is not implemented.
  • overwrite – if True overwrites files and folders (boolean). Default = True
Raises:
  • ValueError – if local path is invalid
  • OSError – if the localpath does not exist, or trying to overwrite
  • IOError – if remotepath is invalid

Note

setting dereference equal to True could cause infinite loops. see os.walk() documentation

remove(path)[source]

Remove a single file at ‘path’

rename(src, dst)[source]

Rename a file or folder from src to dst.

Parameters:
  • src (str) – existing name of the file or folder
  • dst (str) – new name for the file or folder
Raises:
  • IOError – if src/dst is not found
  • ValueError – if src/dst is not a valid string
rmdir(path)[source]

Remove the folder named ‘path’ if empty.

rmtree(path)[source]

Remove a file or a directory at path, recursively. Flags used: -r: recursive remove; -f: force, makes the command non-interactive.

Parameters:path – remote path to delete
Raises:IOError – if the rm execution failed.

symlink(remotesource, remotedestination)[source]

Create a symbolic link between the remote source and the remote destination.

Parameters:
  • remotesource – remote source. Can contain a pattern.
  • remotedestination – remote destination
class aiida.transport.plugins.local.LocalTransport(**kwargs)[source]

Support copy and command execution on the same host on which AiiDA is running via direct file copy and execution commands.
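
A minimal usage sketch (not an excerpt from the official examples; it assumes AiiDA is importable and only exercises methods documented below):

from aiida.transport.plugins.local import LocalTransport

# Open the transport with the "with" statement: __enter__ opens the channel,
# __exit__ closes it.
transport = LocalTransport()
with transport:
    retval, stdout, stderr = transport.exec_command_wait('echo hello')
    entries = transport.listdir('.')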

chdir(path)[source]

Changes directory to path, emulated internally.

Parameters:path – path to cd into
Raises:OSError – if the directory does not have read attributes.

chmod(path, mode)[source]

Changes permission bits of object at path

Parameters:
  • path – path to modify
  • mode – permission bits

Raises:IOError – if path does not exist.
close()[source]

Closes the local transport channel

Raises:InvalidOperation – if the channel is already closed
copy(source, destination, dereference=False)[source]

Copies a file or a folder from ‘remote’ source to ‘remote’ destination. Automatically redirects to copyfile or copytree.

Parameters:
  • source – path to local file
  • destination – path to remote file
  • dereference – follow symbolic links. Default = False
Raises:
  • ValueError – if ‘remote’ source or destination is not valid
  • OSError – if source does not exist
copyfile(source, destination)[source]

Copies a file from ‘remote’ source to ‘remote’ destination.

Parameters:
  • source – path to local file
  • destination – path to remote file
Raises:
  • ValueError – if ‘remote’ source or destination is not valid
  • OSError – if source does not exist
copytree(source, destination, dereference=False)[source]

Copies a folder from ‘remote’ source to ‘remote’ destination.

Parameters:
  • source – path to local file
  • destination – path to remote file
  • dereference – follow symbolic links. Default = False
Raises:
  • ValueError – if ‘remote’ source or destination is not valid
  • OSError – if source does not exist
curdir

Returns the _internal_dir, if the channel is open. If possible, use getcwd() instead!

exec_command_wait(command, stdin=None)[source]

Executes the specified command and waits for it to finish.

Parameters:command – the command to execute
Returns:a tuple with (return_value, stdout, stderr) where stdout and stderr are strings.
get(source, destination, dereference=True, overwrite=True, ignore_nonexisting=False)[source]

Copies a folder or a file recursively from ‘remote’ source to ‘local’ destination. Automatically redirects to getfile or gettree.

Parameters:
  • source – path to local file
  • destination – absolute path to remote file
  • dereference – follow symbolic links. Default = True
  • overwrite – if True overwrites destination. Default = True
Raises:
  • IOError – if ‘remote’ source is not valid
  • ValueError – if ‘local’ destination is not valid
get_attribute(path)[source]

Returns an object FileAttribute, as specified in aiida.transport.

Parameters:path – the path of the given file.

getcwd()[source]

Returns the current working directory, emulated by the transport

getfile(source, destination, overwrite=True)[source]

Copies a file recursively from ‘remote’ source to ‘local’ destination.

Parameters:
  • source – path to local file
  • destination – absolute path to remote file
  • overwrite – if True overwrites destination. Default = True

Raises:
  • IOError – if ‘remote’ source is not valid or not found
  • ValueError – if ‘local’ destination is not valid
  • OSError – if unintentionally overwriting

gettree(source, destination, dereference=True, overwrite=True)[source]

Copies a folder recursively from ‘remote’ source to ‘local’ destination.

Parameters:
  • source – path to local file
  • destination – absolute path to remote file
  • dereference – follow symbolic links. Default = True
  • overwrite – if True overwrites destination. Default = True
Raises:
  • IOError – if ‘remote’ source is not valid
  • ValueError – if ‘local’ destination is not valid
  • OSError – if unintentionally overwriting
gotocomputer_command(remotedir)[source]

Return a string to be run using os.system in order to connect via the transport to the remote directory.

Expected behaviors:

  • A new bash session is opened
  • A reasonable error message is produced if the folder does not exist
Parameters:remotedir (str) – the full path of the remote directory
isdir(path)[source]

Checks if ‘path’ is a directory.

Returns:a boolean

isfile(path)[source]

Checks if object at path is a file. Returns a boolean.

listdir(path='.', pattern=None)[source]

Returns:a list containing the names of the entries in the directory.

Parameters:
  • path – default =’.’
  • pattern – if set, returns the list of files matching pattern. Unix only. (Use to emulate ls * for example)
makedirs(path, ignore_existing=False)[source]

Super-mkdir; create a leaf directory and all intermediate ones. Works like mkdir, except that any intermediate path segment (not just the rightmost) will be created if it does not exist.

Parameters:
  • path – directory to create
  • ignore_existing – if set to true, it doesn’t give any error if the leaf directory does already exist
Raises:

OSError – If the directory already exists and ignore_existing is False

mkdir(path, ignore_existing=False)[source]

Create a folder (directory) named path.

Parameters:
  • path – name of the folder to create
  • ignore_existing – if True, does not give any error if the directory already exists
Raises:

OSError – If the directory already exists.

normalize(path)[source]

Normalizes path, eliminating double slashes, etc.

Parameters:path – path to normalize

open()[source]

Opens a local transport channel

Raises:InvalidOperation – if the channel is already open
path_exists(path)[source]

Check if path exists

put(source, destination, dereference=True, overwrite=True, ignore_nonexisting=False)[source]

Copies a file or a folder from source to destination. Automatically redirects to putfile or puttree.

Parameters:
  • source – absolute path to local file
  • destination – path to remote file
  • dereference – if True follows symbolic links. Default = True
  • overwrite – if True overwrites destination. Default = True
Raises:
  • IOError – if destination is not valid
  • ValueError – if source is not valid
putfile(source, destination, overwrite=True)[source]

Copies a file from source to destination.

Parameters:
  • source – absolute path to local file
  • destination – path to remote file
  • overwrite – if True overwrites destination. Default = True
Raises:
  • IOError – if destination is not valid
  • ValueError – if source is not valid
  • OSError – if source does not exist
puttree(source, destination, dereference=True, overwrite=True)[source]

Copies a folder recursively from source to destination.

Parameters:
  • source – absolute path to local file
  • destination – path to remote file
  • dereference – follow symbolic links. Default = True
  • overwrite – if True overwrites destination. Default = True
Raises:
  • IOError – if destination is not valid
  • ValueError – if source is not valid
  • OSError – if source does not exist
remove(path)[source]

Removes a file at position path.

rename(src, dst)[source]

Rename a file or folder from src to dst.

Parameters:
  • src (str) – existing name of the file or folder
  • dst (str) – new name for the file or folder
Raises:
  • IOError – if src/dst is not found
  • ValueError – if src/dst is not a valid string
rmdir(path)[source]

Removes a folder at location path.

Parameters:path – path to remove

rmtree(path)[source]

Remove tree as rm -r would do.

Parameters:path – the path to remove (a string)

symlink(remotesource, remotedestination)[source]

Create a symbolic link between the remote source and the remote destination.

Parameters:
  • remotesource – remote source. Can contain a pattern.
  • remotedestination – remote destination
Developing a plugin

The transport class is rarely used directly by the user. It is mostly used by the ExecutionManager, which uses the transport plugin to connect to the remote computer and manage the calculation. The ExecutionManager must always be able to use the same functions, i.e. the same interface, regardless of which kind of connection is actually being used.

The generic transport class contains the minimal set of methods that an implementation must support in order to be fully compatible with the other plugins. If a method is not implemented, a NotImplementedError will be raised, interrupting the management of the calculation or of whatever else is using the transport plugin.

Since it is important that all plugins expose the same interface and the same response behavior, a set of generic tests has been written (alongside a set of implementation-specific tests). After every modification, or when implementing a new plugin, it is crucial to run the tests and verify that they all pass. Modifying the tests themselves likely means breaking backwards compatibility and/or requiring modifications to every piece of code that uses a transport plugin.

If unexpected behavior is observed during usage, the way to fix it is:

  1. Write a new test that exposes the problem (one test per problem, when possible)
  2. Fix the bug
  3. Verify that the test now passes

The importance of point 1) is often neglected, but unit testing is a useful tool that helps you avoid repeating errors. Despite appearances, it is a time-saver! Moreover, the tests help you see how the plugin is meant to be used.

As for the general functioning of the plugin, the __init__ method is used only to initialize the class instance, without actually opening the transport channel. The connection must be opened only by the __enter__ method (and closed by __exit__). The __enter__ method lets you use the transport class with the with statement (see the Python docs), in a way similar to the following:

t = TransportPlugin()
with t:
    t.do_something_remotely()

To ensure this, for example, the local plugin uses a hidden boolean variable _is_open that is set when the __enter__ and __exit__ methods are called. The Ssh plugin delegates this logic to the sftp property instead.

The other functions that require some care are the copying functions, which follow this terminology:

  1. put: from local source to remote destination
  2. get: from remote source to local destination
  3. copy: copying files from remote source to remote destination

Note that these functions must accept both files and folders: internally, they fall back to specific functions like putfile or puttree.
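
As an illustration of this terminology, here is a hypothetical sketch (transport is assumed to be an already opened transport instance; all paths are made up):

# put: local source -> remote destination
transport.put('/home/user/calc/input.in', 'run_dir/input.in')

# get: remote source -> local destination (the local path must be absolute)
transport.get('run_dir/output.out', '/home/user/calc/output.out')

# copy: remote source -> remote destination, both on the remote machine
transport.copy('run_dir/output.out', 'backup_dir/output.out')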

The last function requiring care is exec_command_wait, which is analogous to the Python subprocess module. This function gives the freedom to execute an arbitrary string as a remote command, so it could produce nasty effects if not written with care. Be sure to escape any string for bash!
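
For instance, a generic way to quote untrusted parts of a command before passing it to exec_command_wait is the standard pipes module (this is only a sketch, not an AiiDA-specific helper; transport is assumed to be an already opened transport instance):

import pipes

filename = "file name with spaces; and $pecial characters"
# pipes.quote makes the string safe to embed in a bash command line
command = "cat {}".format(pipes.quote(filename))
retval, stdout, stderr = transport.exec_command_wait(command)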

Currently, the implemented plugins are the Local and the Ssh transports. The Local transport is a simple wrapper around standard Python modules like shutil and os, whose functions are exposed through the AiiDA transport interface. The Ssh transport is instead an interface to the paramiko library.

Below you can find a template to fill in for a new transport plugin, with a minimal docstring that also works for the sphinx documentation.

class NewTransport(aiida.transport.Transport):

  def __init__(self, machine, **kwargs):
      """
      Initialize the Transport class.

      :param machine: the machine to connect to
      """

  def __enter__(self):
      """
      Open the connection
      """

  def __exit__(self, type, value, traceback):
      """
      Close the connection
      """

  def chdir(self,path):
      """
      Change directory to 'path'

      :param str path: path to change working directory into.
      :raises: IOError, if the requested path does not exist
      """

  def chmod(self,path,mode):
      """
      Change permissions of a path.

      :param str path: path to file
      :param int mode: new permissions
      """

  def copy(self,remotesource,remotedestination,*args,**kwargs):
      """
      Copy a file or a directory from remote source to remote destination
      (On the same remote machine)

      :param str remotesource: path of the remote source directory / file
      :param str remotedestination: path of the remote destination directory / file

      :raises: IOError, if source or destination does not exist
      """
      raise NotImplementedError

  def copyfile(self,remotesource,remotedestination,*args,**kwargs):
      """
      Copy a file from remote source to remote destination
      (On the same remote machine)

      :param str remotesource: path of the remote source directory / file
      :param str remotedestination: path of the remote destination directory / file

      :raises IOError: if one of src or dst does not exist
      """

  def copytree(self,remotesource,remotedestination,*args,**kwargs):
      """
      Copy a folder from remote source to remote destination
      (On the same remote machine)

      :param str remotesource: path of the remote source directory / file
      :param str remotedestination: path of the remote destination directory / file

      :raise IOError: if one of src or dst does not exist
      """

  def exec_command_wait(self,command, **kwargs):
      """
      Execute the command on the shell, waits for it to finish,
      and return the retcode, the stdout and the stderr.

      Enforce the execution to be run from the pwd (as given by
      self.getcwd), if this is not None.

      :param str command: execute the command given as a string
      :return: a tuple: the retcode (int), stdout (str) and stderr (str).
      """

  def get_attribute(self,path):
      """
      Return an object FixedFieldsAttributeDict for file in a given path,
      as defined in aiida.common.extendeddicts
      Each attribute object consists in a dictionary with the following keys:

      * st_size: size of files, in bytes

      * st_uid: user id of owner

      * st_gid: group id of owner

      * st_mode: protection bits

      * st_atime: time of most recent access

      * st_mtime: time of most recent modification

      :param str path: path to file
      :return: object FixedFieldsAttributeDict
      """

  def getcwd(self):
      """
      Get working directory

      :return: a string identifying the current working directory
      """

  def get(self, remotepath, localpath, *args, **kwargs):
      """
      Retrieve a file or folder from remote source to local destination
      dst must be an absolute path (src not necessarily)

      :param remotepath: (str) remote_folder_path
      :param localpath: (str) local_folder_path
      """

  def getfile(self, remotepath, localpath, *args, **kwargs):
      """
      Retrieve a file from remote source to local destination
      dst must be an absolute path (src not necessarily)

      :param str remotepath: remote_folder_path
      :param str localpath: local_folder_path
      """

  def gettree(self, remotepath, localpath, *args, **kwargs):
      """
      Retrieve a folder recursively from remote source to local destination
      dst must be an absolute path (src not necessarily)

      :param str remotepath: remote_folder_path
      :param str localpath: local_folder_path
      """

  def gotocomputer_command(self, remotedir):
      """
      Return a string to be run using os.system in order to connect
      via the transport to the remote directory.

      Expected behaviors:

      * A new bash session is opened

      * A reasonable error message is produced if the folder does not exist

      :param str remotedir: the full path of the remote directory
      """

  def isdir(self,path):
      """
      True if path is an existing directory.

      :param str path: path to directory
      :return: boolean
      """

  def isfile(self,path):
      """
      Return True if path is an existing file.

      :param str path: path to file
      :return: boolean
      """

  def listdir(self, path='.',pattern=None):
      """
      Return a list of the names of the entries in the given path.
      The list is in arbitrary order. It does not include the special
      entries '.' and '..' even if they are present in the directory.

      :param str path: path to list (default to '.')
      :param str pattern: if used, listdir returns a list of files matching
                          filters in Unix style. Unix only.
      :return: a list of strings
      """

  def makedirs(self,path,ignore_existing=False):
      """
      Super-mkdir; create a leaf directory and all intermediate ones.
      Works like mkdir, except that any intermediate path segment (not
      just the rightmost) will be created if it does not exist.

      :param str path: directory to create
      :param bool ignore_existing: if set to true, it doesn't give any error
                                   if the leaf directory does already exist

      :raises: OSError, if directory at path already exists
      """

  def mkdir(self,path,ignore_existing=False):
      """
      Create a folder (directory) named path.

      :param str path: name of the folder to create
      :param bool ignore_existing: if True, does not give any error if the
                                   directory already exists

      :raises: OSError, if directory at path already exists
      """

  def normalize(self,path='.'):
      """
      Return the normalized path (on the server) of a given path.
      This can be used to quickly resolve symbolic links or determine
      what the server is considering to be the "current folder".

      :param str path: path to be normalized

      :raise IOError: if the path can't be resolved on the server
      """

  def put(self, localpath, remotepath, *args, **kwargs):
      """
      Put a file or a directory from local src to remote dst.
      src must be an absolute path (dst not necessarily).
      Redirects to putfile and puttree.

      :param str localpath: absolute path to local source
      :param str remotepath: path to remote destination
      """

  def putfile(self, localpath, remotepath, *args, **kwargs):
      """
      Put a file from local src to remote dst.
      src must be an absolute path (dst not necessarily).

      :param str localpath: absolute path to local file
      :param str remotepath: path to remote file
      """

  def puttree(self, localpath, remotepath, *args, **kwargs):
      """
      Put a folder recursively from local src to remote dst.
      src must be an absolute path (dst not necessarily).

      :param str localpath: absolute path to local folder
      :param str remotepath: path to remote folder
      """

  def rename(self, src, dst):
      """
      Rename a file or folder from src to dst.

      :param str src: existing name of the file or folder
      :param str dst: new name for the file or folder

      :raises IOError: if src/dst is not found
      :raises ValueError: if src/dst is not a valid string
      """

  def remove(self,path):
      """
      Remove the file at the given path. This only works on files;
      for removing folders (directories), use rmdir.

      :param str path: path to file to remove

      :raise IOError: if the path is a directory
      """

  def rmdir(self,path):
      """
      Remove the folder named path.
      This works only for empty folders. For recursive remove, use rmtree.

      :param str path: absolute path to the folder to remove
      """
      raise NotImplementedError

  def rmtree(self,path):
      """
      Remove recursively the content at path

      :param str path: absolute path to remove
      """

aiida.scheduler documentation

We report here the generic AiiDA scheduler implementation.

Generic scheduler class
class aiida.scheduler.Scheduler[source]

Base class for all schedulers.

classmethod create_job_resource(**kwargs)[source]

Create a suitable job resource from the kwargs specified

getJobs(jobs=None, user=None, as_dict=False)[source]

Get the list of jobs and return it.

Typically, this function does not need to be modified by the plugins.

Parameters:
  • jobs (list) – a list of jobs to check; only these are checked
  • user (str) – a string with a user: only jobs of this user are checked
  • as_dict (bool) – if False (default), a list of JobInfo objects is returned. If True, a dictionary is returned, having as key the job_id and as value the JobInfo object.

Note: typically, only either jobs or user can be specified. See also comments in _get_joblist_command.

get_detailed_jobinfo(jobid)[source]

Return a string with the output of the detailed_jobinfo command.

At the moment, the output text is just retrieved and stored for logging purposes, but no parsing is performed.

classmethod get_short_doc()[source]

Return the first non-empty line of the class docstring, if available

get_submit_script(job_tmpl)[source]

Return the submit script as a string.

Parameters:job_tmpl – an aiida.scheduler.datastructures.JobTemplate object.

The plugin returns something like

#!/bin/bash <- this shebang line could be configurable in the future
scheduler-dependent stuff to choose numnodes, numcores, walltime, ...
prepend_computer [also from calcinfo, joined with the following?]
prepend_code [from calcinfo]
output of _get_script_main_content
postpend_code
postpend_computer

kill(jobid)[source]

Kill a remote job, and try to parse the output message of the scheduler to check if the scheduler accepted the command.

Note: on some schedulers, even if the command is accepted, it may take some seconds for the job to actually disappear from the queue.

Parameters:jobid (str) – the job id to be killed
Returns:True if everything seems ok, False otherwise.
logger

Return the internal logger.

set_transport(transport)[source]

Set the transport to be used to query the machine or to submit scripts. This class assumes that the transport is open and active.

submit_from_script(working_directory, submit_script)[source]

Goes to the working directory and submits the submit_script.

Return a string with the JobID in a valid format to be used for querying.

Typically, this function does not need to be modified by the plugins.

transport

Return the transport set for this scheduler.

aiida.scheduler.SchedulerFactory(module)[source]

Used to load a suitable Scheduler subclass.

Parameters:module (str) – a string with the module name
Returns:the scheduler subclass contained in module ‘module’
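
A minimal usage sketch (it assumes a ‘slurm’ scheduler plugin is available and that transport is an already opened transport to the corresponding machine; names are illustrative):

from aiida.scheduler import SchedulerFactory

SlurmScheduler = SchedulerFactory('slurm')  # load the Scheduler subclass
scheduler = SlurmScheduler()
scheduler.set_transport(transport)          # the transport must already be open

# Query the queue, getting a dictionary {job_id: JobInfo}
jobs = scheduler.getJobs(as_dict=True)
for job_id, job_info in jobs.iteritems():
    print job_id, job_info.job_state
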
Scheduler datastructures

This module defines the main data structures used by the Scheduler.

In particular, there is the definition of possible job states (job_states), the data structure to be filled for job submission (JobTemplate), and the data structure that is returned when querying for jobs in the scheduler (JobInfo).

class aiida.scheduler.datastructures.JobInfo(init=None)[source]

Contains properties for a job in the queue. Most of the fields are taken from DRMAA v.2.

Note that default fields may be undefined. This is an expected behavior and the application must cope with this case. An example for instance is the exit_status for jobs that have not finished yet; or features not supported by the given scheduler.

Fields:

  • job_id: the job ID on the scheduler
  • title: the job title, as known by the scheduler
  • exit_status: the exit status of the job as reported by the operating system on the execution host
  • terminating_signal: the UNIX signal that was responsible for the end of the job.
  • annotation: human-readable description of the reason for the job being in the current state or substate.
  • job_state: the job state (one of those defined in aiida.scheduler.datastructures.job_states)
  • job_substate: a string with the implementation-specific sub-state
  • allocated_machines: a list of machines used for the current job. This is a list of MachineInfo objects.
  • job_owner: the job owner as reported by the scheduler
  • num_mpiprocs: the total number of requested MPI procs
  • num_cpus: the total number of requested CPUs (cores) [may be undefined]
  • num_machines: the number of machines (i.e., nodes), required by the job. If allocated_machines is not None, this number must be equal to len(allocated_machines). Otherwise, for schedulers not supporting the retrieval of the full list of allocated machines, this attribute can be used to know at least the number of machines.
  • queue_name: The name of the queue in which the job is queued or running.
  • wallclock_time_seconds: the accumulated wallclock time, in seconds
  • requested_wallclock_time_seconds: the requested wallclock time, in seconds
  • cpu_time: the accumulated cpu time, in seconds
  • submission_time: the absolute time at which the job was submitted, of type datetime.datetime
  • dispatch_time: the absolute time at which the job first entered the ‘started’ state, of type datetime.datetime
  • finish_time: the absolute time at which the job first entered the ‘finished’ state, of type datetime.datetime
class aiida.scheduler.datastructures.JobResource(init=None)[source]

A class to store the job resources. It must be inherited and redefined by the specific plugin, that should contain a _job_resource_class attribute pointing to the correct JobResource subclass.

It should at least define the get_tot_num_mpiprocs() method, plus an __init__ to accept its set of variables.

Typical attributes are:

  • num_machines
  • num_mpiprocs_per_machine

or (e.g. for SGE)

  • tot_num_mpiprocs
  • parallel_env

The __init__ should take care of checking the values. The init should raise only ValueError or TypeError on invalid parameters.

classmethod accepts_default_mpiprocs_per_machine()[source]

Return True if this JobResource accepts a ‘default_mpiprocs_per_machine’ key, False otherwise.

Should be implemented in each subclass.

get_tot_num_mpiprocs()[source]

Return the total number of cpus of this job resource.

classmethod get_valid_keys()[source]

Return a list of valid keys to be passed to the __init__

class aiida.scheduler.datastructures.JobTemplate(init=None)[source]

A template for submitting jobs. This contains all required information to create the job header.

The required fields are: working_directory, job_name, num_machines,
num_mpiprocs_per_machine, argv.

Fields:

  • submit_as_hold: if set, the job will be in a ‘hold’ status right after the submission

  • rerunnable: if the job is rerunnable (boolean)

  • job_environment: a dictionary with environment variables to set before the execution of the code.

  • working_directory: the working directory for this job. During submission, the transport will first do a ‘chdir’ to this directory, and then possibly set a scheduler parameter, if this is supported by the scheduler.

  • email: an email address for sending emails on job events.

  • email_on_started: if True, ask the scheduler to send an email when the job starts.

  • email_on_terminated: if True, ask the scheduler to send an email when the job ends. This should also send emails on job failure, when possible.

  • job_name: the name of this job. The actual name of the job can be different from the one specified here, e.g. if there are unsupported characters, or the name is too long.

  • sched_output_path: a (relative) file name for the stdout of this job

  • sched_error_path: a (relative) file name for the stderr of this job

  • sched_join_files: if True, write both stdout and stderr on the same file (the one specified for stdout)

  • queue_name: the name of the scheduler queue (sometimes also called partition), on which the job will be submitted.

  • job_resource: a suitable JobResource subclass with information on how many nodes and cpus it should use. It must be an instance of the aiida.scheduler.Scheduler._job_resource_class class. Use the Scheduler.create_job_resource method to create it.

  • num_machines: how many machines (or nodes) should be used

  • num_mpiprocs_per_machine: how many MPI procs should be used on each machine (or node).

  • priority: a priority for this job. Should be in the format accepted by the specific scheduler.

  • max_memory_kb: The maximum amount of memory the job is allowed to allocate ON EACH NODE, in kilobytes

  • max_wallclock_seconds: The maximum wall clock time that all processes of a job are allowed to exist, in seconds

  • custom_scheduler_commands: a string that will be inserted right after the last scheduler command, and before any other non-scheduler command; useful if some specific flag needs to be added and is not supported by the plugin

  • prepend_text: a (possibly multi-line) string to be inserted in the scheduler script before the main execution line

  • append_text: a (possibly multi-line) string to be inserted in the scheduler script after the main execution line

  • import_sys_environment: import the system environment variables

  • codes_info: a list of aiida.common.datastructures.CalcInfo objects. Each contains the information necessary to run a single code. At the moment, it can contain:

    • cmdline_parameters: a list of strings with the command line arguments of the program to run. This is the main program to be executed. NOTE: The first one is the executable name. For MPI runs, this will probably be “mpirun” or a similar program; this has to be chosen at an upper level.
    • stdin_name: the (relative) file name to be used as stdin for the program specified with argv.
    • stdout_name: the (relative) file name to be used as stdout for the program specified with argv.
    • stderr_name: the (relative) file name to be used as stderr for the program specified with argv.
    • join_files: if True, stderr is redirected on the same file specified for stdout.
  • codes_run_mode: sets the run_mode with which the (multiple) codes have to be executed. For example, parallel execution:

    mpirun -np 8 a.x &
    mpirun -np 8 b.x &
    wait
    

    The serial execution would be without the &’s. Values are given by aiida.common.datastructures.code_run_modes.
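
As an illustration of the fields above, here is a minimal, hypothetical sketch of filling a JobTemplate (the values are made up and only a few fields are set; in practice the job_resource should be created with Scheduler.create_job_resource, as noted above):

from aiida.scheduler.datastructures import JobTemplate, NodeNumberJobResource

job_tmpl = JobTemplate()
job_tmpl.job_name = 'test_job'
job_tmpl.working_directory = '/scratch/someuser/aiida_run'  # made-up path
job_tmpl.queue_name = 'debug'
job_tmpl.max_wallclock_seconds = 30 * 60
# For node/cpu-based schedulers the resource is a NodeNumberJobResource
job_tmpl.job_resource = NodeNumberJobResource(num_machines=1,
                                              num_mpiprocs_per_machine=4)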

class aiida.scheduler.datastructures.MachineInfo(init=None)[source]

Similarly to what is defined in the DRMAA v.2 as SlotInfo; this identifies each machine (also called ‘node’ on some schedulers) on which a job is running, and how many CPUs are being used. (Some of them could be undefined)

  • name: name of the machine
  • num_cpus: number of cores used by the job on this machine
  • num_mpiprocs: number of MPI processes used by the job on this machine
class aiida.scheduler.datastructures.NodeNumberJobResource(**kwargs)[source]

An implementation of JobResource for schedulers that support the specification of a number of nodes and a number of cpus per node

classmethod accepts_default_mpiprocs_per_machine()[source]

Return True if this JobResource accepts a ‘default_mpiprocs_per_machine’ key, False otherwise.

get_tot_num_mpiprocs()[source]

Return the total number of cpus of this job resource.

classmethod get_valid_keys()[source]

Return a list of valid keys to be passed to the __init__
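
A minimal, hypothetical usage sketch of NodeNumberJobResource (the values are illustrative):

from aiida.scheduler.datastructures import NodeNumberJobResource

resource = NodeNumberJobResource(num_machines=2, num_mpiprocs_per_machine=16)
print resource.get_tot_num_mpiprocs()  # 2 * 16 = 32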

class aiida.scheduler.datastructures.ParEnvJobResource(**kwargs)[source]

An implementation of JobResource for schedulers that support the specification of a parallel environment (a string) + the total number of nodes

classmethod accepts_default_mpiprocs_per_machine()[source]

Return True if this JobResource accepts a ‘default_mpiprocs_per_machine’ key, False otherwise.

get_tot_num_mpiprocs()[source]

Return the total number of cpus of this job resource.

aiida.cmdline documentation

Baseclass
class aiida.cmdline.baseclass.VerdiCommand[source]

This command has no documentation yet.

complete(subargs_idx, subargs)[source]

Method called when the user asks for the bash completion. Print a list of valid keywords. Returning without printing will use standard bash completion.

Parameters:
  • subargs_idx – the index of the subargs where the TAB key was pressed (0 is the first element of subargs)
  • subargs – a list of subarguments to this command
classmethod get_command_name()[source]

Return the name of the verdi command associated to this class. By default, the lower-case version of the class name.

get_full_command_name(with_exec_name=True)[source]

Return the current command name. Also tries to get the subcommand name.

Parameters:with_exec_name – if True, return the full string, including the executable name (‘verdi’). If False, omit it.
run(*args)[source]

Method executed when the command is called from the command line.

class aiida.cmdline.baseclass.VerdiCommandWithSubcommands[source]

Used for commands with subcommands. Just define, in the __init__, the self.valid_subcommands dictionary, in the format:

self.valid_subcommands = {
    'uploadfamily': (self.uploadfamily, self.complete_auto),
    'listfamilies': (self.listfamilies, self.complete_none),
    }

where the key is the subcommand name to give on the command line, and the value is a tuple of length 2: the first element is the function to call on execution, the second the function to call for completion.

This class already defines the complete_auto and complete_none methods, which respectively call the default bash completion for filenames/folders, or give no completion suggestion. Other functions can of course be defined.
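
For illustration, a hypothetical command with subcommands following these conventions could look like the sketch below (the class and its methods are made up, not real AiiDA commands):

from aiida.cmdline.baseclass import VerdiCommandWithSubcommands

class Family(VerdiCommandWithSubcommands):
    """
    Manage hypothetical families (short description on the first line).

    Longer description, shown by the 'help' command.
    """
    def __init__(self):
        self.valid_subcommands = {
            'uploadfamily': (self.uploadfamily, self.complete_auto),
            'listfamilies': (self.listfamilies, self.complete_none),
        }

    def uploadfamily(self, *args):
        print "uploading a family (illustrative only)"

    def listfamilies(self, *args):
        print "listing families (illustrative only)"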

Todo

Improve the docstrings for commands with subcommands.

get_full_command_name(*args, **kwargs)[source]

Return the current command name. Also tries to get the subcommand name.

Also tries to see if the caller function was one specific submethod.

Parameters:with_exec_name – if True, return the full string, including the executable name (‘verdi’). If False, omit it.
Verdi lib

Command line commands for the main executable ‘verdi’ of aiida

If you want to define a new command line command, just define a new class inheriting from VerdiCommand, and define a run(self, *args) method accepting a variable-length number of parameters args (the command-line parameters), which will be invoked when this executable is called as verdi NAME.

Don’t forget to add the docstring to the class: the first line will be the short description, the following ones the long description.
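
A hypothetical minimal command following this rule (the class below is made up, not part of AiiDA):

from aiida.cmdline.baseclass import VerdiCommand

class Hello(VerdiCommand):
    """
    Print a greeting (short description).

    Longer description of the hypothetical 'verdi hello' command.
    """
    def run(self, *args):
        print "Hello from verdi!"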

class aiida.cmdline.verdilib.Completion[source]

Manage bash completion

Return a list of available commands, separated by spaces. Calls the correct function of the command if the TAB has been pressed after the first command.

Returning without printing will use the default bash completion.

class aiida.cmdline.verdilib.CompletionCommand[source]

Return the bash completion function to put in ~/.bashrc

This command prints on screen the function to be inserted in your .bashrc. You can copy and paste the output, or simply add eval "$(verdi completioncommand)" to your .bashrc, AFTER having added the aiida/bin directory to the path.

run(*args)[source]

I put the documentation here, and I don’t print it, so that we don’t clutter the .bashrc too much.

  • “${THE_WORDS[@]}” (with the @) puts each element as a different parameter; note that the variable expansion etc. is performed
  • I add a ‘x’ at the end and then remove it; in this way, $( ) will not remove trailing spaces
  • If the completion command did not print anything, we use the default bash completion for filenames
  • If instead the code prints something empty, thanks to the workaround above $OUTPUT is not empty, so we do go to the ‘else’ case and then no substitution is suggested.
class aiida.cmdline.verdilib.Help[source]

Describe a specific command

Pass a further argument to get a description of a given command.

class aiida.cmdline.verdilib.Install[source]

Install/setup aiida for the current user

This command creates the ~/.aiida folder in the home directory of the user, interactively asks for the database settings and the repository location, does a setup of the daemon and runs a migrate command to create/setup the database.

complete(subargs_idx, subargs)[source]

No completion after ‘verdi install’.

class aiida.cmdline.verdilib.ListParams[source]

List available commands

List available commands and their short description. For the long description, use the ‘help’ command.

exception aiida.cmdline.verdilib.ProfileParsingException(*args, **kwargs)[source]

Exception raised when parsing the profile command line option, if only -p is provided, and no profile is specified

class aiida.cmdline.verdilib.Quicksetup[source]

Quick setup for the most common usecase (1 user, 1 machine).

Creates a database user ‘aiida_qs_<login-name>’ with random password (if it doesn’t exist). Creates a database ‘<profile>_<username>’ (if it exists, prompts user to use or change the name).

class aiida.cmdline.verdilib.Run[source]

Execute an AiiDA script

class aiida.cmdline.verdilib.Setup[source]

Setup aiida for the current user

This command creates the ~/.aiida folder in the home directory of the user, interactively asks for the database settings and the repository location, does a setup of the daemon and runs a migrate command to create/setup the database.

complete(subargs_idx, subargs)[source]

No completion after ‘verdi install’.

aiida.cmdline.verdilib.exec_from_cmdline(argv)[source]

The main function to be called. Pass as parameter the sys.argv.

aiida.cmdline.verdilib.get_command_suggestion(command)[source]

A function that prints on stderr a list of similar commands

aiida.cmdline.verdilib.get_listparams()[source]

Return a string with the list of parameters, to be printed

The advantage of this function is that the calling routine can choose to print it on stdout or stderr, depending on the needs.

aiida.cmdline.verdilib.parse_profile(argv, merge_equal=False)[source]

Parse the argv to see if a profile has been specified, return it with the command position shift (index where the commands start)

Parameters:merge_equal – if True, merge things like (‘verdi’, ‘--profile’, ‘=’, ‘x’, ‘y’) to (‘verdi’, ‘--profile=x’, ‘y’) but then return the correct index for the original array.
Raises:ProfileParsingException – if there is only ‘verdi’ specified, or if only ‘verdi -p’ (in these cases, one has respectively exception.minus_p_provided equal to False or True)
aiida.cmdline.verdilib.setup(profile, only_config, non_interactive=False, **kwargs)[source]

Set up an aiida profile and aiida user (and the aiida default user).

Parameters:
  • profile – Profile name
  • only_config – do not create a new user
  • non_interactive – do not prompt for configuration values, fail if not all values are given as kwargs.
  • backend – one of ‘django’, ‘sqlalchemy’
  • email – valid email address for the user
  • db_host – hostname for the database
  • db_port – port to connect to the database
  • db_user – name of the db user
  • db_pass – password of the db user
aiida.cmdline.verdilib.update_environment(*args, **kwds)[source]

Used as a context manager, changes sys.argv with the new_argv argument, and restores it upon exit.

Daemon
class aiida.cmdline.commands.daemon.Daemon[source]

Manage the AiiDA daemon

This command allows to interact with the AiiDA daemon. Valid subcommands are:

  • start: start the daemon
  • stop: stop the daemon
  • restart: restart the aiida daemon, waiting for it to cleanly exit before restarting it.
  • status: inquire the status of the Daemon.
  • logshow: show the log in a continuous fashion, similar to the ‘tail -f’ command. Press CTRL+C to exit.
__init__()[source]

A dictionary with valid commands and functions to be called: start, stop, status and restart.

configure_user(*args)[source]

Configure the user that can run the daemon.

daemon_logshow(*args)[source]

Show the log of the daemon, press CTRL+C to quit.

daemon_restart(*args)[source]

Restart the daemon. Before restarting, wait for the daemon to really shut down.

daemon_start(*args)[source]

Start the daemon

daemon_status(*args)[source]

Print the status of the daemon

daemon_stop(*args, **kwargs)[source]

Stop the daemon.

Parameters:wait_for_death – If True, also verifies that the process was already killed. It attempts at most max_retries times, with sleep_between_retries seconds between one attempt and the following one (both variables are for the time being hardcoded in the function).
Returns:None if wait_for_death is False. True/False if the process was actually dead or after all the retries it was still alive.
get_daemon_pid()[source]

Return the daemon pid, as read from the supervisord.pid file. Return None if no pid is found (or the pid is not valid).

kill_daemon()[source]

This is the actual call that kills the daemon.

There are some print statements inside, but no sys.exit, so it is safe to be called from other parts of the code.

aiida.cmdline.commands.daemon.is_daemon_user()[source]

Return True if the user is the current daemon user, False otherwise.

Data
class aiida.cmdline.commands.data.Data[source]

Setup and manage data specific types

There is a list of subcommands for managing specific types of data. For instance, ‘data upf’ manages pseudopotentials in the UPF format.

__init__()[source]

A dictionary with valid commands and functions to be called.

class aiida.cmdline.commands.data.Depositable[source]

Provides shell completion for depositable data nodes.

Note

Classes inheriting Depositable MUST NOT contain attributes starting with _deposit_ that are not plugins for depositing.

deposit(*args)[source]

Deposit the data node to a given database.

Parameters:args – a namespace with parsed command line parameters.
get_deposit_plugins()[source]

Get the list of all implemented deposition methods for data class.

class aiida.cmdline.commands.data.Exportable[source]

Provides shell completion for exportable data nodes.

Note

Classes inheriting Exportable MUST NOT contain attributes starting with _export_ that are not plugins for exporting.

append_export_cmdline_arguments(parser)[source]

Function (to be overloaded in a subclass) to add custom export command line arguments.

Parameters:parser – a ArgParse parser object
Returns:change the parser in place
export(*args)[source]

Export the data node to a given format.

get_export_plugins()[source]

Get the list of all implemented exporters for data class.

print_or_store(node, output_fname, fileformat, other_args={}, overwrite=False)[source]

Depending on the parameters, either prints the (single) output file on screen, or stores the file(s) on disk.

Parameters:
  • node – the Data node to print or store on disk
  • output_fname – The filename to store the main file. If empty or None, print instead
  • fileformat – a string to pass to the _exportstring method
  • other_args – a dictionary with additional kwargs to pass to _exportstring
  • overwrite – if False, stops if any file already exists (when output_fname is not empty)
Note:

this function calls directly sys.exit(1) when an error occurs (or e.g. if check_overwrite is True and a file already exists).

class aiida.cmdline.commands.data.Importable[source]

Provides shell completion for importable data nodes.

Note

Classes inheriting Importable MUST NOT contain attributes starting with _import_ that are not plugins for importing.

get_import_plugins()[source]

Get the list of all implemented importers for data class.

class aiida.cmdline.commands.data.Listable[source]

Provides shell completion for listable data nodes.

Note

Classes inheriting Listable MUST define a value for the property dataclass (preferably in __init__), which has to point to the correct *Data class.
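
A hypothetical sketch of this requirement (the class name and the chosen Data class are only illustrative):

from aiida.cmdline.commands.data import Listable

class _ListableParameter(Listable):
    def __init__(self):
        from aiida.orm.data.parameter import ParameterData
        # 'dataclass' must point to the *Data class this command lists
        self.dataclass = ParameterData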

append_list_cmdline_arguments(parser)[source]

Append additional command line parameters, that are later parsed and used in the query construction.

Parameters:parser – instance of argparse.ArgumentParser
get_column_names()[source]

Return the list with column names.

Note

neither the number nor correspondence of column names and actual columns in the output from the query() are checked.

list(*args)[source]

List all instances of given data class.

Parameters:args – a list of command line arguments.
query(args)[source]

Perform the query and return information for the list.

Parameters:args – a namespace with parsed command line parameters.
Returns:table (list of lists) with information, describing nodes. Each row describes a single hit.
query_group(q_object, args)[source]

Subselect to filter data nodes by their group.

Parameters:
  • q_object – a query object
  • args – a namespace with parsed command line parameters.
query_group_qb(filters, args)[source]

Subselect to filter data nodes by their group.

Parameters:
  • filters – the filters to be enriched.
  • args – a namespace with parsed command line parameters.
query_past_days(q_object, args)[source]

Subselect to filter data nodes by their age.

Parameters:
  • q_object – a query object
  • args – a namespace with parsed command line parameters.
query_past_days_qb(filters, args)[source]

Subselect to filter data nodes by their age.

Parameters:
  • filters – the filters to be enriched.
  • args – a namespace with parsed command line parameters.
class aiida.cmdline.commands.data.Visualizable[source]

Provides shell completion for visualizable data nodes.

Note

Classes inheriting Visualizable MUST NOT contain attributes starting with _show_ that are not plugins for visualization.

In order to specify a default visualization format, one has to override the _default_show_format property (preferably in __init__), setting it to the name of the default visualization tool.
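
A hypothetical sketch of setting the default format (the class name and the format are only illustrative):

from aiida.cmdline.commands.data import Visualizable

class _ShowableStructure(Visualizable):
    def __init__(self):
        # name of the default visualization tool, i.e. the plugin whose
        # corresponding _show_<name> method is used when no format is given
        self._default_show_format = 'jmol'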

get_show_plugins()[source]

Get the list of all implemented plugins for visualizing the structure.

show(*args)[source]

Show the data node with a visualization program.

aiida.execmanager documentation

Execution Manager

This file contains the main routines to submit, check and retrieve calculation results. These are general and contain only the main logic; where appropriate, the routines make reference to the suitable plugins for all plugin-specific operations.

aiida.daemon.execmanager.retrieve_computed_for_authinfo(authinfo)[source]
aiida.daemon.execmanager.retrieve_files_from_list(calculation, transport, folder, retrieve_list)[source]

Retrieve all the files in the retrieve_list from the remote into the local folder instance through the transport. The entries in the retrieve_list can be of two types:

  • a string
  • a list

If it is a string, it represents the remote absolute filepath of the file. If the item is a list, the elements will correspond to the following:

  • remotepath
  • localpath
  • depth

If the remotepath contains file patterns with wildcards, the localpath will be treated as the work directory of the folder and the depth integer determines up to what level of the original remotepath nesting the files will be copied (a hypothetical example is given after the parameter list below).

Parameters:
  • transport – the Transport instance
  • folder – a local Folder instance for the transport to store files into
  • retrieve_list – the list of files to retrieve
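
A hypothetical retrieve_list illustrating both kinds of entries (the file names are made up):

retrieve_list = [
    'aiida.out',                            # a string: remote (absolute) file path
    ['out_dir/data_*.dat', 'results', 1],   # a list: [remotepath with wildcards, localpath, depth]
]
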
aiida.daemon.execmanager.retrieve_jobs()[source]
aiida.daemon.execmanager.submit_calc(calc, authinfo, transport=None)[source]

Submit a calculation

Note:

if no transport is passed, a new transport is opened and then closed within this function. If you want to use an already opened transport, pass it as a further parameter. In this case, the transport has to be already open, and must coincide with the transport of the computer defined by the authinfo.

Parameters:
  • calc – the calculation to submit (an instance of the aiida.orm.JobCalculation class)
  • authinfo – the authinfo for this calculation.
  • transport – if passed, must be an already opened transport. No checks are done on the consistency of the given transport with the transport of the computer defined in the authinfo.
aiida.daemon.execmanager.submit_jobs()[source]

Submit all jobs in the TOSUBMIT state.

aiida.daemon.execmanager.submit_jobs_with_authinfo(authinfo)[source]

Submit jobs in TOSUBMIT status belonging to user and machine as defined in the ‘dbauthinfo’ table.

aiida.daemon.execmanager.update_jobs()[source]

Calls an update for each pair (machine, aiidauser).

aiida.daemon.execmanager.update_running_calcs_status(authinfo)[source]

Update the states of calculations in WITHSCHEDULER status belonging to user and machine as defined in the ‘dbauthinfo’ table.

aiida.backends.djsite documentation

Database schema
class aiida.backends.djsite.db.models.DbAttribute(*args, **kwargs)[source]

This table stores attributes that uniquely define the content of the node. Therefore, their modification corrupts the data.

class aiida.backends.djsite.db.models.DbAttributeBaseClass(*args, **kwargs)[source]

Abstract base class for tables storing element-attribute-value data. Element is the dbnode; attribute is the key name. Value is the specific value to store.

This table has different SQL columns to store different types of data, and a datatype field to know the actual datatype.

Moreover, this class unpacks dictionaries and lists when possible, so that it is possible to query inside recursive lists and dicts.

classmethod del_value_for_node(dbnode, key)[source]

Delete an attribute from the database for the given dbnode.

Note:

no exception is raised if no attribute with the given key is found in the DB.

Parameters:
  • dbnode – the dbnode for which you want to delete the key.
  • key – the key to delete.
classmethod get_all_values_for_node(dbnode)[source]

Return a dictionary with all attributes for the given dbnode.

Returns:a dictionary where each key is a level-0 attribute stored in the Db table, correctly converted to the right type.
classmethod get_all_values_for_nodepk(dbnodepk)[source]

Return a dictionary with all attributes for the dbnode with given PK.

Returns:a dictionary where each key is a level-0 attribute stored in the Db table, correctly converted to the right type.
classmethod get_value_for_node(dbnode, key)[source]

Get an attribute from the database for the given dbnode.

Returns:the value stored in the Db table, correctly converted to the right type.
Raises:AttributeError – if no key is found for the given dbnode
classmethod has_key(dbnode, key)[source]

Return True if the given dbnode has an attribute with the given key, False otherwise.

classmethod list_all_node_elements(dbnode)[source]

Return a django queryset with the attributes of the given node, only at depth level zero (i.e., keys not containing the separator).

classmethod set_value_for_node(dbnode, key, value, with_transaction=True, stop_if_existing=False)[source]

This is the raw-level method that accesses the DB. No checks are done to prevent the user from (re)setting a valid key. To be used only internally.

Todo:

there may be some error on concurrent write; not checked in this unlucky case!

Parameters:
  • dbnode – the dbnode for which the attribute should be stored; if an integer is passed, this is used as the PK of the dbnode, without any further check (for speed reasons)
  • key – the key of the attribute to store; must be a level-zero attribute (i.e., no separators in the key)
  • value – the value of the attribute to store
  • with_transaction – if True (default), do this within a transaction, so that nothing gets stored if a subitem cannot be created. Otherwise, if this parameter is False, no transaction management is performed.
  • stop_if_existing – if True, it will stop with a UniquenessError exception if the key already exists for the given node. Otherwise, it will first delete the old value, if existent. Setting this to True is useful if you want to use a given attribute as a “locking” value, e.g. to avoid performing an action twice on the same node. Note that, if you are using transactions, you may get the error only when the transaction is committed.
Raises:

ValueError – if the key contains the separator symbol used internally to unpack dictionaries and lists (defined in cls._sep).

class aiida.backends.djsite.db.models.DbAuthInfo(*args, **kwargs)[source]

Table that pairs aiida users and computers, with all required authentication information.

get_transport()[source]

Given a computer and an aiida user (as entries of the DB) return a configured transport to connect to the computer.

class aiida.backends.djsite.db.models.DbCalcState(*args, **kwargs)[source]

Store the state of calculations.

The advantage of a table (with uniqueness constraints) is that this disallows entering twice in the same state (e.g., retrieving twice).

class aiida.backends.djsite.db.models.DbComment(id, uuid, dbnode_id, ctime, mtime, user_id, content)[source]
class aiida.backends.djsite.db.models.DbComputer(*args, **kwargs)[source]

Table of computers or clusters.

Attributes:

  • name: A name to be used to refer to this computer. Must be unique.
  • hostname: Fully-qualified hostname of the host
  • transport_type: a string with a valid transport type

Note: other things that may be set in the metadata:

  • mpirun command
  • num cores per node
  • max num cores
  • workdir: Full path of the aiida folder on the host. It can contain the string {username} that will be substituted by the username of the user on that machine. The actual workdir is then obtained as workdir.format(username=THE_ACTUAL_USERNAME) Example: workdir = “/scratch/{username}/aiida/”
  • allocate full node = True or False
  • ... (further limits per user etc.)
classmethod get_dbcomputer(computer)[source]

Return a DbComputer from its name (or from another Computer or DbComputer instance)

class aiida.backends.djsite.db.models.DbExtra(*args, **kwargs)[source]

This table stores extra data, still in the key-value format, that the user can attach to a node. Therefore, their modification simply changes the user-defined data, but does not corrupt the node (it will still be loadable without errors). Could be useful to add “duplicate” information for easier querying, or for tagging nodes.

class aiida.backends.djsite.db.models.DbGroup(*args, **kwargs)[source]

A group of nodes.

Any group of nodes can be created, but some groups may have specific meaning if they satisfy specific rules (for instance, groups of UpfData objects are pseudopotential families - if no two pseudos are included for the same atomic element).

class aiida.backends.djsite.db.models.DbLink(*args, **kwargs)[source]

Direct connection between two dbnodes. The label identifies the link type.

class aiida.backends.djsite.db.models.DbLock(key, creation, timeout, owner)[source]
class aiida.backends.djsite.db.models.DbLog(id, time, loggername, levelname, objname, objpk, message, metadata)[source]
class aiida.backends.djsite.db.models.DbMultipleValueAttributeBaseClass(*args, **kwargs)[source]

Abstract base class for tables storing attribute + value data, of different data types (without any association to a Node).

classmethod create_value(key, value, subspecifier_value=None, other_attribs={})[source]

Create a new list of attributes, without storing them, associated with the current key/value pair (and to the given subspecifier, e.g. the DbNode for DbAttributes and DbExtras).

Note:

No queries are made to the DB; in particular, no check is done on the existence of the given nodes.

Parameters:
  • key – a string with the key to create (can contain the separator cls._sep if this is a sub-attribute: indeed, this function calls itself recursively)
  • value – the value to store (a basic data type or a list or a dict)
  • subspecifier_value – must be None if this class has no subspecifier set (e.g., the DbSetting class). Must be the value of the subspecifier (e.g., the dbnode) for classes that define it (e.g. DbAttribute and DbExtra)
  • other_attribs – a dictionary of other parameters, to store only on the level-zero attribute (e.g. for description in DbSetting).
Returns:

always a list of class instances; it is the user's responsibility to store such entries (typically with a Django bulk_create() call).

classmethod del_value(key, only_children=False, subspecifier_value=None)[source]

Delete a value associated with the given key (if existing).

Note:

No exceptions are raised if no entry is found.

Parameters:
  • key – the key to delete. Can contain the separator cls._sep if you want to delete a subkey.
  • only_children – if True, delete only children and not the entry itself.
  • subspecifier_value – must be None if this class has no subspecifier set (e.g., the DbSetting class). Must be the value of the subspecifier (e.g., the dbnode) for classes that define it (e.g. DbAttribute and DbExtra)
classmethod get_query_dict(value)[source]

Return a dictionary that can be used in a django filter to query for a specific value. This takes care of checking the type of the input parameter ‘value’ and of converting it to the right query.

Parameters:value – The value that should be queried. Note: can only be base datatype, not a list or dict. For those, query directly for one of the sub-elements.
Todo:see if we want to give the possibility to query for the existence of a (possibly empty) dictionary or list, or for their length.
Note:this will of course not find the data if it was stored in the DB as serialized JSON.
Returns:a dictionary to be used in the django .filter() method. For instance, if ‘value’ is a string, it will return the dictionary {'datatype': 'txt', 'tval': value}.
Raise:ValueError if value is not of a base datatype (string, integer, float, bool, None, or date)
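For instance, a minimal sketch (assuming the concrete DbAttribute subclass and an already-loaded database environment), unpacking the returned dictionary into a Django .filter() call:

from aiida.backends.djsite.db.models import DbAttribute

# For a string value the returned dictionary is {'datatype': 'txt', 'tval': 'some text'}
query_dict = DbAttribute.get_query_dict('some text')

# Unpack it as keyword arguments of a Django queryset filter
rows = DbAttribute.objects.filter(**query_dict)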
getvalue()[source]

This can be called on a given row and will get the corresponding value, casting it correctly.

long_field_length()

Return the length of “long” fields. This is used, for instance, for the ‘key’ field of attributes. This returns 1024 typically, but it returns 255 if the backend is mysql.

Note:Call this function only AFTER having called load_dbenv!
classmethod set_value(key, value, with_transaction=True, subspecifier_value=None, other_attribs={}, stop_if_existing=False)[source]

Set a new value in the DB, possibly associated to the given subspecifier.

Note:

This method also stores the value directly in the DB.

Parameters:
  • key – a string with the key to create (must be a level-0 attribute, that is it cannot contain the separator cls._sep).
  • value – the value to store (a basic data type or a list or a dict)
  • subspecifier_value – must be None if this class has no subspecifier set (e.g., the DbSetting class). Must be the value of the subspecifier (e.g., the dbnode) for classes that define it (e.g. DbAttribute and DbExtra)
  • with_transaction – True if you want this function to be managed with transactions. Set to False if you already have a manual management of transactions in the block where you are calling this function (useful for speed improvements to avoid recursive transactions)
  • other_attribs – a dictionary of other parameters, to store only on the level-zero attribute (e.g. for description in DbSetting).
  • stop_if_existing – if True, it will stop with a UniquenessError exception if the new entry would violate a uniqueness constraint in the DB (same key, or same key+node, depending on the specific subclass). Otherwise, it will first delete the old value, if it exists. Setting this to True is useful if you want to use a given attribute as a “locking” value, e.g., to avoid performing an action twice on the same node. Note that, if you are using transactions, you may get the error only when the transaction is committed.
subspecifier_pk

Return the subspecifier PK in the database (or None, if no subspecifier should be used)

subspecifiers_dict

Return a dict to narrow down the query to only those matching also the subspecifier.

classmethod validate_key(key)[source]

Validate the key string to check if it is valid (e.g., if it does not contain the separator symbol).

Returns:None if the key is valid
Raises:ValidationError – if the key is not valid
class aiida.backends.djsite.db.models.DbNode(*args, **kwargs)[source]

Generic node: data or calculation or code.

Nodes can be linked (DbLink table). Naming convention for Node relationships: A –> C –> B.

  • A is ‘input’ of C.
  • C is ‘output’ of A.

Internal attributes, that define the node itself, are stored in the DbAttribute table; further user-defined attributes, called ‘extra’, are stored in the DbExtra table (same schema and methods as the DbAttribute table, but the code does not rely on the content of the table, so the user can use it at will to tag or annotate nodes).

Note:Attributes in the DbAttribute table have to be thought of as belonging to the DbNode (this is the reason why there is no ‘user’ field in the DbAttribute table). Moreover, Attributes uniquely define the Node and so should be immutable, except for the few listed in the _updatable_attributes attribute of the Node() class: these are Attributes that are set by AiiDA, so the user should not modify them, but they can be changed (e.g., the append_text of a code, which can be redefined if the code has to be recompiled).
attributes

Return all attributes of the given node as a single dictionary.

extras

Return all extras of the given node as a single dictionary.

get_aiida_class()[source]

Return the corresponding aiida instance of class aiida.orm.Node or an appropriate subclass.

get_simple_name(invalid_result=None)[source]

Return a string with the last part of the type name.

If the type is empty, use ‘Node’. If the type is invalid, return the content of the input variable invalid_result.

Parameters:invalid_result – The value to be returned if the node type is not recognized.
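A minimal sketch (assuming an already-loaded database environment; 1234 is a hypothetical node PK) showing how to go from a database row to the corresponding AiiDA class:

from aiida.backends.djsite.db.models import DbNode

dbnode = DbNode.objects.get(pk=1234)   # plain Django query; 1234 is a hypothetical PK
node = dbnode.get_aiida_class()        # instance of aiida.orm.Node or a subclass
print dbnode.get_simple_name(invalid_result='Unknown')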
class aiida.backends.djsite.db.models.DbSetting(*args, **kwargs)[source]

This will store generic settings that should be database-wide.

class aiida.backends.djsite.db.models.DbUser(*args, **kwargs)[source]

This class replaces the default User class of Django

class aiida.backends.djsite.db.models.DbWorkflow(id, uuid, ctime, mtime, user_id, label, description, nodeversion, lastsyncedversion, state, report, module, module_class, script_path, script_md5)[source]
get_aiida_class()[source]

Return the corresponding aiida instance of class aiida.workflow

is_subworkflow()[source]

Return True if this is a subworkflow, False if it is a root workflow, launched by the user.

class aiida.backends.djsite.db.models.DbWorkflowData(id, parent_id, name, time, data_type, value_type, json_value, aiida_obj_id)[source]
class aiida.backends.djsite.db.models.DbWorkflowStep(id, parent_id, name, user_id, time, nextcall, state)[source]
aiida.backends.djsite.db.models.deserialize_attributes(data, sep, original_class=None, original_pk=None)[source]

Deserialize the attributes from the format internally stored in the DB to the actual format (dictionaries, lists, integers, ...).

Parameters:
  • data – must be a dictionary of dictionaries. In the top-level dictionary, the key must be the key of the attribute. The value must be a dictionary with the following keys: datatype, tval, fval, ival, bval, dval. Other keys are ignored. NOTE that a type check is not performed! tval is expected to be a string, dval a date, etc.
  • sep – a string, the separator between subfields (to separate the name of a dictionary from the keys it contains, for instance)
  • original_class – if these elements come from a specific subclass of DbMultipleValueAttributeBaseClass, pass here the class (note: the class, not the instance!). This is used only in case the wrong number of elements is found in the raw data, to print a more meaningful message (if the class has a dbnode associated to it)
  • original_pk – if the elements come from a specific subclass of DbMultipleValueAttributeBaseClass that has a dbnode associated to it, pass here the PK integer. This is used only in case the wrong number of elements is found in the raw data, to print a more meaningful message
Returns:

a dictionary, where for each entry the corresponding value is returned, deserialized back to lists, dictionaries, etc. Example: if data = {'a': {'datatype': "list", "ival": 2, ...}, 'a.0': {'datatype': "int", "ival": 2, ...}, 'a.1': {'datatype': "txt", "tval": "yy"}}, it will return {"a": [2, "yy"]}

aiida.backends.sqlalchemy documentation

Database schema

QueryBuilder documentation

The QueryBuilder: A class that allows you to query the AiiDA database, independently of the backend. Note that the backend implementation is enforced and handled with a composition model! QueryBuilder() is the frontend class that the user can use. It inherits from object; backend-specific functionality is provided by the implementation classes.

These inherit from aiida.backends.general.querybuilder_interface.QueryBuilderInterface(), an interface class which enforces the implementation of its defined methods. An instance of one of the implementation classes becomes a member of the QueryBuilder() instance when it is instantiated by the user.

class aiida.orm.querybuilder.QueryBuilder(*args, **kwargs)[source]

QueryBuilder: The class to query the AiiDA database. Usage:

from aiida.orm.querybuilder import QueryBuilder
from aiida.orm import Node

qb = QueryBuilder()
# Querying nodes:
qb.append(Node)
# retrieving the results:
results = qb.all()
__init__(*args, **kwargs)[source]

Instantiates a QueryBuilder instance.

Which backend is used is decided here, based on the backend settings (taken from the user profile); this cannot currently be overridden by the user. A short usage sketch follows the parameter list below.

Parameters:
  • debug (bool) – Turn on debug mode. This feature prints information on the screen about the stages of the QueryBuilder. Does not affect results.
  • path (list) – A list of the vertices to traverse. Leave empty if you plan on using the method QueryBuilder.append().
  • filters – The filters to apply. You can specify the filters here, when appending to the query using QueryBuilder.append(), or even later using QueryBuilder.add_filter(). The latter gives API details.
  • project – The projections to apply. You can specify the projections here, when appending to the query using QueryBuilder.append(), or even later using QueryBuilder.add_projection(). The latter gives API details.
  • limit (int) – Limit the number of rows to this number. Check QueryBuilder.limit() for more information.
  • offset (int) – Set an offset for the results returned. Details in QueryBuilder.offset().
  • order_by – How to order the results. As with the two above, this can also be set at a later stage; check QueryBuilder.order_by() for more information.
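A minimal sketch of these parameters in use (limit and offset are set at construction time, while the path is built afterwards with QueryBuilder.append()):

from aiida.orm.querybuilder import QueryBuilder
from aiida.orm import Node

qb = QueryBuilder(limit=10, offset=20)   # return at most 10 rows, skipping the first 20
qb.append(Node, tag='node', filters={'id': {'>': 12}}, project=['id', 'uuid'])
qb.order_by({'node': ['id']})            # ascending by id
results = qb.all()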
add_filter(tagspec, filter_spec)[source]

Adding a filter to my filters.

Parameters:
  • tagspec – The tag, which has to exist already as a key in self._filters
  • filter_spec – The specifications for the filter, has to be a dictionary

Usage:

qb = QueryBuilder()         # Instantiating the QueryBuilder instance
qb.append(Node, tag='node') # Appending a Node
#let's put some filters:
qb.add_filter('node',{'id':{'>':12}})
# 2 filters together:
qb.add_filter('node',{'label':'foo', 'uuid':{'like':'ab%'}})
# Now I am overriding the first filter I set:
qb.add_filter('node',{'id':13})
add_projection(tag_spec, projection_spec)[source]

Adds a projection

Parameters:
  • tag_spec – A valid specification for a tag
  • projection_spec – The specification for the projection. A projection is a list of dictionaries, with each dictionary containing key-value pairs where the key is database entity (e.g. a column / an attribute) and the value is (optional) additional information on how to process this database entity.

If the given projection_spec is not a list, it will be expanded to a list. If the list items are not dictionaries but strings (i.e., no additional processing of the projected results is desired), they will be expanded to dictionaries.

Usage:

qb = QueryBuilder()
qb.append(StructureData, tag='struc')

# Will project the uuid and the kinds
qb.add_projection('struc', ['uuid', 'attributes.kinds'])

The above example will project the uuid and the kinds-attribute of all matching structures. There are currently two special keys.

The single star * will project the ORM-instance:

qb = QueryBuilder()
qb.append(StructureData, tag='struc')
# Will project the ORM instance
qb.add_projection('struc', '*')
print type(qb.first()[0])
# >>> aiida.orm.data.structure.StructureData

The double star ** projects all possible projections of this entity:

QueryBuilder().append(StructureData, tag='s', project='**').limit(1).dict()[0]['s'].keys()

# >>> u'user_id, description, ctime, label, extras, mtime, id, attributes, dbcomputer_id, nodeversion, type, public, uuid'

Be aware that the result of ** depends on the backend implementation.

all(batch_size=None)[source]

Executes the full query with the order of the rows as returned by the backend. The order inside each row is given by the order of the vertices in the path and the order of the projections for each vertex in the path.

Parameters:batch_size (int) – The size of the batches to ask the backend to batch results in subcollections. You can optimize the speed of the query by tuning this parameter. Leave the default (None) if speed is not critical or if you don’t know what you’re doing!
Returns:a list of lists of all projected entities.
append(cls=None, type=None, tag=None, filters=None, project=None, subclassing=True, edge_tag=None, edge_filters=None, edge_project=None, outerjoin=False, **kwargs)[source]

Any iterative procedure to build the path for a graph query needs to invoke this method to append to the path.

Parameters:
  • cls – The Aiida-class (or backend-class) defining the appended vertex
  • type (str) – The type of the class, if cls is not given
  • autotag (bool) – Whether to automatically find a unique tag (default False).
  • tag (str) – A unique tag. If none is given, a unique tag will be created automatically.
  • filters – Filters to apply for this vertex. See add_filter(), the method invoked in the background, or usage examples for details.
  • project – Projections to apply. See usage examples for details. More information also in add_projection().
  • subclassing (bool) – Whether to include subclasses of the given class (default True). E.g., specifying Calculation as cls will include JobCalculations, InlineCalculations, etc.
  • outerjoin (bool) – If True (default False), a left outer join is performed instead of an inner join
  • edge_tag (str) – The tag that the edge will get. If nothing is specified (and there is a meaningful edge) the default is tag1–tag2 with tag1 being the entity joining from and tag2 being the entity joining to (this entity).
  • edge_filters (str) – The filters to apply on the edge. Also here, details in add_filter().
  • edge_project (str) – The project from the edges. API-details in add_projection().

A small usage example how this can be invoked:

qb = QueryBuilder()             # Instantiating empty querybuilder instance
qb.append(cls=StructureData)    # First item is StructureData node
# The next node in the path is a PwCalculation,
# with the structure joined as an input
qb.append(
    cls=PwCalculation,
    output_of=StructureData
)
Returns:self
count()[source]

Counts the number of rows returned by the backend.

Returns:the number of rows as an integer
dict(batch_size=None)[source]

Executes the full query with the order of the rows as returned by the backend. The order inside each row is given by the order of the vertices in the path and the order of the projections for each vertex in the path.

Parameters:batch_size (int) – The size of the batches to ask the backend to batch results in subcollections. You can optimize the speed of the query by tuning this parameter. Leave the default (None) if speed is not critical or if you don’t know what you’re doing!
Returns:a list of dictionaries of all projected entities. Each dictionary consists of key-value pairs, where the key is the tag of the vertex and the value is a dictionary of key-value pairs in which the key is the entity description (a column name or attribute path) and the value is the value in the DB.

Usage:

qb = QueryBuilder()
qb.append(
    StructureData,
    tag='structure',
    filters={'uuid':{'==':myuuid}},
)
qb.append(
    Node,
    descendant_of='structure',
    project=['type', 'id'],  # returns type (string) and id (string)
    tag='descendant'
)

# Return the dictionaries:
print "qb.iterdict()"
for d in qb.iterdict():
    print '>>>', d

results in the following output:

qb.iterdict()
>>> {'descendant': {
        'type': u'calculation.job.quantumespresso.pw.PwCalculation.',
        'id': 7716}
    }
>>> {'descendant': {
        'type': u'data.remote.RemoteData.',
        'id': 8510}
    }
distinct()[source]

Asks for distinct rows, which is the same as asking the backend to remove duplicates. Does not execute the query!

If you want a distinct query:

qb = QueryBuilder()
# append stuff!
qb.append(...)
qb.append(...)
...
qb.distinct().all() #or
qb.distinct().dict()
Returns:self
first()[source]

Executes query asking for one instance. Use as follows:

qb = QueryBuilder(**queryhelp)
qb.first()
Returns:One row of results as a list
get_alias(tag)[source]

To allow the user to continue building on a query, this utility function returns the aliased ORM classes.

Parameters:tag – The tag for a vertex in the path
Returns:the alias given for that vertex
get_aliases()[source]
Returns:the list of aliases
get_json_compatible_queryhelp()[source]

Makes the queryhelp a JSON-compatible dictionary. In this way, the queryhelp can be stored in the database or in a JSON object, retrieved or shared, and used later. See this usage:

qb = QueryBuilder(limit=3).append(StructureData, project='id').order_by({StructureData:'id'})
queryhelp  = qb.get_json_compatible_queryhelp()

# Now I could save this dictionary somewhere and use it later:

qb2=QueryBuilder(**queryhelp)

# This is True if no change has been made to the database.
# Note that such a comparison can only be True if the order of results is enforced
qb.all()==qb2.all()
Returns:the json-compatible queryhelp
get_query()[source]

Instantiates and manipulates a sqlalchemy.orm.Query instance if this is needed. First, I check if the query instance is still valid by hashing the queryhelp. In this way, if a user asks for the same query twice, I am not recreating an instance.

Returns:an instance of sqlalchemy.orm.Query that is specific to the backend used.
get_results_dict()[source]

Deprecated, use dict() instead

get_used_tags(vertices=True, edges=True)[source]

Returns a list of all the tags that are being used. Some parameters allow selecting only subsets.

Parameters:
  • vertices (bool) – Defaults to True. If True, adds the tags of vertices to the returned list
  • edges (bool) – Defaults to True. If True, adds the tags of edges to the returned list
Returns:A list of all tags, including (if present) the tags given for the edges
iterall(batch_size=100)[source]

Same as all(), but returns a generator. Be aware that this is only safe if no commit will take place during this transaction. You might also want to read the SQLAlchemy documentation on http://docs.sqlalchemy.org/en/latest/orm/query.html#sqlalchemy.orm.query.Query.yield_per

Parameters:batch_size (int) – The size of the batches to ask the backend to batch results in subcollections. You can optimize the speed of the query by tuning this parameter.
Returns:a generator of lists
iterdict(batch_size=100)[source]

Same as dict(), but returns a generator. Be aware that this is only safe if no commit will take place during this transaction. You might also want to read the SQLAlchemy documentation on http://docs.sqlalchemy.org/en/latest/orm/query.html#sqlalchemy.orm.query.Query.yield_per

Parameters:batch_size (int) – The size of the batches to ask the backend to batch results in subcollections. You can optimize the speed of the query by tuning this parameter.
Returns:a generator of dictionaries
limit(limit)[source]

Set the limit (nr of rows to return)

Parameters:limit (int) – the number of rows to return
offset(offset)[source]

Set the offset. If an offset is set, that many rows are skipped before returning results. offset = 0 is the same as not setting an offset. If both offset and limit are set, then offset rows are skipped before starting to count the limit rows that are returned.

Parameters:offset (int) – the number of rows to skip
one()[source]

Executes the query asking for exactly one result. Will raise an exception if this is not the case :raises: MultipleObjectsError if more than one row can be returned :raises: NotExistent if no result was found

order_by(order_by)[source]

Set the entity to order by

Parameters:order_by – This is a list of items, where each item is a dictionary that specifies what to sort by for an entity

In each dictionary in that list, keys represent valid tags of entities (tables), and values are list of columns.

Usage:

#Sorting by id (ascending):
qb = QueryBuilder()
qb.append(Node, tag='node')
qb.order_by({'node':['id']})

# or
#Sorting by id (ascending):
qb = QueryBuilder()
qb.append(Node, tag='node')
qb.order_by({'node':[{'id':{'order':'asc'}}]})

# for descending order:
qb = QueryBuilder()
qb.append(Node, tag='node')
qb.order_by({'node':[{'id':{'order':'desc'}}]})

# or (shorter)
qb = QueryBuilder()
qb.append(Node, tag='node')
qb.order_by({'node':[{'id':'desc'}]})
set_debug(debug)[source]

Run in debug mode. This does not affect functionality, but prints intermediate stages when creating a query on screen.

Parameters:debug (bool) – Turn debug on or off
class aiida.backends.general.querybuilder_interface.QueryBuilderInterface(*args, **kwargs)[source]
AiidaComputer()[source]

A property, decorated with @property. Returns the implementation for the AiiDA-class for Computer

AiidaGroup()[source]

A property, decorated with @property. Returns the implementation for the AiiDA-class for Group

AiidaNode()[source]

A property, decorated with @property. Returns the implementation for the AiiDA-class for Node

AiidaUser()[source]

A property, decorated with @property. Returns the implementation for the AiiDA-class for User

Computer()[source]

A property, decorated with @property. Returns the implementation for the Computer

Group()[source]

A property, decorated with @property. Returns the implementation for the Group

A property, decorated with @property. Returns the implementation for the DbLink

Node()[source]

Decorated as a property, returns the implementation for DbNode. It needs to return a subclass of sqlalchemy.Base, which means that for different ORMs a corresponding dummy-model must be written.

User()[source]

A property, decorated with @property. Returns the implementation for the User

count()[source]
Returns:the number of results
first()[source]

Executes query in the backend asking for one instance.

Returns:One row of aiida results
get_aiida_res(key, res)[source]

Some instances returned by the ORM (Django or SQLAlchemy) need to be converted to AiiDA instances (e.g., nodes).

Parameters:
  • key – the key that this entry would be returned with
  • res – the result returned by the query
Returns:

an aiida-compatible instance

get_filter_expr_from_attributes(operator, value, attr_key, column=None, column_name=None, alias=None)[source]

A classmethod that returns a valid SQLAlchemy expression.

Parameters:
  • operator – The operator provided by the user (‘==’, ‘>’, ...)
  • value – The value to compare with, e.g. (5.0, ‘foo’, [‘a’,’b’])
  • attr_key (str) – The path to that attribute as a tuple of values. E.g., if the attribute to filter by is the 2nd element in a list stored under the key ‘mylist’, this is (‘mylist’, ‘2’).
  • column – Optional; an instance of sqlalchemy.orm.attributes.InstrumentedAttribute.
  • column_name (str) – The name of the column, from which the backend should get the InstrumentedAttribute.
  • alias – The aliased class.
Returns:

An instance of sqlalchemy.sql.elements.BinaryExpression

get_session()[source]
Returns:a valid session, an instance of sqlalchemy.orm.session.Session
iterall(batch_size=100)[source]
Returns:An iterator over all the results, yielding a list for each row.
iterdict(batch_size=100)[source]
Returns:An iterator over all the results, yielding a dictionary for each row.
modify_expansions(alias, expansions)[source]

Modify names of projections if ** was specified. This is important for schemas that store attributes in a separate table.

table_groups_nodes()[source]

A property, decorated with @property. Returns the implementation for the many-to-many relationship between group and nodes.

yield_per(batch_size)[source]
Parameters:batch_size (int) – Number of rows to yield per step

Yields batch_size rows at a time

Returns:a generator

Workflows

This section describes the different classes related to workflows, workfunctions and WorkChains.

class aiida.work.persistence.Persistence(running_directory='/tmp/running', finished_directory='/tmp/running/finished', failed_directory='/tmp/running/failed')[source]

Class that uses pickles stored in particular directories to persist the instance state of Processes.

classmethod create_from_basedir(basedir, **kwargs)[source]
Create using a base directory, the pickles will be stored in:
  • running: [basedir]/running
  • finished: [basedir]/finished
  • failed: [basedir]/failed
Parameters:
  • basedir (str) – The base directory to store the pickles under
  • kwargs – Any additional arguments to pass to the constructor
Returns:

A new instance.
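A minimal sketch, assuming /tmp/aiida_running is a hypothetical writable directory:

from aiida.work.persistence import Persistence

# Pickles will then live in /tmp/aiida_running/{running,finished,failed}
persistence = Persistence.create_from_basedir('/tmp/aiida_running')
processes = persistence.load_all_processes()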

create_from_file_and_persist(filepath)[source]

Try and load a process from a file and recreate the Process instance. To prevent multiple threads from recreating a Process from the same pickle, a reentrant lock is created before loading the state from the file; this will raise an exception if the file is already locked. Within a lock context manager, we then attempt to recreate the Process from the process state and, when successful, we acquire the lock.

Parameters:filepath – path to the pickle to be loaded as a Process
Returns:Process instance
get_running_path(pid)[source]

Get the path where the pickle for a process with pid will be stored while it’s running.

Parameters:pid – The process pid
Returns:A string to the absolute path of where the pickle is stored.
Return type:str
load_all_processes()[source]

Will detect all pickles in the running directory and will try to load them up into Processes. As soon as a pickle is considered for loading, a lock is placed on it, which is not released until the process is destroyed. This is necessary to prevent another thread from loading up the same process.

Returns:a list of Process instances
class aiida.work.persistence.RLock(filename, mode='a', timeout=5, check_interval=0.25, fail_when_locked=False, flags=6)[source]

A reentrant lock, functions in a similar way to threading.RLock in that it can be acquired multiple times. When the corresponding number of release() calls are made the lock will finally release the underlying file lock.

class aiida.work.process.Process[source]

This class represents an AiiDA process which can be executed and will have full provenance saved in the database.

class SaveKeys[source]

Keys used to identify things in the saved instance state bundle.

classmethod create_db_record()[source]

Create a database calculation node that represents what happened in this process.

on_destroy()[source]

Called when a Process enters the DESTROYED state, which should be the final process state, and so we seal the calculation node.

on_finish()[source]

Called when a Process enters the FINISHED state at which point we set the corresponding attribute of the workcalculation node

report(msg, *args, **kwargs)[source]

Log a message to the logger, which should get saved to the database through the attached DbLogHandler. The class name and function name of the caller are prepended to the given message

class aiida.work.process_registry.ProcessRegistry[source]

This class is a knowledge provider that uses the AiiDA database to answer questions related to processes.

class aiida.work.run.RunningInfo(type, pid)
pid

Alias for field number 1

type

Alias for field number 0

class aiida.work.run.RunningType[source]

A type to indicate what type of object is running: a process, a calculation or a workflow

aiida.work.run.legacy_calc(pk)[source]

Create a RunningInfo object for a legacy calculation

Parameters:pk (int) – The calculation pk
Returns:The running info
Return type:RunningInfo
aiida.work.run.legacy_workflow(pk)[source]

Create a RunningInfo object for a legacy workflow.

This can be used in conjunction with aiida.work.workchain.ToContext as follows:

>>> from aiida.work.workchain import WorkChain, ToContext, Outputs
>>>
>>> class MyWf(WorkChain):
>>>     @classmethod
>>>     def define(cls, spec):
>>>         super(MyWf, cls).define(spec)
>>>         spec.outline(cls.step1, cls.step2)
>>>
>>>     def step1(self):
>>>         wf = OldEquationOfState()
>>>         wf.start()
>>>         return ToContext(eos=legacy_workflow(wf.pk))
>>>
>>>     def step2(self):
>>>         # Now self.ctx.eos contains the terminated workflow
>>>         pass
Parameters:pk (int) – The workflow pk
Returns:The running info
Return type:RunningInfo
aiida.work.run.queue_up(process_class, inputs, storage)[source]

This queues up the Process so that it’s executed by the daemon when it gets around to it.

Parameters:
  • process_class – The process class to queue up.
  • inputs (Mapping) – The inputs to the process.
  • storage – The storage engine which will be used to save the process (of type plum.persistence)
Returns:

The pid of the queued process.

aiida.work.run.run(process_class, *args, **inputs)[source]

Synchronously (i.e. blocking) run a workfunction or process.

Parameters:
  • process_class – The process class or workfunction
  • _attributes – Optional attributes (only for process)
  • args – Positional arguments for a workfunction
  • inputs – The list of inputs
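A minimal sketch of running a workfunction synchronously (add is a hypothetical workfunction defined on the spot; the workfunction decorator and the Int class are documented later in this section):

from aiida.orm.data.base import Int
from aiida.work.run import run
from aiida.work.workfunction import workfunction

@workfunction
def add(a, b):
    # hypothetical workfunction, used only for illustration
    return a + b

result = run(add, Int(4), Int(5))   # positional arguments are passed to the workfunction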
class aiida.work.test_utils.BadOutput[source]

A Process that emits an output that isn’t part of the spec, thereby raising an exception.

class aiida.work.test_utils.DummyProcess[source]

A Process that does nothing when it runs.

class aiida.work.util.ProcessStack[source]

Keep track of the per-thread call stack of processes.

classmethod get_active_process_calc_node()[source]

Get the calculation node of the process at the top of the stack

Returns:The calculation node
Return type:aiida.orm.implementation.general.calculation.job.AbstractJobCalculation
classmethod get_active_process_id()[source]

Get the pid of the process at the top of the stack

Returns:The pid
classmethod pop(process=None, pid=None)[source]

Pop a process from the stack. To make sure the stack is not corrupted, the process instance or pid of the calling process should be supplied so we can verify that it really is at the top of the stack.

Parameters:
  • process – The process instance
  • pid – The process id.
aiida.work.util.get_or_create_output_group(calculation)[source]

For a given Calculation, get or create a new frozendict Data node that has as its values all output Data nodes of the Calculation.

Parameters:calculation – Calculation
aiida.work.util.load_class(classstring)[source]

Load a class from a string

aiida.work.workchain.ToContext(**kwargs)[source]

Utility function that returns a list of UpdateContext Interstep instances

NOTE: This is effectively a copy of WorkChain.to_context method added to keep backwards compatibility, but should eventually be deprecated

class aiida.work.workchain.WorkChain[source]

A WorkChain, the base class for AiiDA workflows.

abort(msg=None, timeout=None)[source]

Abort the workchain by calling the abort method of the Process and also adding the abort message to the report

Parameters:
  • msg (str) – The abort message
  • timeout (float) – Wait for the given time until the process has aborted
Returns:

True if the process is aborted at the end of the function, False otherwise

abort_nowait(msg=None)[source]

Abort the workchain at the next state transition without waiting, which is achieved by passing a timeout value of zero

Parameters:msg (str) – The abort message
insert_barrier(wait_on)[source]

Insert a barrier that will cause the workchain to wait until the wait on is finished before continuing to the next step.

Parameters:wait_on – The thing to wait on (of type plum.wait.wait_on)
insert_intersteps(intersteps)[source]

Insert an interstep to be executed after the current step ends but before the next step ends

Parameters:intersteps – the Interstep instance(s) to insert
remove_barrier(wait_on)[source]

Remove a barrier.

Precondition: must be a barrier that was previously inserted

Parameters:wait_on – The wait on to remove (of type plum.wait.wait_on)
to_context(**kwargs)[source]

This is a convenience method that provides syntactic sugar for a user to add multiple intersteps that will assign a certain value to the corresponding key in the context of the workchain

aiida.work.workchain.if_(condition)[source]

A conditional that can be used in a workchain outline.

Use as:

if_(cls.conditional)(
  cls.step1,
  cls.step2
)

Each step can, of course, also be any valid workchain step e.g. conditional.

Parameters:condition – The workchain method that will return True or False
aiida.work.workchain.while_(condition)[source]

A while loop that can be used in a workchain outline.

Use as:

while_(cls.conditional)(
  cls.step1,
  cls.step2
)

Each step can, of course, also be any valid workchain step e.g. conditional.

Parameters:condition – The workchain method that will return True or False
class aiida.work.workflow.Workflow[source]

This class represents an AiiDA workflow which can be executed and will have full provenance saved in the database.

This file provides very simple workflows for testing purposes. Do not delete, otherwise ‘verdi developertest’ will stop working.

aiida.work.workfunction.workfunction(func)[source]

A decorator to turn a standard python function into a workfunction. Example usage:

>>> from aiida.orm.data.base import Int
>>> from aiida.work.workfunction import workfunction as wf
>>>
>>> # Define the workfunction
>>> @wf
>>> def sum(a, b):
>>>    return a + b
>>> # Run it with some input
>>> r = sum(Int(4), Int(5))
>>> print(r)
9
>>> r.get_inputs_dict() 
{u'_return': <WorkCalculation: uuid: ce0c63b3-1c84-4bb8-ba64-7b70a36adf34 (pk: 3567)>}
>>> r.get_inputs_dict()['_return'].get_inputs()
[4, 5]

DbImporter documentation

Generic database importer class

This section describes the base class for the import of data from external databases.

aiida.tools.dbimporters.DbImporterFactory(pluginname)[source]

This function loads the correct DbImporter plugin class

class aiida.tools.dbimporters.baseclasses.CifEntry(db_name=None, db_uri=None, id=None, version=None, extras={}, uri=None)[source]

Represents an entry from the structure database (COD, ICSD, ...).

cif

Returns raw contents of a CIF file as string.

get_aiida_structure()[source]
Returns:AiiDA structure corresponding to the CIF file.
get_ase_structure()[source]

Returns ASE representation of the CIF.

Note

To be removed, as it is duplicated in aiida.orm.data.cif.CifData.

get_cif_node(store=False)[source]

Creates a CIF node that can be used in an AiiDA workflow.

Returns:aiida.orm.data.cif.CifData object
get_parsed_cif()[source]

Returns a data structure representing the CIF file. Can be created using PyCIFRW or any other open-source parser.

Returns:list of lists
get_raw_cif()[source]

Returns raw contents of a CIF file as string.

Returns:contents of a file as string
class aiida.tools.dbimporters.baseclasses.DbEntry(db_name=None, db_uri=None, id=None, version=None, extras={}, uri=None)[source]

Represents an entry from external database.

contents

Returns raw contents of a file as string.

class aiida.tools.dbimporters.baseclasses.DbImporter[source]

Base class for database importers.

get_supported_keywords()[source]

Returns the list of all supported query keywords.

Returns:list of strings
query(**kwargs)[source]

Method to query the database.

Parameters:
  • id – database-specific entry identifier
  • element – element name from periodic table of elements
  • number_of_elements – number of different elements
  • mineral_name – name of mineral
  • chemical_name – chemical name of substance
  • formula – chemical formula
  • volume – volume of the unit cell in cubic angstroms
  • spacegroup – symmetry space group symbol in Hermann-Mauguin notation
  • spacegroup_hall – symmetry space group symbol in Hall notation
  • a – length of lattice vector in angstroms
  • b – length of lattice vector in angstroms
  • c – length of lattice vector in angstroms
  • alpha – angles between lattice vectors in degrees
  • beta – angles between lattice vectors in degrees
  • gamma – angles between lattice vectors in degrees
  • z – number of the formula units in the unit cell
  • measurement_temp – temperature in kelvins at which the unit-cell parameters were measured
  • measurement_pressure – pressure in kPa at which the unit-cell parameters were measured
  • diffraction_temp – mean temperature in kelvins at which the intensities were measured
  • diffraction_pressure – mean pressure in kPa at which the intensities were measured
  • authors – authors of the publication
  • journal – name of the journal
  • title – title of the publication
  • year – year of the publication
  • journal_volume – journal volume of the publication
  • journal_issue – journal issue of the publication
  • first_page – first page of the publication
  • last_page – last page of the publication
  • doi – digital object identifier (DOI), referring to the publication
Raises:

NotImplementedError – if search using given keyword is not implemented.
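For example, a minimal sketch using the COD importer plugin documented below (assuming 'cod' as the plugin name passed to DbImporterFactory; the keyword names follow the list above):

from aiida.tools.dbimporters import DbImporterFactory

CodDbImporter = DbImporterFactory('cod')   # loads the importer plugin class
importer = CodDbImporter()
results = importer.query(element='Si', number_of_elements=1)
entry = results.at(0)                      # a DbEntry/CifEntry instance
structure = entry.get_aiida_structure()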

setup_db(**kwargs)[source]

Sets the database parameters. The method should reconnect to the database using updated parameters, if already connected.

class aiida.tools.dbimporters.baseclasses.DbSearchResults(results)[source]

Base class for database results.

All classes inheriting from this one and overriding at() benefit from the __iter__, __len__ and __getitem__ functions.

class DbSearchResultsIterator(results, increment=1)[source]

Iterator for search results

__iter__()[source]

Instances of aiida.tools.dbimporters.baseclasses.DbSearchResults can be used as iterators.

at(position)[source]

Returns position-th result as aiida.tools.dbimporters.baseclasses.DbEntry.

Parameters:position – zero-based index of a result.
Raises:IndexError – if position is out of bounds.
fetch_all()[source]

Returns all query results as an array of aiida.tools.dbimporters.baseclasses.DbEntry.

next()[source]

Returns the next result of the query (instance of aiida.tools.dbimporters.baseclasses.DbEntry).

Raises:StopIteration – when the end of result array is reached.
class aiida.tools.dbimporters.baseclasses.UpfEntry(db_name=None, db_uri=None, id=None, version=None, extras={}, uri=None)[source]

Represents an entry from the pseudopotential database.

get_upf_node(store=False)[source]

Creates an UPF node that can be used in an AiiDA workflow.

Returns:aiida.orm.data.upf.UpfData object
Structural databases
COD database importer
class aiida.tools.dbimporters.plugins.cod.CodDbImporter(**kwargs)[source]

Database importer for Crystallography Open Database.

get_supported_keywords()[source]

Returns the list of all supported query keywords.

Returns:list of strings
query(**kwargs)[source]

Performs a query on the COD database using keyword = value pairs, specified in kwargs.

Returns:an instance of aiida.tools.dbimporters.plugins.cod.CodSearchResults.
query_sql(**kwargs)[source]

Forms a SQL query for querying the COD database using keyword = value pairs, specified in kwargs.

Returns:string containing a SQL statement.
setup_db(**kwargs)[source]

Changes the database connection details.

class aiida.tools.dbimporters.plugins.cod.CodEntry(uri, db_name='Crystallography Open Database', db_uri='http://www.crystallography.net', **kwargs)[source]

Represents an entry from COD.

class aiida.tools.dbimporters.plugins.cod.CodSearchResults(results)[source]

Results of the search, performed on COD.

ICSD database importer
exception aiida.tools.dbimporters.plugins.icsd.CifFileErrorExp[source]

Raised when the author loop is missing in a CIF file.

class aiida.tools.dbimporters.plugins.icsd.IcsdDbImporter(**kwargs)[source]

Importer for the Inorganic Crystal Structure Database (ICSD), provided by FIZ Karlsruhe. It allows one to run queries and analyse all the results. See the DbImporter documentation and tutorial page for more information. A short connection sketch follows the parameter list below.

Parameters:
  • server

    Server URL, the web page of the database. It is required in order to have access to the full database. It should contain both the protocol and the domain name and end with a slash, as in:

    server = "http://ICSDSERVER.com/"
    
  • urladd – part of the URL which is added between the query and the server URL (default: index.php?). Only needed for web page queries.
  • querydb – boolean, decides whether the mysql database is queried (default: True). If False, the query results are obtained through the web page query, which is restricted to a maximum of 1000 results per query.
  • dl_db – ICSD comes with a full database (default: icsd) and a demo database (icsdd). This parameter allows the user to switch to the demo database for testing purposes, if access rights to the full database are not granted.
  • host

    MySQL database host. If the MySQL database is hosted on a different machine, use “127.0.0.1” as host, and open a SSH tunnel to the host using:

    ssh -L 3306:localhost:3306 username@hostname.com
    

    or (if, e.g., you get a URLError with Errno 111 (Connection refused) upon querying):

    ssh -L 3306:localhost:3306 -L 8010:localhost:80 username@hostname.com
    
  • user – mysql database username (default: dba)
  • passwd – mysql database password (default: sql)
  • db – name of the database (default: icsd)
  • port – Port to access the mysql database (default: 3306)
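A minimal connection sketch (the server URL is the placeholder from the description above; element and number_of_elements are assumed query keywords, see get_supported_keywords() below):

from aiida.tools.dbimporters.plugins.icsd import IcsdDbImporter

importer = IcsdDbImporter(server="http://ICSDSERVER.com/", host="127.0.0.1")
results = importer.query(element="Li", number_of_elements=2)
for entry in results:
    cif_node = entry.get_cif_node()   # an aiida.orm.data.cif.CifData node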
get_supported_keywords()[source]
Returns:List of all supported query keywords.
query(**kwargs)[source]

Depending on the db_parameters, either the mysql database or the web page is queried. Valid parameters are found using IcsdDbImporter.get_supported_keywords().

Parameters:kwargs – A list of keyword = [values] pairs.
setup_db(**kwargs)[source]

Change the database connection details. At least the host server has to be defined.

Parameters:kwargs – db_parameters for the mysql database connection (host, user, passwd, db, port)
class aiida.tools.dbimporters.plugins.icsd.IcsdEntry(uri, **kwargs)[source]

Represents an entry from ICSD.

Note:
  • Before July 2nd 2015, source[‘id’] contained icsd.IDNUM (internal icsd id number) and source[‘extras’][‘cif_nr’] the cif number (icsd.COLL_CODE).
  • After July 2nd 2015, source[‘id’] has been replaced by the cif number and source[‘extras’][‘idnum’] is icsd.IDNUM .
cif
Returns:cif file of Icsd entry.
get_aiida_structure()[source]
Returns:AiiDA structure corresponding to the CIF file.
get_ase_structure()[source]
Returns:ASE structure corresponding to the cif file.
get_cif_node()[source]

Create a CIF node that can be used in an AiiDA workflow.

Returns:aiida.orm.data.cif.CifData object
get_corrected_cif()[source]

Add quotes to the lines in the author loop if missing.

Note:ase raises an AssertionError if the quotes in the author loop are missing.
class aiida.tools.dbimporters.plugins.icsd.IcsdSearchResults(query, db_parameters)[source]

Result manager for the query performed on ICSD.

Parameters:
  • query – mysql query or webpage query
  • db_parameters – database parameter setup during the initialisation of the IcsdDbImporter.
at(position)[source]

Return position-th result as IcsdEntry.

next()[source]

Return next result as IcsdEntry.

query_db_version()[source]

Query the version of the icsd database (last row of RELEASE_TAGS).

query_page()[source]

Query the mysql or web page database, depending on the db_parameters. Store the number_of_results, cif file number and the corresponding icsd number.

Note:Icsd uses its own number system, different from the CIF file numbers.
exception aiida.tools.dbimporters.plugins.icsd.NoResultsWebExp[source]

Raised when a webpage query returns no results.

aiida.tools.dbimporters.plugins.icsd.correct_cif(cif)[source]

Correct the format of the CIF files. At the moment, it only fixes missing quotes in the authors field (ase.read.io only works if the author names are quoted, if not an AssertionError is raised).

Parameters:cif – A string containing the content of the CIF file.
Returns:a string containing the corrected CIF file.
MPOD database importer
class aiida.tools.dbimporters.plugins.mpod.MpodDbImporter(**kwargs)[source]

Database importer for Material Properties Open Database.

get_supported_keywords()[source]

Returns the list of all supported query keywords.

Returns:list of strings
query(**kwargs)[source]

Performs a query on the MPOD database using keyword = value pairs, specified in kwargs.

Returns:an instance of aiida.tools.dbimporters.plugins.mpod.MpodSearchResults.
query_get(**kwargs)[source]

Forms a HTTP GET query for querying the MPOD database. May return more than one query in case an intersection is needed.

Returns:a list containing strings for HTTP GET statement.
setup_db(query_url=None, **kwargs)[source]

Changes the database connection details.

class aiida.tools.dbimporters.plugins.mpod.MpodEntry(uri, **kwargs)[source]

Represents an entry from MPOD.

class aiida.tools.dbimporters.plugins.mpod.MpodSearchResults(results)[source]

Results of the search, performed on MPOD.

OQMD database importer
class aiida.tools.dbimporters.plugins.oqmd.OqmdDbImporter(**kwargs)[source]

Database importer for Open Quantum Materials Database.

get_supported_keywords()[source]

Returns the list of all supported query keywords.

Returns:list of strings
query(**kwargs)[source]

Performs a query on the OQMD database using keyword = value pairs, specified in kwargs.

Returns:an instance of aiida.tools.dbimporters.plugins.oqmd.OqmdSearchResults.
query_get(**kwargs)[source]

Forms a HTTP GET query for querying the OQMD database.

Returns:a string for the HTTP GET statement.
setup_db(query_url=None, **kwargs)[source]

Changes the database connection details.

class aiida.tools.dbimporters.plugins.oqmd.OqmdEntry(uri, **kwargs)[source]

Represents an entry from OQMD.

class aiida.tools.dbimporters.plugins.oqmd.OqmdSearchResults(results)[source]

Results of the search, performed on OQMD.

PCOD database importer
class aiida.tools.dbimporters.plugins.pcod.PcodDbImporter(**kwargs)[source]

Database importer for Predicted Crystallography Open Database.

query(**kwargs)[source]

Performs a query on the PCOD database using keyword = value pairs, specified in kwargs.

Returns:an instance of aiida.tools.dbimporters.plugins.pcod.PcodSearchResults.
query_sql(**kwargs)[source]

Forms a SQL query for querying the PCOD database using keyword = value pairs, specified in kwargs.

Returns:string containing a SQL statement.
class aiida.tools.dbimporters.plugins.pcod.PcodEntry(uri, db_name='Predicted Crystallography Open Database', db_uri='http://www.crystallography.net/pcod', **kwargs)[source]

Represents an entry from PCOD.

class aiida.tools.dbimporters.plugins.pcod.PcodSearchResults(results)[source]

Results of the search, performed on PCOD.

TCOD database importer
class aiida.tools.dbimporters.plugins.tcod.TcodDbImporter(**kwargs)[source]

Database importer for Theoretical Crystallography Open Database.

query(**kwargs)[source]

Performs a query on the TCOD database using keyword = value pairs, specified in kwargs.

Returns:an instance of aiida.tools.dbimporters.plugins.tcod.TcodSearchResults.
class aiida.tools.dbimporters.plugins.tcod.TcodEntry(uri, db_name='Theoretical Crystallography Open Database', db_uri='http://www.crystallography.net/tcod', **kwargs)[source]

Represents an entry from TCOD.

class aiida.tools.dbimporters.plugins.tcod.TcodSearchResults(results)[source]

Results of the search, performed on TCOD.

Other databases
NNINC database importer
class aiida.tools.dbimporters.plugins.nninc.NnincDbImporter(**kwargs)[source]

Database importer for NNIN/C Pseudopotential Virtual Vault.

get_supported_keywords()[source]

Returns the list of all supported query keywords.

Returns:list of strings
query(**kwargs)[source]

Performs a query on the NNIN/C Pseudopotential Virtual Vault using keyword = value pairs, specified in kwargs.

Returns:an instance of aiida.tools.dbimporters.plugins.nninc.NnincSearchResults.
query_get(**kwargs)[source]

Forms a HTTP GET query for querying the NNIN/C Pseudopotential Virtual Vault.

Returns:a string with HTTP GET statement.
setup_db(query_url=None, **kwargs)[source]

Changes the database connection details.

class aiida.tools.dbimporters.plugins.nninc.NnincEntry(uri, **kwargs)[source]

Represents an entry from NNIN/C Pseudopotential Virtual Vault.

class aiida.tools.dbimporters.plugins.nninc.NnincSearchResults(results)[source]

Results of the search, performed on NNIN/C Pseudopotential Virtual Vault.

DbExporter documentation

Note

This is the documentation of the general DbExporter class and the TCOD implementation. TCOD itself accepts a number of plugins that know how to convert code-specific output to the TCOD format; these typically live in different repositories. For instance, you can find here the extensions for Quantum ESPRESSO.

TCOD database exporter
aiida.tools.dbexporters.tcod.cif_encode_contents(content, gzip=False, gzip_threshold=1024)[source]

Encodes data for usage in a CIF text field in the best possible way: binary data is encoded using Base64 encoding; text with non-ASCII symbols, too long lines or lines starting with semicolons (‘;’) is encoded using Quoted-printable encoding.

Parameters:content – the content to be encoded
Return content:encoded content
Return encoding:
 a string specifying used encoding (None, ‘base64’, ‘ncr’, ‘quoted-printable’, ‘gzip+base64’)
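A minimal round-trip sketch, assuming the function returns the encoded content together with the name of the encoding used (as the two return fields above suggest):

from aiida.tools.dbexporters.tcod import cif_encode_contents, decode_textfield

encoded, encoding = cif_encode_contents('some long CIF text field contents',
                                        gzip=True, gzip_threshold=1024)
original = decode_textfield(encoded, encoding)   # decode back to the original text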
aiida.tools.dbexporters.tcod.decode_textfield(content, method)[source]

Decodes the contents of encoded CIF textfield.

Parameters:
  • content – the content to be decoded
  • method – method, which was used for encoding the contents (None, ‘base64’, ‘ncr’, ‘quoted-printable’, ‘gzip+base64’)
Returns:

decoded content

Raises:

ValueError – if the encoding method is unknown

aiida.tools.dbexporters.tcod.decode_textfield_base64(content)[source]

Decodes the contents for CIF textfield from Base64 using standard Python implementation (base64.standard_b64decode())

Parameters:content – a string with contents
Returns:decoded string
aiida.tools.dbexporters.tcod.decode_textfield_gzip_base64(content)[source]

Decodes the contents for CIF textfield from Base64 and decompresses them with gzip.

Parameters:content – a string with contents
Returns:decoded string
aiida.tools.dbexporters.tcod.decode_textfield_ncr(content)[source]

Decodes the contents for CIF textfield from Numeric Character Reference.

Parameters:content – a string with contents
Returns:decoded string
aiida.tools.dbexporters.tcod.decode_textfield_quoted_printable(content)[source]

Decodes the contents for CIF textfield from quoted-printable encoding.

Parameters:content – a string with contents
Returns:decoded string
aiida.tools.dbexporters.tcod.deposit(what, type, author_name=None, author_email=None, url=None, title=None, username=None, password=False, user_email=None, code_label='cif_cod_deposit', computer_name=None, replace=None, message=None, **kwargs)[source]

Launches an aiida.orm.implementation.general.calculation.job.AbstractJobCalculation to deposit a data node to a *COD-type database.

Returns:launched aiida.orm.implementation.general.calculation.job.AbstractJobCalculation instance.
Raises:ValueError – if any of the required parameters are not given.
aiida.tools.dbexporters.tcod.deposition_cmdline_parameters(parser, expclass='Data')[source]

Provides descriptions of command line options, that are used to control the process of deposition to TCOD.

Parameters:
  • parser – an argparse.Parser instance
  • expclass – name of the exported class to be shown in help string for the command line options

Note

This method must not set any default values for command line options in order not to clash with any other data deposition plugins.

aiida.tools.dbexporters.tcod.encode_textfield_base64(content, foldwidth=76)[source]

Encodes the contents for CIF textfield in Base64 using standard Python implementation (base64.standard_b64encode()).

Parameters:
  • content – a string with contents
  • foldwidth – maximum width of line (default is 76)
Returns:

encoded string

aiida.tools.dbexporters.tcod.encode_textfield_gzip_base64(content, **kwargs)[source]

Gzips the given string and encodes it in Base64.

Parameters:content – a string with contents
Returns:encoded string
aiida.tools.dbexporters.tcod.encode_textfield_ncr(content)[source]

Encodes the contents for CIF textfield in Numeric Character Reference. Encoded characters:

  • \x09, \x0A, \x0D, \x20-\x7E;
  • ‘;’, if encountered at the beginning of the line;
  • ‘\t’
  • ‘.’ and ‘?’, if they comprise the entire textfield.
Parameters:content – a string with contents
Returns:encoded string
aiida.tools.dbexporters.tcod.encode_textfield_quoted_printable(content)[source]

Encodes the contents for CIF textfield in quoted-printable encoding. In addition to non-ASCII characters, which are encoded by the Python function quopri.encodestring(), the following characters are encoded:

  • ‘;’, if encountered at the beginning of the line;
  • ‘\t’ and ‘\r’;
  • ‘.’ and ‘?’, if they comprise the entire textfield.
Parameters:content – a string with contents
Returns:encoded string
aiida.tools.dbexporters.tcod.export_cif(what, **kwargs)[source]

Exports a given coordinate-containing *Data node to a string in CIF format.

Returns:string with contents of CIF file.
aiida.tools.dbexporters.tcod.export_cifnode(what, parameters=None, trajectory_index=None, store=False, reduce_symmetry=True, **kwargs)[source]

The main exporter function. Exports a given coordinate-containing *Data node to an aiida.orm.data.cif.CifData node, ready to be exported to TCOD. All *Data types having a _get_cif() method are supported, in addition to aiida.orm.data.cif.CifData.

Parameters:
  • what – data node to be exported.
  • parameters – a aiida.orm.data.parameter.ParameterData instance, produced by the same calculation as the original exported node.
  • trajectory_index – a step to be converted and exported in case a aiida.orm.data.array.trajectory.TrajectoryData is exported.
  • store – boolean indicating whether to store intermediate nodes or not. Default False.
  • dump_aiida_database – boolean indicating whether to include the dump of AiiDA database (containing only transitive closure of the exported node). Default True.
  • exclude_external_contents – boolean indicating whether to exclude nodes from the AiiDA database dump that are taken from external repositories and have a URL link allowing their contents to be refetched. Default False.
  • gzip – boolean indicating whether to Gzip large CIF text fields. Default False.
  • gzip_threshold – integer indicating the maximum size (in bytes) of uncompressed CIF text fields when the gzip option is in action. Default 1024.
Returns:

an aiida.orm.data.cif.CifData node.
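A minimal usage sketch, assuming structure is an existing coordinate-containing *Data node (e.g. a StructureData instance):

from aiida.tools.dbexporters.tcod import export_cif, export_cifnode

cif_node = export_cifnode(structure, store=True, reduce_symmetry=True)   # CifData node ready for TCOD
cif_text = export_cif(structure)                                         # CIF file contents as a string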

aiida.tools.dbexporters.tcod.export_values(what, **kwargs)[source]

Exports a given coordinate-containing *Data node to a PyCIFRW CIF data structure.

Returns:CIF data structure.

Note

Requires PyCIFRW.

aiida.tools.dbexporters.tcod.extend_with_cmdline_parameters(parser, expclass='Data')[source]

Provides descriptions of the command line options that are used to control the process of exporting data to TCOD CIF files.

Parameters:
  • parser – an argparse.ArgumentParser instance
  • expclass – name of the exported class to be shown in the help strings for the command line options

Note

This function must not set any default values for the command line options, in order not to clash with other data export plugins.

aiida.tools.dbexporters.tcod.translate_calculation_specific_values(calc, translator, **kwargs)[source]

Translates calculation-specific values from an aiida.orm.implementation.general.calculation.job.AbstractJobCalculation subclass to the appropriate TCOD CIF tags.

Parameters:
  • calc – the calculation (an AbstractJobCalculation subclass instance) whose values are to be translated
  • translator – the BaseTcodtranslator subclass used to perform the translation
Raises:

ValueError – if translator is not derived from the proper class.

TCOD parameter translator documentation
Base class
class aiida.tools.dbexporters.tcod_plugins.BaseTcodtranslator[source]

Base translator from calculation-specific input and output parameters to TCOD CIF dictionary tags. A minimal subclass sketch is given after the method listing below.

classmethod get_BZ_integration_grid_X(calc, **kwargs)[source]

Returns the number of points in the Brillouin zone along reciprocal lattice vector X.

classmethod get_BZ_integration_grid_Y(calc, **kwargs)[source]

Returns the number of points in the Brillouin zone along reciprocal lattice vector Y.

classmethod get_BZ_integration_grid_Z(calc, **kwargs)[source]

Returns the number of points in the Brillouin zone along reciprocal lattice vector Z.

classmethod get_BZ_integration_grid_shift_X(calc, **kwargs)[source]

Returns the shift of the Brillouin zone points along reciprocal lattice vector X.

classmethod get_BZ_integration_grid_shift_Y(calc, **kwargs)[source]

Returns the shift of the Brillouin zone points along reciprocal lattice vector Y.

classmethod get_BZ_integration_grid_shift_Z(calc, **kwargs)[source]

Returns the shift of the Brillouin zone points along reciprocal lattice vector Z.

classmethod get_atom_site_residual_force_Cartesian_x(calc, **kwargs)[source]

Returns a list of the x components of the Cartesian residual forces on the atoms. The list order MUST be the same as in the resulting structure.

classmethod get_atom_site_residual_force_Cartesian_y(calc, **kwargs)[source]

Returns a list of the y components of the Cartesian residual forces on the atoms. The list order MUST be the same as in the resulting structure.

classmethod get_atom_site_residual_force_Cartesian_z(calc, **kwargs)[source]

Returns a list of the z components of the Cartesian residual forces on the atoms. The list order MUST be the same as in the resulting structure.

classmethod get_atom_type_basisset(calc, **kwargs)[source]

Returns a list of basis set names, one for each atom type. The list order MUST be the same as that of get_atom_type_symbol().

classmethod get_atom_type_symbol(calc, **kwargs)[source]

Returns a list of atom types. Each atom type MUST occur only once in this list. The list MUST be sorted.

classmethod get_atom_type_valence_configuration(calc, **kwargs)[source]

Returns the valence configuration of each atom type. The list order MUST be the same as that of get_atom_type_symbol().

classmethod get_computation_wallclock_time(calc, **kwargs)[source]

Returns the computation wallclock time in seconds.

classmethod get_ewald_energy(calc, **kwargs)[source]

Returns Ewald energy in eV.

classmethod get_exchange_correlation_energy(calc, **kwargs)[source]

Returns exchange correlation (XC) energy in eV.

classmethod get_fermi_energy(calc, **kwargs)[source]

Returns Fermi energy in eV.

classmethod get_hartree_energy(calc, **kwargs)[source]

Returns Hartree energy in eV.

classmethod get_integration_Methfessel_Paxton_order(calc, **kwargs)[source]

Returns the order of Methfessel-Paxton approximation if used.

classmethod get_integration_smearing_method(calc, **kwargs)[source]

Returns the smearing method name as string.

classmethod get_integration_smearing_method_other(calc, **kwargs)[source]

Returns the smearing method name as a string if the name differs from those specified in cif_dft.dic.

classmethod get_kinetic_energy_cutoff_EEX(calc, **kwargs)[source]

Returns kinetic energy cutoff for exact exchange (EEX) operator in eV.

classmethod get_kinetic_energy_cutoff_charge_density(calc, **kwargs)[source]

Returns kinetic energy cutoff for charge density in eV.

classmethod get_kinetic_energy_cutoff_wavefunctions(calc, **kwargs)[source]

Returns kinetic energy cutoff for wavefunctions in eV.

classmethod get_number_of_electrons(calc, **kwargs)[source]

Returns the number of electrons.

classmethod get_one_electron_energy(calc, **kwargs)[source]

Returns one electron energy in eV.

classmethod get_pseudopotential_atom_type(calc, **kwargs)[source]

Returns a list of atom types. Each atom type MUST occur only once in this list. List MUST be sorted.

classmethod get_pseudopotential_type(calc, **kwargs)[source]

Returns a list of pseudopotential types. List MUST be sorted by atom types.

classmethod get_pseudopotential_type_other_name(calc, **kwargs)[source]

Returns a list of other pseudopotential type names. List MUST be sorted by atom types.

classmethod get_software_executable_path(calc, **kwargs)[source]

Returns the file-system path to the executable that was run for this computation.

classmethod get_software_package(calc, **kwargs)[source]

Returns the package or program name that was used to produce the structure. Only package or program name should be used, e.g. ‘VASP’, ‘psi3’, ‘Abinit’, etc.

classmethod get_software_package_compilation_timestamp(calc, **kwargs)[source]

Returns the timestamp of package/program compilation in ISO 8601 format.

classmethod get_software_package_version(calc, **kwargs)[source]

Returns software package version used to compute and produce the computed structure file. Only version designator should be used, e.g. ‘3.4.0’, ‘2.1rc3’.

classmethod get_total_energy(calc, **kwargs)[source]

Returns the total energy in eV.
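
As referenced above, a minimal, hypothetical sketch of a translator plugin: a subclass overrides the classmethods relevant for its code, typically reading values from the calculation's parsed output. All names below, such as the 'mycode' package name, the output_parameters link and the 'energy' key, are illustrative assumptions, not part of any shipped plugin.

from aiida.tools.dbexporters.tcod_plugins import BaseTcodtranslator

class MycodeTcodtranslator(BaseTcodtranslator):
    """Illustrative translator for a hypothetical 'mycode' calculation plugin."""

    @classmethod
    def get_software_package(cls, calc, **kwargs):
        # Only the package or program name should be returned here.
        return 'mycode'

    @classmethod
    def get_total_energy(cls, calc, **kwargs):
        # Assumes the parser stored a total energy in eV under the key
        # 'energy' of the calculation's output_parameters node.
        return calc.out.output_parameters.get_dict().get('energy')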

aiida.backends.tests documentation

Tests
aiida.backends.tests.get_db_test_list()[source]

This function returns the db_test_list for the current backend, merged with the ‘common’ tests.

Note: This function should be called only after setting the backend; it then returns the tests for that backend together with the ‘common’ ones.
