How to manage your installation¶

Managing profiles¶

Creating profiles¶

Each AiiDA installation can have multiple profiles, each of which can have its own individual database and file repository to store the contents of the provenance graph. Profiles allow you to run multiple projects completely independently from one another with just a single AiiDA installation. At least one profile is required to run AiiDA. A new profile can be created using verdi quicksetup or verdi setup; the latter works similarly to the former but gives more control to the user.

Listing profiles¶

The verdi profile command line interface provides various commands to manage the profiles of an AiiDA installation. To list the currently configured profiles, use verdi profile list:

Info: configuration folder: /home/user/.virtualenvs/aiida/.aiida
* project-one
  project-two

In this particular example, there are two configured profiles, project-one and project-two. The first one is highlighted and marked with a * symbol, meaning it is the default profile. A profile being the default simply means that verdi commands are executed for that profile unless another one is specified. You can change the profile on a per-call basis with the -p/--profile option. To change the default profile use verdi profile setdefault PROFILE.

Showing profiles¶

Each profile defines various parameters, such as the location of the file repository on the file system and the connection parameters for the database. To display these parameters, use verdi profile show:

Info: Profile: project-one
----------------------  ------------------------------------------------
aiidadb_backend         django
aiidadb_engine          postgresql_psycopg2
aiidadb_host            localhost
aiidadb_name            aiida_project_one
aiidadb_pass            correcthorsebatterystaple
aiidadb_port            5432
aiidadb_repository_uri  file:///home/user/.virtualenvs/aiida/repository/
aiidadb_user            aiida
default_user_email      user@email.com
options                 {'daemon_default_workers': 3}
profile_uuid            4c272a87d7f543b08da9fe738d88bb13
----------------------  ------------------------------------------------

By default, the parameters of the default profile are shown. To show another profile, pass its name, e.g., verdi profile show project-two.

Deleting profiles¶

A profile can be deleted using the verdi profile delete command. By default, deleting a profile will also delete its file repository and the database. This behavior can be changed using the --skip-repository and --skip-db options.

Note

In order to delete the database, the system user needs to have the required rights, which is not always guaranteed depending on the system. In such cases, the database deletion may fail and the user will have to perform the deletion manually through PostgreSQL.

Configuring your installation¶

Activating tab-completion¶

The verdi command line interface has many commands and parameters, which can be tab-completed to simplify its use. To enable tab-completion, the following shell command should be executed:

$ eval "$(_VERDI_COMPLETE=source verdi)"

Place this command in your shell or virtual environment activation script to automatically enable tab completion when opening a new shell or activating an environment. This file is shell specific, but likely one of the following:

the startup file of your shell (.bashrc, .zshrc, …), if aiida is installed system-wide

the activation script of your virtual environment

a startup file for your conda environment

Important

After you have added the line to the start up script, make sure to restart the terminal or source the script for the changes to take effect.
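
As a minimal sketch of the step above, the following Python helper appends the completion line to a startup file only if it is not already present, so running it twice does not duplicate the entry. The function name ensure_line and the choice of target file are illustrative assumptions, not part of AiiDA; only the completion command itself comes from this page.

```python
from pathlib import Path

# The completion command quoted on this page.
COMPLETION_LINE = 'eval "$(_VERDI_COMPLETE=source verdi)"'


def ensure_line(startup_file, line=COMPLETION_LINE):
    """Append `line` to `startup_file` unless it is already there.

    Returns True if the file was modified, False otherwise.
    `ensure_line` is a hypothetical helper, not an AiiDA function.
    """
    path = Path(startup_file).expanduser()
    content = path.read_text() if path.exists() else ''
    if line in content.splitlines():
        return False
    # Start on a fresh line in case the file lacks a trailing newline.
    separator = '' if content == '' or content.endswith('\n') else '\n'
    with path.open('a') as handle:
        handle.write(separator + line + '\n')
    return True
```

For example, ensure_line('~/.bashrc') would add the line to a bash startup file; restart the terminal or source the file afterwards.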

Configuring profile options¶

AiiDA provides various configurational options for profiles, which can be controlled with the verdi config command.

To view all configuration options set for the current profile:

$ verdi config list
name                                   source    value
-------------------------------------  --------  ------------
autofill.user.email                    global    abc@test.com
autofill.user.first_name               global    chris
autofill.user.institution              global    epfl
autofill.user.last_name                global    sewell
caching.default_enabled                default   False
caching.disabled_for                   default
caching.enabled_for                    default
daemon.default_workers                 default   1
daemon.timeout                         profile   20
daemon.worker_process_slots            default   200
db.batch_size                          default   100000
logging.aiida_loglevel                 default   REPORT
logging.alembic_loglevel               default   WARNING
logging.circus_loglevel                default   INFO
logging.db_loglevel                    default   REPORT
logging.kiwipy_loglevel                default   WARNING
logging.paramiko_loglevel              default   WARNING
logging.plumpy_loglevel                default   WARNING
logging.sqlalchemy_loglevel            default   WARNING
rmq.task_timeout                       default   10
runner.poll.interval                   profile   50
transport.task_maximum_attempts        global    6
transport.task_retry_initial_interval  default   20
verdi.shell.auto_import                default
warnings.showdeprecations              default   True

Configuration option values are taken, in order of priority, from either the profile specific setting, the global setting (applies to all profiles), or the default value.
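
The lookup order can be pictured with a small sketch. This is not AiiDA's implementation; resolve_option and the NOTSET sentinel are hypothetical names used only to illustrate the priority described above.

```python
# Sentinel meaning "no value configured at this level" (hypothetical name).
NOTSET = object()


def resolve_option(default, global_value=NOTSET, profile_value=NOTSET):
    """Return (value, source) following the order profile > global > default."""
    if profile_value is not NOTSET:
        return profile_value, 'profile'
    if global_value is not NOTSET:
        return global_value, 'global'
    return default, 'default'
```

With a default of 5, a global value of 20 and a profile value of 10, resolve_option(5, 20, 10) returns (10, 'profile').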

You can also filter by a prefix:

$ verdi config list transport
name                                   source    value
-------------------------------------  --------  ------------
transport.task_maximum_attempts        global    6
transport.task_retry_initial_interval  default   20

To show the full information for a configuration option or get its current value:

$ verdi config show transport.task_maximum_attempts
schema:
    default: 5
    description: Maximum number of transport task attempts before a Process is Paused.
    minimum: 1
    type: integer
values:
    default: 5
    global: 6
    profile: <NOTSET>
$ verdi config get transport.task_maximum_attempts
6

You can also retrieve the value via the API:

In [1]: from aiida import get_config_option
In [2]: get_config_option('transport.task_maximum_attempts')
Out[2]: 6

To set a value, at the profile or global level:

$ verdi config set transport.task_maximum_attempts 10
Success: 'transport.task_maximum_attempts' set to 10 for 'quicksetup' profile
$ verdi config set --global transport.task_maximum_attempts 20
Success: 'transport.task_maximum_attempts' set to 20 globally
$ verdi config show transport.task_maximum_attempts
schema:
    type: integer
    default: 5
    minimum: 1
    description: Maximum number of transport task attempts before a Process is Paused.
values:
    default: 5
    global: 20
    profile: 10
$ verdi config get transport.task_maximum_attempts
10

Tip

By default any option set through verdi config will be applied to the current default profile. To apply it to another profile, use the -p/--profile option of verdi, e.g. verdi -p PROFILE config set OPTION VALUE.

Similarly to unset a value:

$ verdi config unset transport.task_maximum_attempts
Success: 'transport.task_maximum_attempts' unset for 'quicksetup' profile
$ verdi config unset --global transport.task_maximum_attempts
Success: 'transport.task_maximum_attempts' unset globally
$ verdi config show transport.task_maximum_attempts
schema:
    type: integer
    default: 5
    minimum: 1
    description: Maximum number of transport task attempts before a Process is Paused.
values:
    default: 5
    global: <NOTSET>
    profile: <NOTSET>
$ verdi config get transport.task_maximum_attempts
5

Important

Changes that affect the daemon (e.g. logging.aiida_loglevel) will only take effect after restarting the daemon.

Isolating multiple instances¶

An AiiDA instance is defined as the installed source code plus the configuration folder that stores the configuration files with all the configured profiles. It is possible to run multiple AiiDA instances on a single machine, simply by isolating the code and configuration in a virtual environment.

To isolate the code, make sure to install AiiDA into a virtual environment, e.g., with conda or venv, as described here. Whenever you activate this particular environment, you will be running the particular version of AiiDA (and all the plugins) that you installed specifically for it.

This is separate from the configuration of AiiDA, which is stored in the configuration directory. This directory is always named .aiida and is located in the home directory by default, so its default path is ~/.aiida. By default, each AiiDA instance (each installation) will store associated profiles in this folder. A best practice is to keep the profiles of each instance separate, together with the code to which they belong. The typical approach is to place the configuration folder in the virtual environment itself and have it automatically selected whenever the environment is activated.

The location of the AiiDA configuration folder can be controlled with the AIIDA_PATH environment variable. This allows us to change the configuration folder automatically, by setting the variable in the activation script of a virtual environment. For example, if the path of your virtual environment is /home/user/.virtualenvs/aiida, add the following line:

$ export AIIDA_PATH='/home/user/.virtualenvs/aiida'

Make sure to reactivate the virtual environment, if it was already active, for the changes to take effect.

Note

For conda, create a directory structure etc/conda/activate.d in the root folder of your conda environment (e.g. /home/user/miniconda/envs/aiida), and place a file aiida-init.sh in that folder which exports the AIIDA_PATH.
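
For the conda case, a minimal sketch of creating that hook file from Python is shown here. The helper name write_conda_activation_hook is an assumption for illustration; the directory layout and file name are those given in the note above.

```python
import shlex
from pathlib import Path


def write_conda_activation_hook(env_root, aiida_path):
    """Create etc/conda/activate.d/aiida-init.sh inside a conda environment.

    `env_root` is the root folder of the conda environment and `aiida_path`
    the folder that should contain the `.aiida` configuration directory.
    Hypothetical helper, not part of AiiDA or conda.
    """
    hook_dir = Path(env_root) / 'etc' / 'conda' / 'activate.d'
    hook_dir.mkdir(parents=True, exist_ok=True)
    hook = hook_dir / 'aiida-init.sh'
    # shlex.quote keeps paths with spaces intact when the shell sources the file.
    hook.write_text('export AIIDA_PATH={}\n'.format(shlex.quote(str(aiida_path))))
    return hook
```

Pass an absolute path as aiida_path, and reactivate the environment afterwards so that the hook is sourced.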

You can test that everything works by first echoing the environment variable with echo $AIIDA_PATH to confirm it prints the correct path. Then check that AiiDA also correctly picks up the new location of the configuration folder by calling verdi profile list. This should display the current location of the configuration directory:

Info: configuration folder: /home/user/.virtualenvs/aiida/.aiida
Critical: configuration file /home/user/.virtualenvs/aiida/.aiida/config.json does not exist

You will only see the second line if you haven’t yet set up a profile for this AiiDA instance. For information on setting up a profile, refer to creating profiles.

Besides a single path, the value of AIIDA_PATH can also be a colon-separated list of paths. AiiDA will go through each of the paths and check whether they contain a configuration directory, i.e., a folder with the name .aiida. The first configuration directory that is encountered will be used as the configuration directory. If no configuration directory is found, one will be created in the last path that was considered. For example, the directory structure in your home folder ~/ might look like this:

.
├── .aiida
└── project_a
    ├── .aiida
    └── subfolder

If you leave the AIIDA_PATH variable unset, the default location ~/.aiida will be used. However, if you set:

$ export AIIDA_PATH='~/project_a:'

the configuration directory ~/project_a/.aiida will be used.

Warning

If there was no .aiida directory in ~/project_a, AiiDA would have created it for you, so make sure to set the AIIDA_PATH correctly.
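
The lookup rules just described can be sketched in a few lines of Python. This is an illustration of the documented behavior, not AiiDA's actual code: find_config_dir is a hypothetical function, and the handling of ~ and of empty entries (such as the one produced by a trailing colon) is an assumption. The function only reports the location and does not create anything.

```python
from pathlib import Path

CONFIG_DIRNAME = '.aiida'


def find_config_dir(aiida_path=None):
    """Return the configuration directory selected for an AIIDA_PATH value.

    Sketch of the rules on this page: the first listed path that contains a
    `.aiida` folder wins; otherwise `.aiida` belongs in the last path that
    was considered; with AIIDA_PATH unset, the home directory is used.
    """
    default = Path.home() / CONFIG_DIRNAME
    if not aiida_path:
        return default

    # Assumption: empty entries are skipped and `~` is expanded.
    candidates = [Path(entry).expanduser() for entry in aiida_path.split(':') if entry]
    if not candidates:
        return default

    for base in candidates:
        if (base / CONFIG_DIRNAME).is_dir():
            return base / CONFIG_DIRNAME

    # Nothing found: the folder would be created in the last path considered.
    return candidates[-1] / CONFIG_DIRNAME
```

Checking the returned path before running verdi is one way to avoid the accidental creation of a new configuration directory mentioned in the warning above.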

Daemon as a service¶

The daemon can be set up as a system service, such that it automatically starts at system startup. How to do this is operating system specific. For Ubuntu, here is a template for the service file and ansible instructions to install the service.

Tuning performance¶

AiiDA supports running hundreds of thousands of calculations and graphs with millions of nodes. However, optimal performance at that scale might require some tweaks to the AiiDA configuration to balance the CPU and disk load. Here are a few general tips that might improve the AiiDA performance:

Prevent your operating system from indexing the file repository.

Many Linux distributions include the locate command to quickly find files and folders, and run a daily cron job updatedb.mlocate to create the corresponding index. A large file repository can take a long time to index, up to the point where the hard drive is constantly indexing.

In order to exclude the repository folder from indexing, add its path to the PRUNEPATHS variable in the /etc/updatedb.conf configuration file (use sudo).

Optimize the number of daemon workers

The verdi daemon can manage an arbitrary number of parallel workers; by default only one is activated. If verdi daemon status shows the daemon worker(s) constantly at high CPU usage, use verdi daemon incr X to add X daemon workers. It is recommended that the number of workers does not exceed the number of CPU cores. Ideally, one should use one or two cores fewer than the machine has, to avoid degrading the PostgreSQL database performance.

Move the PostgreSQL database to a fast disk (SSD), ideally on a large partition.

Stop the AiiDA daemon and back up your database.

Find the data directory of your postgres installation (something like /var/lib/postgresql/9.6/main, /scratch/postgres/9.6/main, …).

The best way is to become the postgres UNIX user and enter the postgres shell:

psql
SHOW data_directory;
\q

If you are unable to enter the postgres shell, try looking for the data_directory variable in a file /etc/postgresql/9.6/main/postgresql.conf or similar.

Stop the postgres database service:

service postgresql stop

Copy all files and folders from the postgres data_directory to the new location:

cp -a SOURCE_DIRECTORY DESTINATION_DIRECTORY

Note

Flag -a will create a directory within DESTINATION_DIRECTORY, e.g.:

cp -a OLD_DIR/main/ NEW_DIR/

creates NEW_DIR/main. It will also keep the file permissions (necessary).

The file permissions of the new and old directory need to be identical (including subdirectories). In particular, the owner and group should both be postgres (except for symbolic links in server.crt and server.key that may or may not be present).

Note

If the permissions of these links need to be changed, use the -h option of chown to avoid changing the permissions of the destination of the links. In case you have changed the permission of the links destination by mistake, they should typically be (beware that this might depend on your actual distribution!):

-rw-r--r-- 1 root root 989 Mar  1  2012 /etc/ssl/certs/ssl-cert-snakeoil.pem
-rw-r----- 1 root ssl-cert 1704 Mar  1  2012 /etc/ssl/private/ssl-cert-snakeoil.key

Point the data_directory variable in your postgres configuration file (e.g. /etc/postgresql/9.6/main/postgresql.conf) to the new directory.

Restart the database daemon:

service postgresql start

Finally, check that the data directory has indeed changed:

psql
SHOW data_directory;
\q

and try a simple AiiDA query with the new database. If everything went fine, you can delete the old database location.
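
Returning to the tip on daemon workers: the suggested upper bound can be computed from the number of CPU cores. The sketch below only illustrates that rule of thumb; suggested_max_workers and its reserved_cores parameter are hypothetical names, and the result is a ceiling rather than a recommendation to start that many workers.

```python
import os


def suggested_max_workers(cpu_count=None, reserved_cores=1):
    """Upper bound for daemon workers: core count minus cores left for PostgreSQL.

    Always returns at least 1, since the daemon needs one worker to run.
    """
    if cpu_count is None:
        # os.cpu_count() may return None on unusual platforms; fall back to 1.
        cpu_count = os.cpu_count() or 1
    return max(1, cpu_count - reserved_cores)
```

On an 8-core machine this gives 7, or 6 with reserved_cores=2.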

Updating your installation¶

Whenever updating your AiiDA installation, make sure you follow these instructions very carefully, even when merely upgrading the patch version! Failing to do so may leave your installation in a broken state, or worse, may even damage your data, potentially irreparably.

Activate the Python environment where AiiDA is installed.

Finish all running processes. All finished processes will be automatically migrated, but it is not possible to resume unfinished processes.

Stop the daemon using verdi daemon stop.

Create a backup of your database and repository.

Warning

Once you have migrated your database, you can no longer go back to an older version of aiida-core (unless you restore your database and repository from a backup).

Update your aiida-core installation.

If you have installed AiiDA through conda simply run: conda update aiida-core.

If you have installed AiiDA through pip simply run: pip install --upgrade aiida-core.

If you have installed from the git repository using pip install -e ., first delete all the .pyc files (find . -name "*.pyc" -delete) before updating your branch with git pull.

Run reentry scan to update the cache of registered entry points.

Migrate your database with verdi -p <profile_name> database migrate. Depending on the size of your database and the number of migrations to perform, data migration can take time, so please be patient.

After the database migration finishes, you will be able to continue working with your existing data.

Note

If the update involved a change in the major version number of aiida-core, expect backwards incompatible changes and check whether you also need to update installed plugin packages.

Updating from 0.x.* to 1.*¶

Additional instructions on how to migrate from 0.12.x versions.
Additional instructions on how to migrate from versions 0.4 – 0.11.
For a list of breaking changes between the 0.x and the 1.x series of AiiDA, see here.

Backing up your installation¶

A full backup of an AiiDA instance and AiiDA managed data requires a backup of:

the profile configuration in the config.json file located in the .aiida folder. Typically located at ~/.aiida (see also Setup).
files associated with nodes in the repository folder (one per profile). Typically located in the .aiida folder.
queryable metadata in the PostgreSQL database (one per profile).

Repository backup (`.aiida` folder)¶

For small repositories (with fewer than ~100k files), simply back up the .aiida folder using standard backup software. For example, the rsync utility supports incremental backups, and a backup command might look like rsync -avHzPx (verbose) or rsync -aHzx.

For large repositories with millions of files, even incremental backups can take a significant amount of time. AiiDA provides a helper script that takes advantage of the AiiDA database in order to figure out which files have been added since your last backup. The instructions below explain how to use it:

Configure your backup using verdi -p PROFILENAME devel configure-backup where PROFILENAME is the name of the AiiDA profile that should be backed up. This will ask for information on:

The “backup folder”, where the backup configuration file will be placed. This defaults to a folder named backup_PROFILENAME in your .aiida directory.

The “destination folder”, where the files of the backup will be stored. This defaults to a subfolder of the backup folder, but we strongly suggest backing up to a different drive (see note below).

The configuration step creates two files in the “backup folder”: a backup_info.json configuration file (can also be edited manually) and a start_backup.py script.

Notes on using an SSH mount for the backups (on Linux)

Using the same disk for your backup forgoes protection against the most common cause of data loss: disk failure. One simple option is to use a destination folder mounted over ssh.

On Ubuntu, install sshfs using sudo apt-get install sshfs. Imagine you run your calculations on server_1 and would like to back up regularly to server_2. Mount a server_2 directory on server_1 using the following command on server_1:

sshfs -o idmap=user -o rw backup_user@server_2:/home/backup_user/backup_destination_dir/ /home/aiida_user/remote_backup_dir/

Use gnome-session-properties in the terminal to add this line to the actions performed at start-up. Do not add it to your shell’s startup file (e.g. .bashrc) or your computer will complain that the mount point is not empty whenever you open a new terminal.

Run the start_backup.py script in the “backup folder” to start the backup.

This will back up all data added after the oldest_object_backedup date. It will only carry out a new backup every periodicity days, until a certain end date if specified (using end_date_of_backup or days_to_backup), see this reference page for a detailed description of all options.

Once you’ve checked that it works, make sure to run the script periodically (e.g. using a daily cron job).

Setting up a cron job on Linux

This is a quick note on how to set up a cron job on Linux (you can find many more resources online).

On Ubuntu, you can set up a cron job using:

sudo crontab -u USERNAME -e

It will open an editor, where you can add a line of the form:

00 03 * * * /home/USERNAME/.aiida/backup/start_backup.py 2>&1 | mail -s "Incremental backup of the repository" USER_EMAIL@domain.net

or (if you need to back up a different profile than the default one):

00 03 * * * verdi -p PROFILENAME run /home/USERNAME/.aiida/backup/start_backup.py 2>&1 | mail -s "Incremental backup of the repository" USER_EMAIL@domain.net

This will launch the backup of the repository every day at 3 AM (03:00), and send the output (or any error message) to the email address specified at the end (provided the mail command, from mailutils, is configured appropriately).

Note

You might want to exclude the file repository from any separately set up automatic backups of your home directory.

Database backup¶

PostgreSQL typically spreads database information over multiple files that, if backed up directly, are not guaranteed to restore the database. We therefore strongly recommend periodically dumping the database contents to a file (which you can then back up using your method of choice).

A few useful pointers:

In order to avoid having to enter your database password each time you use the script, you can create a file .pgpass in your home directory containing your database credentials, as described in the PostgreSQL documentation.

In order to dump your database, use the pg_dump utility from PostgreSQL. You can use as a starting example a bash script similar to this file.

You can set up the backup script to run daily using cron (see notes in the previous section).
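
As a hedged starting point for such a script, the sketch below builds a pg_dump command line with a dated output file. The function name build_pg_dump_command, its defaults and the file naming scheme are assumptions made for illustration. The -w flag tells pg_dump never to prompt for a password, so it relies on a .pgpass file as described above.

```python
import datetime
import os


def build_pg_dump_command(dbname, user, host='localhost', port=5432,
                          output_dir='.', today=None):
    """Return the pg_dump argument list for a plain-SQL dump of `dbname`.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    today = today or datetime.date.today()
    filename = '{}-{}.psql'.format(dbname, today.isoformat())
    output_file = os.path.join(output_dir, filename)
    return [
        'pg_dump',
        '-h', host,
        '-p', str(port),
        '-U', user,
        '-w',               # never prompt; credentials come from ~/.pgpass
        '-f', output_file,  # write the dump to this file
        dbname,
    ]
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a machine where pg_dump is installed, for instance with the database name and user reported by verdi profile show.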

Restore backup¶

In order to restore a backup, you will need to:

Restore the repository folder that you backed up earlier in the same location as it used to be (you can check the location in the config.json file inside your .aiida folder, or simply using verdi profile show).

Create an empty database following the instructions described in database, skipping the verdi setup phase. The database should have the same name and database username as the original one (i.e. if you are restoring on the original postgresql cluster, you may have to either rename or delete the original database).

Change directory to the folder containing the database dump created with pg_dump, and load it using the psql command.

Example commands on Linux Ubuntu

This is an example command, assuming that your dump is named aiidadb-backup.psql:

psql -h localhost -U aiida -d aiidadb -f aiidadb-backup.psql

After supplying your database password, the database should be restored. Note that, if you installed the database on Ubuntu as a system service, you need to type sudo su - postgres to become the postgres UNIX user.

Managing multiple users¶

Setups with multiple users for a single AiiDA instance are currently not supported. Instead, each AiiDA user should install AiiDA in a Unix/Windows account on their own computer. Under this account it is possible to configure all the credentials necessary to connect to remote computers. Using independent accounts will ensure that, for instance, the SSH credentials to connect to supercomputers are not shared with others.

Data can be shared between instances using AiiDA’s export and import functionality. Sharing (subsets of) the AiiDA graph can be done as often as needed.
