How to manage your installation#

Managing profiles#

Creating profiles#

Each AiiDA installation can have multiple profiles, each with its own database and file repository to store the contents of the provenance graph. Profiles allow you to run multiple projects completely independently from one another with a single AiiDA installation. At least one profile is required to run AiiDA. A new profile can be created using verdi quicksetup or verdi setup; the latter works similarly to the former but gives more control to the user.

Listing profiles#

The verdi profile command line interface provides various commands to manage the profiles of an AiiDA installation. To list the currently configured profiles, use verdi profile list:

Info: configuration folder: /home/user/.virtualenvs/aiida/.aiida
* project-one
  project-two

In this example, there are two configured profiles, project-one and project-two. The first one is highlighted and marked with a * symbol, meaning it is the default profile. Being the default simply means that any verdi command is executed for that profile unless another profile is specified. You can change the profile on a per-call basis with the -p/--profile option. To change the default profile, use verdi profile setdefault PROFILE.

Showing profiles#

Each profile defines various parameters, such as the location of the file repository on the file system and the connection parameters for the database. To display these parameters, use verdi profile show:

Report: Profile: a-import-sqla
PROFILE_UUID: fede89dae42b4df3bf46ab27e2b500ca
default_user_email: user@email.com
process_control:
    backend: rabbitmq
    config:
        broker_host: 127.0.0.1
        broker_password: guest
        broker_port: 5672
        broker_protocol: amqp
        broker_username: guest
        broker_virtual_host: ''
storage:
    backend: core.psql_dos
    config:
        database_engine: postgresql_psycopg2
        database_hostname: localhost
        database_name: name
        database_password: abc
        database_port: 5432
        database_username: username
        repository_uri: file:///path/to/repository

By default, the parameters of the default profile are shown. To show those of another profile, pass its name, e.g., verdi profile show project-two.

Deleting profiles#

A profile can be deleted using the verdi profile delete command. By default, deleting a profile will also delete its file repository and the database. This behavior can be changed using the --skip-repository and --skip-db options.

Note

In order to delete the database, the system user needs to have the required rights, which is not always guaranteed depending on the system. In such cases, the database deletion may fail and the user will have to perform the deletion manually through PostgreSQL.

Configuring your installation#

Activating tab-completion#

The verdi command line interface has many commands and parameters, which can be tab-completed to simplify its use. To enable tab-completion, the following shell command should be executed (depending on the shell you use):

Enable tab-completion for verdi in one of the following supported shells (bash, zsh, and fish, respectively):

eval "$(_VERDI_COMPLETE=bash_source verdi)"
eval "$(_VERDI_COMPLETE=zsh_source verdi)"
eval (env _VERDI_COMPLETE=fish_source verdi)

Place this command in your shell or virtual environment activation script to automatically enable tab completion when opening a new shell or activating an environment. This file is shell specific, but likely one of the following:

  • the startup file of your shell (.bashrc, .zshrc, …), if aiida is installed system-wide

  • the activators of your virtual environment

  • a startup file for your conda environment

Important

After you have added the line to the startup script, make sure to restart the terminal or source the script for the changes to take effect.

Configuring profile options#

AiiDA provides various configuration options for profiles, which can be controlled with the verdi config command.

To view all configuration options set for the current profile:

$ verdi config list
name                                   source    value
-------------------------------------  --------  ------------
autofill.user.email                    global    abc@test.com
autofill.user.first_name               global    chris
autofill.user.institution              global    epfl
autofill.user.last_name                global    sewell
caching.default_enabled                default   False
caching.disabled_for                   default
caching.enabled_for                    default
daemon.default_workers                 default   1
daemon.timeout                         profile   20
daemon.worker_process_slots            default   200
db.batch_size                          default   100000
logging.aiida_loglevel                 default   REPORT
logging.alembic_loglevel               default   WARNING
logging.circus_loglevel                default   INFO
logging.db_loglevel                    default   REPORT
logging.kiwipy_loglevel                default   WARNING
logging.paramiko_loglevel              default   WARNING
logging.plumpy_loglevel                default   WARNING
logging.sqlalchemy_loglevel            default   WARNING
rmq.task_timeout                       default   10
runner.poll.interval                   profile   50
transport.task_maximum_attempts        global    6
transport.task_retry_initial_interval  default   20
verdi.shell.auto_import                default
warnings.showdeprecations              default   True

Configuration option values are taken, in order of priority, from either the profile specific setting, the global setting (applies to all profiles), or the default value.
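The priority order described above can be pictured with a small, self-contained sketch. It is only an illustration of the lookup order, not AiiDA's actual implementation; the function name resolve_option and the plain dictionaries standing in for the profile, global, and default settings are hypothetical.

```python
from typing import Any, Mapping, Tuple

# Sentinel that distinguishes "option not set" from a legitimate value such as None.
NOTSET = object()


def resolve_option(
    name: str,
    profile_values: Mapping[str, Any],
    global_values: Mapping[str, Any],
    defaults: Mapping[str, Any],
) -> Tuple[str, Any]:
    """Return (source, value) for an option, checking profile, then global, then default."""
    for source, values in (("profile", profile_values), ("global", global_values)):
        value = values.get(name, NOTSET)
        if value is not NOTSET:
            return source, value
    # Hypothetical model: every known option is assumed to have a default value.
    return "default", defaults[name]
```

With a default of 5 and a global value of 6, the sketch returns the global value; adding a profile-specific value would take precedence over both.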

You can also filter by a prefix:

$ verdi config list transport
name                                   source    value
-------------------------------------  --------  ------------
transport.task_maximum_attempts        global    6
transport.task_retry_initial_interval  default   20

To show the full information for a configuration option or get its current value:

$ verdi config show transport.task_maximum_attempts
schema:
    default: 5
    description: Maximum number of transport task attempts before a Process is Paused.
    minimum: 1
    type: integer
values:
    default: 5
    global: 6
    profile: <NOTSET>
$ verdi config get transport.task_maximum_attempts
6

You can also retrieve the value via the API:

In [1]: from aiida import get_config_option
In [2]: get_config_option('transport.task_maximum_attempts')
Out[2]: 6

To set a value, at the profile or global level:

$ verdi config set transport.task_maximum_attempts 10
Success: 'transport.task_maximum_attempts' set to 10 for 'quicksetup' profile
$ verdi config set --global transport.task_maximum_attempts 20
Success: 'transport.task_maximum_attempts' set to 20 globally
$ verdi config show transport.task_maximum_attempts
schema:
    type: integer
    default: 5
    minimum: 1
    description: Maximum number of transport task attempts before a Process is Paused.
values:
    default: 5
    global: 20
    profile: 10
$ verdi config get transport.task_maximum_attempts
10

Tip

By default, any option set through verdi config is applied to the current default profile. To apply it to a different profile, use the -p/--profile option.

Similarly, to unset a value:

$ verdi config unset transport.task_maximum_attempts
Success: 'transport.task_maximum_attempts' unset for 'quicksetup' profile
$ verdi config unset --global transport.task_maximum_attempts
Success: 'transport.task_maximum_attempts' unset globally
$ verdi config show transport.task_maximum_attempts
schema:
    type: integer
    default: 5
    minimum: 1
    description: Maximum number of transport task attempts before a Process is Paused.
values:
    default: 5
    global: <NOTSET>
    profile: <NOTSET>
$ verdi config get transport.task_maximum_attempts
5

Important

Changes that affect the daemon (e.g. logging.aiida_loglevel) will only take effect after restarting the daemon.

Controlling warnings#

AiiDA may emit warnings for a variety of reasons, for example, warnings when a deprecated part of the code is used. These warnings are on by default as they provide the user with important information. The warnings can be turned off using the warnings.showdeprecations config option, for example:

verdi config set warnings.showdeprecations false

Tip

The command above changes the option for the current profile. However, certain warnings are emitted before a profile can be loaded, for example, when certain modules are imported. To also silence these warnings, apply the option globally:

verdi config set warnings.showdeprecations false --global

In addition to the config option, AiiDA also provides the dedicated environment variable AIIDA_WARN_v{version} for deprecation warnings. Here {version} is the version number in which the deprecated code will be removed, e.g., AIIDA_WARN_v3. This environment variable can be used to enable deprecation warnings even if warnings.showdeprecations is turned off. This can be useful to temporarily enable deprecation warnings for a single command, e.g.:

AIIDA_WARN_v3=1 verdi run script.py
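How the config option and the environment variable combine can be summarized in a minimal sketch. This is an illustration only, not AiiDA's implementation: the function name is hypothetical, and treating any non-empty value of the variable as "set" is an assumption.

```python
from typing import Mapping


def deprecation_warnings_enabled(
    show_deprecations: bool, removal_version: int, environ: Mapping[str, str]
) -> bool:
    """Return True if a deprecation warning for the given removal version should be shown."""
    # Name of the environment variable for code that will be removed in `removal_version`.
    env_name = f"AIIDA_WARN_v{removal_version}"
    # Assumption: any non-empty value enables the warning, regardless of the config option.
    return show_deprecations or bool(environ.get(env_name))
```

In practice one would pass os.environ as the environ argument; the sketch takes a mapping so that it can be tried out without touching the real environment.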

Isolating multiple instances#

An AiiDA instance is defined by its configuration directory, which is always named .aiida. It contains the configuration file, which holds all the profile information, daemon log files, and PID files.

Note

Depending on the storage backend, a profile’s data is typically also stored in this directory. However, this is not necessarily always the case, as often the data location can be configured, and some storage backends use a database service that stores the data elsewhere on disk.

The location of the configuration directory is determined as follows:

  1. First, the AIIDA_PATH environment variable is checked, which can be a colon-separated list of directories. The first directory that points to an existing configuration directory is selected. If no existing directories are found, the last directory defined in the variable is used and the configuration directory is created there if it did not already exist.

  2. If AIIDA_PATH is not defined, the current working directory is checked, going up the hierarchy until the first existing configuration directory is encountered.

  3. If no existing configuration directory has been found yet, the .aiida directory in the user’s home folder is used, and is created automatically if it does not already exist.

Examples#

Consider the following directory structure:

~
├── .aiida
├── project_a
│   ├── .aiida
│   └── subfolder
│       └── .aiida
├── project_b
│   ├── .aiida
│   └── subfolder
└── project_c
    └── subfolder

The following table shows the configuration directory that is selected given a certain AIIDA_PATH variable and the current working directory:

Variables                                                 Configuration directory
--------------------------------------------------------  ----------------------------
AIIDA_PATH = '~/project_b/'                               ~/project_b/.aiida
AIIDA_PATH = '~/project_b/.aiida'                         ~/project_b/.aiida
AIIDA_PATH = '~/project_a/.aiida:~/project_b/.aiida'      ~/project_a/.aiida
AIIDA_PATH = '~/project_a/subfolder:~/project_a/.aiida:'  ~/project_a/subfolder/.aiida
CWD = '~/project_a/subfolder'                             ~/project_a/subfolder/.aiida
CWD = '~/project_b/subfolder'                             ~/project_b/.aiida
CWD = '~/project_c/subfolder'                             ~/.aiida
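The lookup rules and the examples above can be sketched in a few lines of Python. This is a simplified model written for illustration, not the code AiiDA itself uses: the function names are hypothetical, paths are assumed to be already expanded (no ~), and the sketch only reports the selected directory without creating it.

```python
from pathlib import Path
from typing import Optional

CONFIG_DIR_NAME = ".aiida"


def _as_config_dir(entry: Path) -> Path:
    """Accept either the .aiida folder itself or the folder that contains it."""
    return entry if entry.name == CONFIG_DIR_NAME else entry / CONFIG_DIR_NAME


def resolve_config_dir(aiida_path: Optional[str], cwd: Path, home: Path) -> Path:
    """Return the configuration directory selected by the documented rules."""
    if aiida_path:
        # Rule 1: colon-separated entries; empty entries (e.g. a trailing colon) are skipped.
        candidates = [_as_config_dir(Path(p)) for p in aiida_path.split(":") if p]
        for candidate in candidates:
            if candidate.is_dir():
                return candidate
        if candidates:
            # No entry exists yet: the last one is where the directory would be created.
            return candidates[-1]
    # Rule 2: walk up from the current working directory.
    for folder in [cwd, *cwd.parents]:
        candidate = folder / CONFIG_DIR_NAME
        if candidate.is_dir():
            return candidate
    # Rule 3: fall back to the .aiida folder in the user's home directory.
    return home / CONFIG_DIR_NAME
```

Calling resolve_config_dir with the AIIDA_PATH value, working directory, and home folder of each table row (using absolute paths) should select the directory listed in the right-hand column.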

Tip

The output of verdi status contains the location of the matched configuration directory.

Daemon as a service#

The daemon can be set up as a system service, so that it starts automatically at system startup. How to do this is operating-system specific. For Ubuntu, here is a template for the service file and Ansible instructions to install the service.

Tuning performance#

AiiDA supports running hundreds of thousands of calculations and graphs with millions of nodes. However, optimal performance at that scale can require tweaking the AiiDA configuration to balance the CPU and disk load.

Below, we share a few practical tips for assessing and tuning AiiDA performance. Further in-depth information is available in the dedicated topic on performance.

Benchmark workflow engine performance

Download the benchmark script and run it in your AiiDA environment.

sph@citadel:~/$ python performance_benchmark_base.py -n 100
    Success: Created and configured temporary `Computer` benchmark-e73b8647 for localhost.
    Success: Created temporary `Code` bash for localhost.
    Running 100 calculations.  [####################################]  100%
    Success: All calculations finished successfully.
    Elapsed time: 24.90 seconds.
    Cleaning up...
    12/19/2022 10:57:43 AM <12625> aiida.delete: [REPORT] 400 Node(s) marked for deletion
    12/19/2022 10:57:43 AM <12625> aiida.delete: [REPORT] Starting node deletion...
    12/19/2022 10:57:43 AM <12625> aiida.delete: [REPORT] Deletion of nodes completed.
    Success: Deleted all calculations.
    Success: Deleted the created code bash@benchmark-e73b8647.
    Success: Deleted the created computer benchmark-e73b8647.
    Performance: 0.25 s / process

The output above was generated on an AMD Ryzen 5 3600 6-Core processor (3.6 GHz, 4.2 GHz turbo boost) using AiiDA v2.2.0, and RabbitMQ and PostgreSQL running on the same machine. Here, 100 ArithmeticAddCalculation processes completed in ~25s, corresponding to 0.25 seconds per process.

If you observe a significantly higher runtime, you may want to check whether any relevant component (CPU, disk, postgresql, rabbitmq) is congested.

Increase the number of daemon workers

By default, the AiiDA daemon only uses a single worker, i.e. a single operating system process. If verdi daemon status shows the daemon worker constantly at high CPU usage, you can use verdi daemon incr X to add X parallel daemon workers.

Keep in mind that other processes need to run on your computer (e.g. rabbitmq, the PostgreSQL database, …), i.e. it’s a good idea to stop increasing the number of workers before you reach the number of cores of your CPU.
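As a rough rule of thumb, the advice above can be turned into a small helper. This is a heuristic sketch and not something provided by AiiDA: the function name is hypothetical, and reserving two cores (for example, one each for RabbitMQ and PostgreSQL) is an assumption that you should adapt to your machine.

```python
import os
from typing import Optional


def suggested_max_workers(cpu_count: Optional[int] = None, reserved_cores: int = 2) -> int:
    """Suggest an upper bound for the number of daemon workers on this machine."""
    # os.cpu_count() may return None if the number of cores cannot be determined.
    cores = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    # Leave some cores for other services, but always allow at least one worker.
    return max(1, cores - reserved_cores)
```

On a 6-core machine this suggests at most 4 workers, leaving headroom for the other services.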

To make the change permanent, set:

verdi config set daemon.default_workers 4

Increase the number of daemon worker slots

Each daemon worker accepts only a limited number of tasks at a time. If verdi daemon status constantly warns about a high percentage of the available daemon worker slots being used, you can increase the number of tasks handled by each daemon worker (thus increasing the workload per worker). Increasing it to 1000 should typically work.

Set the corresponding config variable and restart the daemon:

verdi config set daemon.worker_process_slots 1000

Prevent your operating system from indexing the file repository.

Many Linux distributions include the locate command to quickly find files and folders, and run a daily cron job updatedb.mlocate to create the corresponding index. A large file repository can take a long time to index, up to the point where the hard drive is constantly indexing.

To exclude the repository folder from indexing, add its path to the PRUNEPATHS variable in the /etc/updatedb.conf configuration file (use sudo).

Move the PostgreSQL database to a fast disk (SSD), ideally on a large partition.

  1. Stop the AiiDA daemon and back up your database.

  2. Find the data directory of your postgres installation (something like /var/lib/postgresql/9.6/main, /scratch/postgres/9.6/main, …).

    The best way is to become the postgres UNIX user and enter the postgres shell:

    psql
    SHOW data_directory;
    \q
    

    If you are unable to enter the postgres shell, try looking for the data_directory variable in a file /etc/postgresql/9.6/main/postgresql.conf or similar.

  3. Stop the postgres database service:

    service postgresql stop
    
  4. Copy all files and folders from the postgres data_directory to the new location:

    cp -a SOURCE_DIRECTORY DESTINATION_DIRECTORY
    
    Note

    The -a flag will create a directory within DESTINATION_DIRECTORY, e.g.:

    cp -a OLD_DIR/main/ NEW_DIR/

    creates NEW_DIR/main. It will also keep the file permissions (necessary).

    The file permissions of the new and old directory need to be identical (including subdirectories). In particular, the owner and group should both be postgres (except for symbolic links in server.crt and server.key that may or may not be present).

    Note

    If the permissions of these links need to be changed, use the -h option of chown to avoid changing the permissions of the destination of the links. In case you have changed the permissions of the link destinations by mistake, they should typically be (beware that this might depend on your actual distribution!):

    -rw-r--r-- 1 root root 989 Mar  1  2012 /etc/ssl/certs/ssl-cert-snakeoil.pem
    -rw-r----- 1 root ssl-cert 1704 Mar  1  2012 /etc/ssl/private/ssl-cert-snakeoil.key
    
  5. Point the data_directory variable in your postgres configuration file (e.g. /etc/postgresql/9.6/main/postgresql.conf) to the new directory.

  6. Restart the database daemon:

    service postgresql start
    

Finally, check that the data directory has indeed changed:

psql
SHOW data_directory;
\q

and try a simple AiiDA query with the new database. If everything went fine, you can delete the old database location.
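Before deleting the old location, it can be reassuring to check that the copy really has identical permissions and ownership, as required in step 4. The following is a minimal sketch of such a check using only the Python standard library; the function name is hypothetical, and it would typically need to run as a user that can read both directories (e.g. postgres or root).

```python
import os
from typing import List, Tuple


def permission_mismatches(old_root: str, new_root: str) -> List[Tuple[str, str]]:
    """Compare mode, owner and group of every entry in old_root with its copy in new_root."""
    mismatches: List[Tuple[str, str]] = []

    def compare(relative: str) -> None:
        old_path = os.path.normpath(os.path.join(old_root, relative))
        new_path = os.path.normpath(os.path.join(new_root, relative))
        if not os.path.lexists(new_path):
            mismatches.append((relative, "missing in new location"))
            return
        # lstat inspects symbolic links themselves rather than their targets.
        old_stat, new_stat = os.lstat(old_path), os.lstat(new_path)
        for field in ("st_mode", "st_uid", "st_gid"):
            if getattr(old_stat, field) != getattr(new_stat, field):
                mismatches.append((relative, field))

    compare(".")  # the top-level directories themselves
    # os.walk does not descend into symbolic links to directories by default.
    for dirpath, dirnames, filenames in os.walk(old_root):
        for name in dirnames + filenames:
            compare(os.path.relpath(os.path.join(dirpath, name), old_root))
    return mismatches
```

An empty list means that no differences in mode, owner, or group were found; each tuple otherwise names the relative path and the attribute that differs.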

If you’re still encountering performance issues, the following tips can help with pinpointing performance bottlenecks.

Analyze the RabbitMQ message rate

If you’re observing slow performance of the AiiDA engine, the RabbitMQ management plugin provides an intuitive dashboard that lets you monitor the message rate and check on what the AiiDA engine is up to.

Enable the management plugin via something like:

sudo rabbitmq-plugins enable rabbitmq_management

Then, navigate to http://localhost:15672/ and log in with guest/guest.

Updating your installation#

Whenever updating your AiiDA installation, make sure you follow these instructions very carefully, even when merely upgrading the patch version! Failing to do so may leave your installation in a broken state or, worse, may even damage your data, potentially irreparably.

  1. Activate the Python environment where AiiDA is installed.

  2. Finish all running processes.

    All finished processes will be automatically migrated, but it is not possible to resume unfinished processes.

  3. Stop the daemon using verdi daemon stop.

  4. Create a backup of your database and repository.

    Warning

    Once you have migrated your database, you can no longer go back to an older version of aiida-core (unless you restore your database and repository from a backup).

  5. Update your aiida-core installation.

    • If you have installed AiiDA through conda simply run: conda update aiida-core.

    • If you have installed AiiDA through pip simply run: pip install --upgrade aiida-core.

    • If you have installed from the git repository using pip install -e ., first delete all the .pyc files (find . -name "*.pyc" -delete) before updating your branch with git pull.

  6. Migrate your database with verdi -p <profile_name> storage migrate.

    Depending on the size of your database and the number of migrations to perform, data migration can take time, so please be patient.

After the database migration finishes, you will be able to continue working with your existing data.

Note

If the update involved a change in the major version number of aiida-core, expect backwards incompatible changes and check whether you also need to update installed plugin packages.

Updating from 0.x.* to 1.*#

Updating from 1.* to 2.*#

See the Changelog for a list of breaking changes.

Backing up your installation#

A full backup of an AiiDA instance and AiiDA managed data requires a backup of:

  • the AiiDA configuration folder, which is named .aiida. The location of the folder is shown in the output of verdi status. This folder contains, among other things, the config.json configuration file and log files.

  • the data stored for each profile. Where the data is stored depends on the storage backend used by each profile.

The panels below provide instructions for storage backends provided by aiida-core. To determine what storage backend a profile uses, call verdi profile show.

Tip

Before creating a backup, it is recommended to run verdi storage maintain. This will optimize the storage which can significantly reduce the time required to create the backup. For optimal results, run verdi storage maintain --full. Note that this requires the profile to not be in use by any other process.

The psql_dos storage backend is the default backend for AiiDA. It stores its data in a PostgreSQL database and a file repository on the local filesystem. To fully back up the data stored for a profile, you should back up the associated database and file repository.

PostgreSQL database

To export the entire database, we recommend using the pg_dump utility:

pg_dump -h <database_hostname> -p <database_port> -d <database_name> -U <database_username> -W > aiida_backup.psql

The -W flag forces pg_dump to prompt for the database password. The parameters in angle brackets should be replaced with the values that have been configured for the profile. You can retrieve these from the storage.config section returned by the verdi profile show command.
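Filling in the placeholders by hand is error-prone, so the substitution can be scripted. The sketch below assembles the pg_dump command line from a dictionary that mirrors the storage.config section shown by verdi profile show; the function name is hypothetical, the dictionary has to be filled in by you, and the command is only built as a string, not executed.

```python
import shlex
from typing import Mapping


def pg_dump_command(storage_config: Mapping[str, object], output_file: str = "aiida_backup.psql") -> str:
    """Build the pg_dump command line for the database described by storage_config."""
    args = [
        "pg_dump",
        "-h", str(storage_config["database_hostname"]),
        "-p", str(storage_config["database_port"]),
        "-d", str(storage_config["database_name"]),
        "-U", str(storage_config["database_username"]),
        "-W",  # always prompt for the password instead of embedding it in the command
    ]
    # shlex takes care of quoting values that contain spaces or special characters.
    return f"{shlex.join(args)} > {shlex.quote(output_file)}"
```

The password is deliberately left out of the command; it is entered at the prompt or read from a .pgpass file.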

Tip

In order to avoid having to enter your database password each time you use the script, you can create a file .pgpass in your home directory containing your database credentials, as described in the PostgreSQL documentation.

File repository

The file repository is a directory on the local file system. The most efficient way to create a backup is to use the rsync utility. The path of the repository is shown in the storage.config.repository_uri key returned by the verdi profile show command. Note that this value is a URI: rsync expects a plain path, so drop the leading file:// prefix. To create a backup, simply run:

rsync -arvz <storage.config.repository_uri> /some/path/aiida_backup
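Because repository_uri is given as a file:// URI, it has to be converted to a plain path before it is handed to rsync. A minimal sketch of that conversion with the Python standard library follows; the function names are hypothetical, and the rsync command is only assembled as a string, not executed.

```python
import shlex
from urllib.parse import unquote, urlparse


def repository_path(repository_uri: str) -> str:
    """Convert a file:// URI, as shown by verdi profile show, into a filesystem path."""
    parsed = urlparse(repository_uri)
    if parsed.scheme != "file":
        raise ValueError(f"expected a file:// URI, got: {repository_uri!r}")
    # Decode percent-encoded characters such as %20 for a space.
    return unquote(parsed.path)


def rsync_backup_command(repository_uri: str, destination: str) -> str:
    """Build the rsync command line that copies the repository to destination."""
    return shlex.join(["rsync", "-arvz", repository_path(repository_uri), destination])
```

For the example profile shown earlier, the URI file:///path/to/repository becomes the path /path/to/repository.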

Restoring your installation#

Restoring a backed up AiiDA installation requires:

  • restoring the backed up .aiida folder, with at the very least the config.json file it contains. It should be placed in the path defined by the AIIDA_PATH environment variable. To test the restoration worked, run verdi profile list to verify that all profiles are displayed.

  • restoring the data of each backed up profile. Like the backup procedure, this is dependent on the storage backend used by the profile.

The panels below provide instructions for storage backends provided by aiida-core. To determine what storage backend a profile uses, call verdi profile show.

To fully restore the data stored for a profile using the core.psql_dos backend, you should restore the associated database and file repository.

PostgreSQL database

To restore the PostgreSQL database from the .psql file that was backed up, first create an empty database by following the database creation instructions, skipping the verdi setup phase. The backed up data can then be imported by calling:

psql -h <database_hostname> -p <database_port> -d <database_name> -W < aiida_backup.psql

File repository

To restore the file repository, simply copy the directory that was backed up to the location indicated by the storage.config.repository_uri key returned by the verdi profile show command. As with the backup process, we recommend using rsync for this:

rsync -arvz /some/path/aiida_backup <storage.config.repository_uri>

Managing multiple users#

AiiDA currently does not support multiple users running concurrently on the same AiiDA profile. While AiiDA will tag any node with the User who created it (the default user is specified in the profile), this information is currently not used internally. In particular, there is currently no permission system in place to limit the operations that can be performed by a given user.

The typical setup involves each user individually installing AiiDA on their operating system account. Data can be shared between private AiiDA profiles through AiiDA’s export and import functionality.

Note that while the configuration file of an AiiDA instance contains access credentials (e.g. for the postgresql database or the rabbitmq service), AiiDA does not store sensitive data in the database or file repository, and AiiDA export archives never contain such data.