How to manage your installation#
Managing profiles#
Creating profiles#
Each AiiDA installation can have multiple profiles, each of which can have its own individual database and file repository to store the contents of the provenance graph. Profiles allow you to run multiple projects completely independently from one another with just a single AiiDA installation. At least one profile is required to run AiiDA. A new profile can be created using verdi presto or verdi profile setup; the latter works similarly to the former but gives the user more control.
Listing profiles#
The verdi profile command line interface provides various commands to manage the profiles of an AiiDA installation.
To list the currently configured profiles, use verdi profile list:
Info: configuration folder: /home/user/.virtualenvs/aiida/.aiida
* project-one
project-two
In this particular example, there are two configured profiles, project-one and project-two.
The first one is highlighted and marked with a * symbol, meaning it is the default profile.
A profile being the default simply means that any verdi command is executed for that profile unless another profile is specified.
You can change the profile on a per-call basis with the -p/--profile option.
To change the default profile use verdi profile setdefault PROFILE.
Showing profiles#
Each profile defines various parameters, such as the location of the file repository on the file system and the connection parameters for the database.
To display these parameters, use verdi profile show:
Report: Profile: a-import-sqla
PROFILE_UUID: fede89dae42b4df3bf46ab27e2b500ca
default_user_email: user@email.com
process_control:
    backend: rabbitmq
    config:
        broker_host: 127.0.0.1
        broker_password: guest
        broker_port: 5672
        broker_protocol: amqp
        broker_username: guest
        broker_virtual_host: ''
storage:
    backend: core.psql_dos
    config:
        database_engine: postgresql_psycopg2
        database_hostname: localhost
        database_name: name
        database_password: abc
        database_port: 5432
        database_username: username
        repository_uri: file:///path/to/repository
By default, the parameters of the default profile are shown, but you can pass the name of another profile to change that, e.g., verdi profile show project-two.
Deleting profiles#
A profile can be deleted using the verdi profile delete command.
By default, deleting a profile will also delete its file repository and the database.
This behavior can be changed using the --skip-repository and --skip-db options.
Note
In order to delete the database, the system user needs to have the required rights, which is not always guaranteed depending on the system. In such cases, the database deletion may fail and the user will have to perform the deletion manually through PostgreSQL.
Configuring your installation#
Activating tab-completion#
The verdi command line interface has many commands and parameters, which can be tab-completed to simplify its use.
To enable tab-completion, execute the command that matches your shell:
bash:
eval "$(_VERDI_COMPLETE=bash_source verdi)"
zsh:
eval "$(_VERDI_COMPLETE=zsh_source verdi)"
fish:
eval (env _VERDI_COMPLETE=fish_source verdi)
Place this command in your shell or virtual environment activation script to automatically enable tab completion when opening a new shell or activating an environment. This file is shell specific, but likely one of the following:
- the startup file of your shell (.bashrc, .zshrc, …), if aiida is installed system-wide
- the activators of your virtual environment
- a startup file for your conda environment
Important
After you have added the line to the startup script, make sure to restart the terminal or source the script for the changes to take effect.
Configuring profile options#
AiiDA provides various configuration options for profiles, which can be controlled with the verdi config command.
To view all configuration options set for the current profile:
$ verdi config list
name source value
------------------------------------- -------- ------------
autofill.user.email global abc@test.com
autofill.user.first_name global chris
autofill.user.institution global epfl
autofill.user.last_name global sewell
caching.default_enabled default False
caching.disabled_for default
caching.enabled_for default
daemon.default_workers default 1
daemon.timeout profile 20
daemon.worker_process_slots default 200
db.batch_size default 100000
logging.aiida_loglevel default REPORT
logging.alembic_loglevel default WARNING
logging.circus_loglevel default INFO
logging.db_loglevel default REPORT
logging.kiwipy_loglevel default WARNING
logging.paramiko_loglevel default WARNING
logging.plumpy_loglevel default WARNING
logging.sqlalchemy_loglevel default WARNING
rmq.task_timeout default 10
runner.poll.interval profile 50
transport.task_maximum_attempts global 6
transport.task_retry_initial_interval default 20
verdi.shell.auto_import default
warnings.showdeprecations default True
Configuration option values are taken, in order of priority, from either the profile specific setting, the global setting (applies to all profiles), or the default value.
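The priority order described above can be illustrated with a minimal sketch. This is not AiiDA's implementation; the function name resolve_option and the NOTSET sentinel are hypothetical and only mirror the "source" column and the <NOTSET> marker that verdi config prints. The numbers are the ones used for transport.task_maximum_attempts in the examples of this section.

```python
# Hypothetical sentinel meaning "no value configured at this level".
NOTSET = object()


def resolve_option(default, global_value=NOTSET, profile_value=NOTSET):
    """Return (value, source), giving priority to profile, then global, then default."""
    if profile_value is not NOTSET:
        return profile_value, "profile"
    if global_value is not NOTSET:
        return global_value, "global"
    return default, "default"


# Only the global level is set.
print(resolve_option(5, global_value=6))
# Both levels are set: the profile value wins.
print(resolve_option(5, global_value=20, profile_value=10))
# Nothing is set: the schema default is used.
print(resolve_option(5))
```

A sentinel is used instead of None so that an option could, in principle, be explicitly set to None without being mistaken for "not set".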
You can also filter by a prefix:
$ verdi config list transport
name source value
------------------------------------- -------- ------------
transport.task_maximum_attempts global 6
transport.task_retry_initial_interval default 20
To show the full information for a configuration option or get its current value:
$ verdi config show transport.task_maximum_attempts
schema:
    default: 5
    description: Maximum number of transport task attempts before a Process is Paused.
    minimum: 1
    type: integer
values:
    default: 5
    global: 6
    profile: <NOTSET>
$ verdi config get transport.task_maximum_attempts
6
You can also retrieve the value via the API:
In [1]: from aiida import get_config_option
In [2]: get_config_option('transport.task_maximum_attempts')
Out[2]: 6
To set a value, at the profile or global level:
$ verdi config set transport.task_maximum_attempts 10
Success: 'transport.task_maximum_attempts' set to 10 for 'quicksetup' profile
$ verdi config set --global transport.task_maximum_attempts 20
Success: 'transport.task_maximum_attempts' set to 20 globally
$ verdi config show transport.task_maximum_attempts
schema:
    type: integer
    default: 5
    minimum: 1
    description: Maximum number of transport task attempts before a Process is Paused.
values:
    default: 5
    global: 20
    profile: 10
$ verdi config get transport.task_maximum_attempts
10
Tip
By default, any option set through verdi config will be applied to the current default profile.
To apply it to a different profile, use the -p/--profile option of verdi.
Similarly, to unset a value:
$ verdi config unset transport.task_maximum_attempts
Success: 'transport.task_maximum_attempts' unset for 'quicksetup' profile
$ verdi config unset --global transport.task_maximum_attempts
Success: 'transport.task_maximum_attempts' unset globally
$ verdi config show transport.task_maximum_attempts
schema:
    type: integer
    default: 5
    minimum: 1
    description: Maximum number of transport task attempts before a Process is Paused.
values:
    default: 5
    global: <NOTSET>
    profile: <NOTSET>
$ verdi config get transport.task_maximum_attempts
5
Important
Changes that affect the daemon (e.g. logging.aiida_loglevel) will only take effect after restarting the daemon.
Controlling warnings#
AiiDA may emit warnings for a variety of reasons, for example, warnings when a deprecated part of the code is used.
These warnings are on by default as they provide the user with important information.
The warnings can be turned off using the warnings.showdeprecations config option, for example:
verdi config set warnings.showdeprecations false
Tip
The command above changes the option for the current profile. However, certain warnings are emitted before a profile can be loaded, for example, when certain modules are imported. To also silence these warnings, apply the option globally:
verdi config set warnings.showdeprecations false --global
In addition to the config option, AiiDA also provides the dedicated environment variable AIIDA_WARN_v{version} for deprecation warnings.
Here {version} is the version number in which the deprecated code will be removed, e.g., AIIDA_WARN_v3.
This environment variable can be used to enable deprecation warnings even if warnings.showdeprecations is turned off.
This can be useful to temporarily enable deprecation warnings for a single command, e.g.:
AIIDA_WARN_v3=1 verdi run script.py
Isolating multiple instances#
An AiiDA instance is defined as the installed source code plus the configuration folder that stores the configuration files with all the configured profiles. It is possible to run multiple AiiDA instances on a single machine, simply by isolating the code and configuration in a virtual environment.
To isolate the code, make sure to install AiiDA into a virtual environment, e.g., with conda or venv. Whenever you activate this particular environment, you will be running the particular version of AiiDA (and all the plugins) that you installed specifically for it.
This is separate from the configuration of AiiDA, which is stored in a configuration directory that is always named .aiida and is located in the home directory by default.
Therefore, the default path of the configuration directory is ~/.aiida.
By default, each AiiDA instance (each installation) will store associated profiles in this folder.
A best practice is to always separate the profiles together with the code to which they belong.
The typical approach is to place the configuration folder in the virtual environment itself and have it automatically selected whenever the environment is activated.
The location of the AiiDA configuration folder can be controlled with the AIIDA_PATH environment variable.
This allows us to change the configuration folder automatically by adding a line to the activation script of the virtual environment.
For example, if the path of your virtual environment is /home/user/.virtualenvs/aiida, add the following line:
$ export AIIDA_PATH='/home/user/.virtualenvs/aiida'
Make sure to reactivate the virtual environment, if it was already active, for the changes to take effect.
Note
For conda, create a directory structure etc/conda/activate.d in the root folder of your conda environment (e.g. /home/user/miniconda/envs/aiida), and place a file aiida-init.sh in that folder which exports the AIIDA_PATH.
You can test that everything works by first echoing the environment variable with echo $AIIDA_PATH to confirm it prints the correct path.
Finally, you can check that AiiDA also recognizes the new location of the configuration folder by calling verdi profile list.
This should display the current location of the configuration directory:
Info: configuration folder: /home/user/.virtualenvs/aiida/.aiida
Critical: configuration file /home/user/.virtualenvs/aiida/.aiida/config.json does not exist
You will only see the second line if you haven’t yet set up a profile for this AiiDA instance. For information on setting up a profile, refer to creating profiles.
Besides a single path, the value of AIIDA_PATH can also be a colon-separated list of paths.
AiiDA will go through each of the paths and check whether they contain a configuration directory, i.e., a folder with the name .aiida.
The first configuration directory that is encountered will be used as the configuration directory.
If no configuration directory is found, one will be created in the last path that was considered.
For example, the directory structure in your home folder ~/ might look like this:
.
├── .aiida
└── project_a
    ├── .aiida
    └── subfolder
If you leave the AIIDA_PATH variable unset, the default location ~/.aiida will be used.
However, if you set:
$ export AIIDA_PATH='~/project_a:'
the configuration directory ~/project_a/.aiida will be used.
Warning
If there was no .aiida directory in ~/project_a, AiiDA would have created it for you, so make sure to set the AIIDA_PATH correctly.
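The lookup rules described in this section (default location when AIIDA_PATH is unset, first path that contains a .aiida folder wins, otherwise the last path considered is used) can be sketched as follows. This is an illustration of the documented behavior, not AiiDA's actual code: the function names resolve_config_dir and _expand are hypothetical, and the home directory is passed explicitly so the sketch can be tried on a scratch directory.

```python
from pathlib import Path

CONFIG_DIR_NAME = ".aiida"


def _expand(entry, home):
    """Expand a leading ~ against an explicit home directory (hypothetical helper)."""
    if entry == "~":
        return Path(home)
    if entry.startswith("~/"):
        return Path(home) / entry[2:]
    return Path(entry)


def resolve_config_dir(aiida_path, home):
    """Return the configuration directory selected for a given AIIDA_PATH value."""
    # Empty entries, such as the one produced by a trailing colon, are ignored.
    entries = [entry for entry in (aiida_path or "").split(":") if entry]
    if not entries:
        # AIIDA_PATH unset or empty: fall back to the default location.
        return Path(home) / CONFIG_DIR_NAME
    candidate = None
    for entry in entries:
        candidate = _expand(entry, home) / CONFIG_DIR_NAME
        if candidate.is_dir():
            # The first existing configuration directory wins.
            return candidate
    # None exists: the last path considered is where one would be created.
    return candidate
```

With the directory layout shown above, resolve_config_dir("~/project_a:", home) selects the project_a/.aiida folder, while resolve_config_dir(None, home) selects the .aiida folder directly under home. The sketch only computes the path and never creates a directory.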
Daemon as a service#
The daemon can be set up as a system service, such that it automatically starts at system startup. How to do this is operating-system specific. For Ubuntu, here is a template for the service file and ansible instructions to install the service.
Tuning performance#
AiiDA supports running hundreds of thousands of calculations and graphs with millions of nodes. However, optimal performance at that scale can require tweaking the AiiDA configuration to balance the CPU and disk load.
Below, we share a few practical tips for assessing and tuning AiiDA performance. Further in-depth information is available in the dedicated topic on performance.
Benchmark workflow engine performance
Download the benchmark script, and run it in your AiiDA environment.
sph@citadel:~/$ python performance_benchmark_base.py -n 100
Success: Created and configured temporary `Computer` benchmark-e73b8647 for localhost.
Success: Created temporary `Code` bash for localhost.
Running 100 calculations. [####################################] 100%
Success: All calculations finished successfully.
Elapsed time: 24.90 seconds.
Cleaning up...
12/19/2022 10:57:43 AM <12625> aiida.delete: [REPORT] 400 Node(s) marked for deletion
12/19/2022 10:57:43 AM <12625> aiida.delete: [REPORT] Starting node deletion...
12/19/2022 10:57:43 AM <12625> aiida.delete: [REPORT] Deletion of nodes completed.
Success: Deleted all calculations.
Success: Deleted the created code bash@benchmark-e73b8647.
Success: Deleted the created computer benchmark-e73b8647.
Performance: 0.25 s / process
The output above was generated on an AMD Ryzen 5 3600 6-Core processor (3.6 GHz, 4.2 GHz turbo boost) using AiiDA v2.2.0, and RabbitMQ and PostgreSQL running on the same machine.
Here, 100 ArithmeticAddCalculation processes completed in ~25s, corresponding to 0.25 seconds per process.
If you observe a significantly higher runtime, you may want to check whether any relevant component (CPU, disk, postgresql, rabbitmq) is congested.
Increase the number of daemon workers
By default, the AiiDA daemon only uses a single worker, i.e. a single operating system process.
If verdi daemon status shows the daemon worker constantly at high CPU usage, you can use verdi daemon incr X to add X parallel daemon workers.
Keep in mind that other processes need to run on your computer (e.g. rabbitmq, the PostgreSQL database, …), i.e. it’s a good idea to stop increasing the number of workers before you reach the number of cores of your CPU.
To make the change permanent, set:
verdi config set daemon.default_workers 4
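As a rough illustration of the advice to stay below the number of CPU cores, the following sketch computes an upper bound for the number of daemon workers. The function name max_daemon_workers and the default of two reserved cores (for RabbitMQ, PostgreSQL and the rest of the system) are assumptions made for this example, not AiiDA recommendations; adjust the reserve to what actually runs on your machine.

```python
import os


def max_daemon_workers(cpu_count=None, reserved_cores=2):
    """Suggest an upper bound for the number of daemon workers.

    reserved_cores is a hypothetical safety margin left to other services.
    """
    cores = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    # Always allow at least one worker, even on very small machines.
    return max(1, cores - reserved_cores)


# On a 6-core machine with the default reserve this suggests 4 workers.
print(max_daemon_workers(cpu_count=6))
```

The result is only a ceiling: whether more workers actually help depends on whether the existing workers are CPU-bound, as indicated by verdi daemon status.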
Increase the number of daemon worker slots
Each daemon worker accepts only a limited number of tasks at a time.
If verdi daemon status constantly warns about a high percentage of the available daemon worker slots being used, you can increase the number of tasks handled by each daemon worker (thus increasing the workload per worker).
Increasing it to 1000 should typically work.
Set the corresponding config variable and restart the daemon:
verdi config set daemon.worker_process_slots 1000
Prevent your operating system from indexing the file repository.
Many Linux distributions include the locate command to quickly find files and folders, and run a daily cron job updatedb.mlocate to create the corresponding index.
A large file repository can take a long time to index, up to the point where the hard drive is constantly indexing.
In order to exclude the repository folder from indexing, add its path to the PRUNEPATHS variable in the /etc/updatedb.conf configuration file (use sudo).
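The edit to the configuration file can be sketched as a small text transformation. The sketch assumes the common mlocate layout in which the variable is spelled PRUNEPATHS and holds a double-quoted, space-separated list of directories; the function name add_prune_path is hypothetical, and the function only returns the modified text, so writing it back to /etc/updatedb.conf (with sudo) is left to you.

```python
import re


def add_prune_path(conf_text, path):
    """Return conf_text with path appended to the PRUNEPATHS list, if missing."""
    pattern = re.compile(r'^(PRUNEPATHS\s*=\s*")([^"]*)(")', re.MULTILINE)
    match = pattern.search(conf_text)
    if match is None:
        raise ValueError("no PRUNEPATHS line found")
    paths = match.group(2).split()
    if path in paths:
        # Already excluded: leave the file content untouched.
        return conf_text
    paths.append(path)
    new_line = match.group(1) + " ".join(paths) + match.group(3)
    return conf_text[:match.start()] + new_line + conf_text[match.end():]


example = 'PRUNE_BIND_MOUNTS="yes"\nPRUNEPATHS="/tmp /var/spool"\n'
print(add_prune_path(example, "/path/to/repository"))
```

Because the list is space-separated, this approach does not work for repository paths that contain spaces.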
Move the PostgreSQL database to a fast disk (SSD), ideally on a large partition.
- Stop the AiiDA daemon and back up your database.
- Find the data directory of your postgres installation (something like /var/lib/postgresql/9.6/main, /scratch/postgres/9.6/main, …).
  The best way is to become the postgres UNIX user and enter the postgres shell:
  psql
  SHOW data_directory;
  \q
  If you are unable to enter the postgres shell, try looking for the data_directory variable in a file /etc/postgresql/9.6/main/postgresql.conf or similar.
- Stop the postgres database service:
  service postgresql stop
- Copy all files and folders from the postgres data_directory to the new location:
  cp -a SOURCE_DIRECTORY DESTINATION_DIRECTORY
  Note
  Flag -a will create a directory within DESTINATION_DIRECTORY, e.g.:
  cp -a OLD_DIR/main/ NEW_DIR/
  creates NEW_DIR/main. It will also keep the file permissions (necessary).
  The file permissions of the new and old directory need to be identical (including subdirectories). In particular, the owner and group should both be postgres (except for symbolic links in server.crt and server.key that may or may not be present).
  Note
  If the permissions of these links need to be changed, use the -h option of chown to avoid changing the permissions of the destination of the links. In case you have changed the permission of the links destination by mistake, they should typically be (beware that this might depend on your actual distribution!):
  -rw-r--r-- 1 root root 989 Mar 1 2012 /etc/ssl/certs/ssl-cert-snakeoil.pem
  -rw-r----- 1 root ssl-cert 1704 Mar 1 2012 /etc/ssl/private/ssl-cert-snakeoil.key
- Point the data_directory variable in your postgres configuration file (e.g. /etc/postgresql/9.6/main/postgresql.conf) to the new directory.
- Restart the database daemon:
  service postgresql start
Finally, check that the data directory has indeed changed:
psql
SHOW data_directory;
\q
and try a simple AiiDA query with the new database. If everything went fine, you can delete the old database location.
If you’re still encountering performance issues, the following tips can help with pinpointing performance bottlenecks.
Analyze the RabbitMQ message rate
If you’re observing slow performance of the AiiDA engine, the RabbitMQ management plugin provides an intuitive dashboard that lets you monitor the message rate and check on what the AiiDA engine is up to.
Enable the management plugin via something like:
sudo rabbitmq-plugins enable rabbitmq_management
Then, navigate to http://localhost:15672/ and log in with guest/guest.
Updating your installation#
Whenever updating your AiiDA installation, make sure you follow these instructions very carefully, even when merely upgrading the patch version! Failing to do so may leave your installation in a broken state or, worse, may even damage your data, potentially irreparably.
- Activate the Python environment where AiiDA is installed.
- Finish all running processes.
  All finished processes will be automatically migrated, but it is not possible to resume unfinished processes.
- Stop the daemon using verdi daemon stop.
- Create a backup of your database and repository.
  Warning
  Once you have migrated your database, you can no longer go back to an older version of aiida-core (unless you restore your database and repository from a backup).
- Update your aiida-core installation.
  - If you have installed AiiDA through conda simply run: conda update aiida-core.
  - If you have installed AiiDA through pip simply run: pip install --upgrade aiida-core.
  - If you have installed from the git repository using pip install -e ., first delete all the .pyc files (find . -name "*.pyc" -delete) before updating your branch with git pull.
- Migrate your database with verdi -p <profile_name> storage migrate. Depending on the size of your database and the number of migrations to perform, data migration can take time, so please be patient.
After the database migration finishes, you will be able to continue working with your existing data.
Note
If the update involved a change in the major version number of aiida-core, expect backwards incompatible changes and check whether you also need to update installed plugin packages.
Updating from 0.x.* to 1.*#
Additional instructions on how to migrate from 0.12.x versions.
Additional instructions on how to migrate from versions 0.4 – 0.11.
For a list of breaking changes between the 0.x and the 1.x series of AiiDA, see here.
Updating from 1.* to 2.*#
See the Changelog for a list of breaking changes.
Backing up your data#
General information#
The most convenient way to back up the data of a single AiiDA profile is to use:
$ verdi --profile <profile_name> storage backup /path/to/destination
This command automatically manages a subfolder structure of previous backups, and new backups are created efficiently by using the hard-link functionality of rsync against the previous backup.
The command backs up everything that’s needed to restore the profile later:
- the AiiDA configuration file .aiida/config.json, from which other profiles are removed (see verdi status for exact location);
- all the data of the backed up profile (which depends on the storage backend).
The specific procedure of the command and whether it even is implemented depends on the storage backend.
Note
The verdi storage backup command is implemented to be as safe as possible to use while AiiDA is running, meaning that it will most likely produce an uncorrupted backup even when data is being modified. However, the exact conditions depend on the specific storage backend, so to err on the safe side, only perform a backup when the profile is not in use.
Storage backend specific information#
As an alternative to the CLI command, one can also manually create a backup. This requires a backup of the configuration file .aiida/config.json and the storage backend. The panels below provide instructions for storage backends provided by aiida-core. To determine what storage backend a profile uses, call verdi profile show.
Tip
Before creating a backup, it is recommended to run verdi storage maintain.
This will optimize the storage, which can significantly reduce the time required to create the backup.
For optimal results, run verdi storage maintain --full.
Note that this requires the profile to not be in use by any other process.
The psql_dos storage backend is the default backend for AiiDA.
It stores its data in a PostgreSQL database and a file repository on the local filesystem.
To fully back up the data stored for a profile, you should back up the associated database and file repository.
PostgreSQL database
To export the entire database, we recommend using the pg_dump utility:
pg_dump -h <database_hostname> -p <database_port> -d <database_name> -U <database_username> -W > aiida_backup.psql
The -W flag forces a prompt for the database password.
The parameters between angle brackets should be replaced with the values that have been configured for the profile.
You can retrieve these from the storage.config returned by the verdi profile show command.
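Filling in the placeholders can be sketched as follows. The dictionary keys are the ones shown under storage.config in the verdi profile show output earlier on this page; the function name pg_dump_command and the way the configuration is passed in as a plain dictionary are assumptions made for this example. The sketch only builds the command line as a string and does not run it.

```python
import shlex


def pg_dump_command(storage_config, output_file="aiida_backup.psql"):
    """Build the pg_dump command line for the given storage.config values."""
    args = [
        "pg_dump",
        "-h", str(storage_config["database_hostname"]),
        "-p", str(storage_config["database_port"]),
        "-d", str(storage_config["database_name"]),
        "-U", str(storage_config["database_username"]),
        "-W",  # always prompt for the password instead of embedding it
    ]
    # Quote every argument so unusual database names stay shell-safe.
    return shlex.join(args) + " > " + shlex.quote(output_file)


# Values copied from the example profile shown earlier on this page.
config = {
    "database_hostname": "localhost",
    "database_port": 5432,
    "database_name": "name",
    "database_username": "username",
}
print(pg_dump_command(config))
```

The password is deliberately left out of the command so that it does not end up in the shell history.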
Tip
In order to avoid having to enter your database password each time you run this command, you can create a file .pgpass in your home directory containing your database credentials, as described in the PostgreSQL documentation.
File repository
The file repository is a directory on the local file system.
The most efficient way to create a backup is to use the rsync utility.
The path of the repository is shown in the storage.config.repository_uri key returned by the verdi profile show command.
To create a backup, simply run:
rsync -arvz <storage.config.repository_uri> /some/path/aiida_backup
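Note that repository_uri is reported as a URI (file:///path/to/repository in the example profile shown earlier), whereas rsync expects a plain filesystem path. The following sketch converts one into the other and assembles the command; the function names repository_path and rsync_command are hypothetical, and the command is only built as a string, not executed.

```python
import shlex
from urllib.parse import unquote, urlparse


def repository_path(repository_uri):
    """Convert a file:// URI into a local filesystem path."""
    parsed = urlparse(repository_uri)
    if parsed.scheme != "file":
        raise ValueError(f"not a file URI: {repository_uri!r}")
    # Decode percent-encoded characters such as %20.
    return unquote(parsed.path)


def rsync_command(repository_uri, destination):
    """Build the rsync command line for backing up the repository."""
    return shlex.join(["rsync", "-arvz", repository_path(repository_uri), destination])


print(rsync_command("file:///path/to/repository", "/some/path/aiida_backup"))
```

The same conversion applies in the other direction when copying a backup to the location named by repository_uri.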
Restoring data from a backup#
Restoring a backed up AiiDA profile requires:
- restoring the profile information in the AiiDA config.json file. Simply copy the profiles entry from the backed up config.json to the one of the running AiiDA instance (see verdi status for exact location). Some information (e.g. the database parameters) might need to be updated.
- restoring the data of the backed up profile according to the config.json entry. Like the backup procedure, this is dependent on the storage backend used by the profile.
The panels below provide instructions for storage backends provided by aiida-core.
To determine what storage backend a profile uses, call verdi profile show.
To test if the restoration worked, run verdi -p <profile-name> status to verify that AiiDA can successfully connect to the data storage.
To restore the backed up data for a profile using the core.psql_dos backend, you should restore the associated database and file repository.
PostgreSQL database
To restore the PostgreSQL database from the db.psql file that was backed up, first create an empty database following the instructions described in the installation guide.
The backed up data can then be imported by calling:
psql -h <db_hostname> -p <db_port> -U <db_user> -d <db_name> -W < db.psql
where the parameters need to match the corresponding profile entry in the AiiDA config.json.
File repository
To restore the file repository, simply copy the directory that was backed up to the location indicated in the AiiDA config.json (or the storage.config.repository_uri key returned by the verdi profile show command).
As with the backup process, we recommend using rsync for this:
rsync -arvz /path/to/backup/container <storage.config.repository_uri>
Managing multiple users#
AiiDA currently does not support multiple users running concurrently on the same AiiDA profile.
While AiiDA will tag any node with the User who created it (the default user is specified in the profile), this information is currently not used internally.
In particular, there is currently no permission system in place to limit the operations that can be performed by a given user.
The typical setup involves each user individually installing AiiDA on their operating system account. Data can be shared between private AiiDA profiles through AiiDA’s export and import functionality.
Note that while the configuration file of an AiiDA instance contains access credentials (e.g. for the postgresql database or the rabbitmq service), AiiDA does not store sensitive data in the database or file repository, and AiiDA export archives never contain such data.