ICSD database importer

In this section we explain how to import CIF files from the ICSD database using the IcsdDbImporter class.

Before being able to query ICSD, provided by FIZ Karlsruhe, you should have the intranet database installed on a server (http://www2.fiz-karlsruhe.de/icsd_intranet.html). Follow the installation as described in the manual.

It is necessary to know the webpage of the ICSD web interface and have access to the full database from the local machine.

You can either query the MySQL database or the web page. The latter is restricted to a maximum of 1000 search results, which makes it unsuitable for data mining. For this purpose, we recommend you to set up the MySQL connection.

Setup

An instance of the IcsdDbImporter can be created as follows:

from aiida.tools.dbimporters.plugins.icsd import IcsdDbImporter

importer = IcsdDbImporter(server="http://ICSDSERVER.com/", host= "127.0.0.1")

Here is a list of the most important input parameters with an explanation.

For both connection types (web and SQL):

  • server: Address of web interface of the ICSD database. It should contain both the protocol and the domain name, ending with a slash. Example:

    server = "http://ICSDSERVER.com/"
    

The following parameters are required only for the MySQL query:

  • host: Database host name address.

    Tip

    If the database is not hosted on your local machine, it can be useful to create an ssh tunnel to the 3306 port of the database host:

    ssh -L 3306:localhost:3306 username@icsddbhostname.com
    

    If you get an URLError with Errno 111 (Connection refused) when you query the database, try to instead use:

    ssh -L 3306:localhost:3306 -L 8010:localhost:80 username@icsddbhostname.com
    

    The database can then be accessed using “127.0.0.1” as host:

    host = "127.0.0.1"
    
  • user / pass_wd / db / port: Login username, password, name of database, and port of your MySQL database, respectively. If the standard installation of the ICSD intranet version has been followed, the default values should work. Otherwise contact your system administrator to get the required information:

    user = "dba", pass_wd = "sql", db = "icsd", port = 3306
    

Other settings:

  • querydb: If True (default) the MySQL database is queried, otherwise the web page is queried.

A more detailed documentation and additional settings can be found under IcsdDbImporter.

How to do a query

If the setup worked, you can do your first query:

cif_nr_list = ["50542", "617290", "35538"]

queryresults = importer.query(id=cif_nr_list)

All supported keywords can be obtained using:

importer.get_supported_keywords()

More information on the keywords can be found here.

A query returns an instance of IcsdSearchResults.

The IcsdEntry at position i can be accessed using:

queryresults.at(i)

You can also iterate through all query results:

for entry in query_results:
    # do something

Instances of IcsdEntry have the following methods:

The most convenient format can be chosen for further processing.

Full example

Here is a full example how the ICSD importer can be used:

from aiida.tools.dbimporters.plugins.icsd import IcsdDbImporter

cif_nr_list = [
    "50542",
    "617290",
    "35538 ",
    "165226",
    "158366"
]

importer = IcsdDbImporter(server="http://ICSDSERVER.com/", host= "127.0.0.1")

query_results = importer.query(id=cif_nr_list)
for result in query_results:
    print(result.source['db_id'])
    aiida_structure = result.get_aiida_structure()
    # do something with the structure

Troubleshooting: Testing the MySQL connection

To test your MySQL connection, first make sure that you can connect to the 3306 port of the machine hosting the database. If the database is not hosted by your local machine, use the local port tunneling provided by ssh, as follows:

ssh -L 3306:localhost:3306 username@icsddbhostname.com

Note

If you get an URLError with Errno 111 (Connection refused) when you query the database, try to use instead:

ssh -L 3306:localhost:3306 -L 8010:localhost:80 username@icsddbhostname.com

Note

You need an account on the host machine.

Note

There are plenty of explanations online on how to setup a tunnel over a SSH connection using the -L option.

Then open a new verdi shell and type:

In [1]: import MySQLdb

In [2]: db = MySQLdb.connect(host="127.0.0.1", user="dba", passwd="sql", db="icsd", port=3306)

If you do not get an error and it does not timeout, you have successfully established your connection to the MySQL database.