QueryBuilder documentation

The general functionality that all query builders need to have is found in this module. AbstractQueryBuilder() is the abstract class for the QueryBuilder classes. A subclass needs to be written for every schema/backend implemented in backends.

class aiida.backends.querybuild.querybuilder_base.AbstractQueryBuilder(*args, **kwargs)[source]

QueryBuilderBase is the base class for the QueryBuilder classes, which are then adapted to the individual schema and ORM used. The general graph-traversal functionality is implemented here; the specific types of node and link are dealt with in the subclasses. In order to load the correct subclass:

from aiida.orm.querybuilder import QueryBuilder
add_filter(tagspec, filter_spec)[source]

Adds a filter to the filters of the query.

Parameters:
  • tagspec – The tag, which has to exist already as a key in self._filters
  • filter_spec – The specifications for the filter, has to be a dictionary
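
Usage (a minimal sketch; the attribute path attributes.energy is a hypothetical example):

qb = QueryBuilder()
qb.append(StructureData, tag='struc')
# Add a filter on the structures, using a hypothetical attribute path:
qb.add_filter('struc', {'attributes.energy': {'>': 0.}})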
add_projection(tag_spec, projection_spec)[source]

Adds a projection

Parameters:
  • tag_spec – A valid specification for a tag
  • projection_spec – The specification for the projection. A projection is a list of dictionaries, with each dictionary containing key-value pairs where the key is a database entity (e.g. a column or an attribute) and the value is (optional) additional information on how to process this database entity.

If the given projection_spec is not a list, it will be expanded to a list. If the list items are not dictionaries but strings (i.e. no additional processing of the projected results is desired), they will be expanded to dictionaries.

Usage:

qb = QueryBuilder()
qb.append(StructureData, tag='struc')

# Will project the uuid and the kinds
qb.add_projection('struc', ['uuid', 'attributes.kinds'])
all(batch_size=None)[source]

Executes the full query, with the order of the rows as returned by the backend. The order inside each row is given by the order of the vertices in the path and the order of the projections for each vertex in the path.

Parameters:batch_size (int) – The size of the batches to ask the backend to batch results in subcollections. You can optimize the speed of the query by tuning this parameter. Leave the default (None) if speed is not critical or if you don’t know what you’re doing!
Returns:a list of lists of all projected entities.
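
Usage (a minimal sketch; the projections and result values are illustrative):

qb = QueryBuilder()
qb.append(Node, project=['id', 'uuid'])
rows = qb.all()
# rows is a list of lists, e.g. [[1234, 'aaaa-bbbb-...'], ...],
# one inner list per matching node, ordered as the projections above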
append(cls=None, type=None, tag=None, autotag=False, filters=None, project=None, subclassing=True, edge_tag=None, edge_filters=None, edge_project=None, **kwargs)[source]

Any iterative procedure to build the path for a graph query needs to invoke this method to append to the path.

Parameters:
  • cls – The AiiDA class (or backend class) defining the appended vertex
  • type – The type of the class, if cls is not given
  • tag – A unique tag. If none is given, the class name will be used. See the keyword autotag to obtain a unique tag automatically.
  • filters – Filters to apply for this vertex. See the usage examples for details.
  • autotag – Whether to search for a unique tag (default False). If True, a unique tag will be found. Cannot be set to True if tag is specified.
  • subclassing – Whether to include subclasses of the given class (default True), e.g. specifying JobCalculation will also include PwCalculation

A small usage example of how this can be invoked:

qb = QueryBuilder()             # Instantiating empty querybuilder instance
qb.append(cls=StructureData)    # First item is StructureData node
# The next node in the path is a PwCalculation,
# with the structure joined as an input
qb.append(
    cls=PwCalculation,
    output_of=StructureData
)
Returns:self
children(**kwargs)[source]

Join to children/descendants of the previous vertex in the path.

Returns:self
count()[source]

Counts the number of rows returned by the backend.

Returns:the number of rows as an integer
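
Usage (a minimal sketch):

# Count all StructureData nodes in the database
qb = QueryBuilder()
qb.append(StructureData)
n_structures = qb.count()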
dict(batch_size=None)[source]

Executes the full query, with the order of the rows as returned by the backend. The order inside each row is given by the order of the vertices in the path and the order of the projections for each vertex in the path.

Parameters:batch_size (int) – The size of the batches to ask the backend to batch results in subcollections. You can optimize the speed of the query by tuning this parameter. Leave the default (None) if speed is not critical or if you don’t know what you’re doing!
Returns:a list of dictionaries of all projected entities. Each dictionary consists of key-value pairs, where the key is the tag of the vertex and the value is a dictionary of key-value pairs in which the key is the entity description (a column name or attribute path) and the value is the corresponding value in the DB.

Usage:

qb = QueryBuilder()
qb.append(
    StructureData,
    tag='structure',
    filters={'uuid':{'==':myuuid}},
)
qb.append(
    Node,
    descendant_of='structure',
    project=['type', 'id'],  # returns type (string) and id (integer)
    tag='descendant'
)

# Return the dictionaries:
print "qb.iterdict()"
for d in qb.iterdict():
    print '>>>', d

results in the following output:

qb.iterdict()
>>> {'descendant': {
        'type': u'calculation.job.quantumespresso.pw.PwCalculation.',
        'id': 7716}
    }
>>> {'descendant': {
        'type': u'data.remote.RemoteData.',
        'id': 8510}
    }
distinct()[source]

Asks for distinct rows. Does not execute the query! If you want a distinct query:

qb = QueryBuilder(**queryhelp)
qb.distinct().all() # or
qb.distinct().get_results_dict()
Returns:self
except_if_input_to(calc_class)[source]

Makes a counterquery based on the current path, selecting only entries that have been input to calc_class

Parameters:calc_class – The calculation class to check against
Returns:self
first()[source]

Executes query asking for one instance. Use as follows:

qb = QueryBuilder(**queryhelp)
qb.first()
Returns:One row of results as a list, ordered by the order of the vertices in the path and of the projections for each vertex
get_alias(tag)[source]

This utility function returns the aliased ORM class for a given tag, so that the user can continue building on the query.

Parameters:tag – The tag for a vertex in the path
Returns:the alias given for that vertex
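
Usage (a sketch; assumes the SQLAlchemy backend, where the alias is an aliased ORM class):

qb = QueryBuilder()
qb.append(StructureData, tag='struc')
struc_alias = qb.get_alias('struc')
# struc_alias can now be used in custom SQLAlchemy expressions,
# e.g. together with inject_query() below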
get_aliases()[source]
Returns:the list of aliases
get_json_compatible_queryhelp()[source]

Makes the queryhelp a JSON-compatible dictionary. In this way, the queryhelp can be stored in a node in the database and later retrieved or shared.

Returns:the json-compatible queryhelp

All classes defined in the input are converted to strings specifying the type, for example:
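
(A sketch; the exact type string for StructureData, here assumed to be 'data.structure.StructureData.', depends on the AiiDA version.)

import json

qb = QueryBuilder()
qb.append(StructureData, project=['uuid'])
qh = qb.get_json_compatible_queryhelp()
# The StructureData class in the path has been replaced by its type string,
# e.g. 'data.structure.StructureData.', so the queryhelp can be serialized:
serialized = json.dumps(qh)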

get_query()[source]

Checks if the query instance is still valid by hashing the queryhelp. If not, invokes QueryBuilderBase._build().

Returns:an instance of sqlalchemy.orm.Query
get_results_dict()[source]

Deprecated, use QueryBuilderBase.dict() or QueryBuilderBase.iterdict() instead

inject_query(query)[source]

Manipulate the query and inject it back. This can be done to add custom filters using SQLAlchemy.

Parameters:query – A sqlalchemy.orm.Query instance
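
Usage (a sketch; assumes the SQLAlchemy backend and that the aliased class exposes an id column):

qb = QueryBuilder()
qb.append(Node, tag='node', project=['id'])
query = qb.get_query()
node_alias = qb.get_alias('node')
# Add a custom SQLAlchemy filter that is not expressed through add_filter:
query = query.filter(node_alias.id > 100)
qb.inject_query(query)
results = qb.all()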

inputs(**kwargs)[source]

Join to inputs of the previous vertex in the path.

Returns:self
iterall(batch_size=100)[source]

Same as QueryBuilderBase.all(), but returns a generator. Be aware that this is only safe if no commit will take place during this transaction. You might also want to read the SQLAlchemy documentation on http://docs.sqlalchemy.org/en/latest/orm/query.html#sqlalchemy.orm.query.Query.yield_per

Parameters:batch_size (int) – The size of the batches to ask the backend to batch results in subcollections. You can optimize the speed of the query by tuning this parameter.
Returns:a generator of lists
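
Usage (a minimal sketch for iterating over large result sets):

qb = QueryBuilder()
qb.append(Node, project=['id', 'ctime'])
for node_id, ctime in qb.iterall(batch_size=500):
    # each iteration yields one row with the projected entities
    pass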
iterdict(batch_size=100)[source]

Same as QueryBuilderBase.dict(), but returns a generator. Be aware that this is only safe if no commit will take place during this transaction. You might also want to read the SQLAlchemy documentation on http://docs.sqlalchemy.org/en/latest/orm/query.html#sqlalchemy.orm.query.Query.yield_per

Parameters:batch_size (int) – The size of the batches to ask the backend to batch results in subcollections. You can optimize the speed of the query by tuning this parameter.
Returns:a generator of dictionaries
limit(limit)[source]

Set the limit (the number of rows to return).

Parameters:limit (int) – the number of rows to return
offset(offset)[source]

Set the offset. If an offset is set, that many rows are skipped before returning results. offset = 0 is the same as not setting an offset. If both offset and limit are given, offset rows are skipped before starting to count the limit rows that are returned.

Parameters:offset (int) – the number of rows to skip
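
Usage (a sketch combining limit and offset, e.g. for paging through results):

qb = QueryBuilder()
qb.append(Node, tag='node', project=['id'])
qb.order_by({'node': ['id']})
qb.offset(20)   # skip the first 20 rows
qb.limit(10)    # return at most 10 rows
page = qb.all()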
order_by(order_by)[source]

Set the entity to order by

Parameters:order_by – This is a list of items, where each item is a dictionary specifies what to sort for an entity

In each dictionary in that list, keys represent valid tags of entities (tables), and values are list of columns.

Usage:

#Sorting by id (ascending):
qb = QueryBuilder()
qb.append(Node, tag='node')
qb.order_by({'node':['id']})

# or
#Sorting by id (ascending):
qb = QueryBuilder()
qb.append(Node, tag='node')
qb.order_by({'node':[{'id':{'order':'asc'}}]})

# for descending order:
qb = QueryBuilder()
qb.append(Node, tag='node')
qb.order_by({'node':[{'id':{'order':'desc'}}]})

# or (shorter)
qb = QueryBuilder()
qb.append(Node, tag='node')
qb.order_by({'node':[{'id':'desc'}]})
outputs(**kwargs)[source]

Join to outputs of the previous vertex in the path.

Returns:self
parents(**kwargs)[source]

Join to parents/ancestors of the previous vertex in the path.

Returns:self
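
Usage (a sketch of the join shortcuts parents()/children()/inputs()/outputs(); assumes keyword arguments are forwarded to append()):

qb = QueryBuilder()
qb.append(StructureData, tag='struc')
# Join all ancestors of the structure and project their ids:
qb.parents(tag='ancestor', project=['id'])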