Skip to main content

Notice: this Wiki will be going read only early in 2024 and edits will no longer be possible. Please see: https://gitlab.eclipse.org/eclipsefdn/helpdesk/-/wikis/Wiki-shutdown-plan for the plan.

Jump to: navigation, search

VIATRA/Query/UserDocumentation/AdvancedPatterns

< VIATRA‎ | Query
Revision as of 10:59, 5 December 2016 by Harmath.incquerylabs.com (Talk | contribs) (Fix formatting)

Advanced query language topics in VIATRA Query

Working with EMaps in VIATRA Query

The eclipse.org EMF wiki gives a proper FAQ about the various modeling related issues, including the usage of EMaps in your metamodel. With VIATRA Query you can even write your own queries to extract the key-value pairs from your instance model.

EMaps in your metamodel

  1. Creating the actual EMap type: Create an EClass with the name [Type1]To[Type2]Map where [Type1] represents the key's type and the [Type2] represents the value's type.
  2. Set the Instance Type Name property of the newly created EClass to java.util.Map$Entry.
  3. Create an EAttribute or EReference named "key" and set the EDataType or EClass for it.
  4. Create an EAttribute or EReference called "value" and set the EDataType or EClass for it.

For example for an <EString, EString> EMap you would have an EClass named EStringToEStringMap if you follow the mentioned scheme.

To actually use this newly created type follow these steps:

  1. Create an EReference with its EClass set to be the map-entry class you created above.
  2. Set the Containment property of your EReference to be true.
  3. Set the upper-bound of your EReference to be -1 (unbounded).

For example an <EString, EString> map would result an EClass named EStringToEStringMap, if you follow the mentioned scheme.

EMaps in your instance model

The contents of the EMap instances can be modified like in every other instance model. One EStringToEstringMap instance will be used an a map entry (key-value pair).

Querying EMaps from VIATRA Query patterns

Here is an example query to extract the key-value pairs from an EMap:

  pattern emapPattern(K : EString, V : EString) {
    EMapTestElement(M); 
    EMapTestElement.map(M, Map); 
    EStringToEStringMap.key(Map, K);
    EStringToEStringMap.value(Map, V);
  }

Parts of this overview are based on the http://wiki.eclipse.org/index.php/EMF-FAQ#How_do_i_create_a_Map_in_EMF.3F page.


Recursive queries in VIATRA Query

As explained on the query language introduction page, VIATRA Query supports pattern composition via the find keyword. Does it support recursive composition, i.e. a pattern calling itself, or multiple patterns cyclically referencing each other? Yes, it does, albeit with limits. The situation is complicated, as described below; see #Summary and suggestions for an executive summary.

Motivating positive recursion

First of all, there are cases where recursion is plain nonsense, such as this query:

pattern meaningless(x) = { neg find meaningless(x); }

For every choice of value of the variable x, it is a match of pattern meaningless if and only if it is not a match of the same pattern. It is easy to see that this is a contradiction - do not expect VIATRA Query to be useful for evaluating such queries.

To avoid such contradictions, VIATRA Query supports positive recursion only, i.e. patterns referencing themselves or each other cyclically, solely by positive find pattern calls, never by negation (neg find) or aggregation (count find). (In mathematics, this property is called stratification.) Positively recursive queries are always meaningful - unfortunately, they still will not work in all cases, as explained below.

From this point onward, the discussion will be restricted to stratified / positive recursion.

Well-founded recursion

Suppose that we have elements of type Node forming a containment hierarchy of parents and children, and we want to assign them qualified names composed from their simple names and and the name of their parent. Let's see the following recursive pattern:

pattern qualifiedName(node : Node, name) = { 
    // for a child element, compose from parent's qualified name
    find parent(node, parentNode);
    Node.simpleName(node, simpleName);
    find qualifiedName(parent, parentName); // recursive call
    name == eval (parentName + "." + simpleName);
  } or { 
    // for a root element, just use the simple name
    neg find parent(node, _anyParent);  // has no parents
    Node.simpleName(node, name);
  }

This is an example of correct usage of recursion in VIATRA Query.

Take a moment to observe how recursion works here. The pattern qualifiedName recursively calls itself in one of its bodies. This means that the result of this query depends on itself, which is seemingly problematic - however, if we look carefully, we discover that on the level of individual pattern matches (i.e. tuples of nodes and their qualified name), there are no dependency cycles. To elaborate, the match (node, name) does not recursively depend on whether (node, name) is a match; it only depends on whether \code{(parent, name)} is match; which, in turn, will depend on the parent of the parent node, etc. As this dependency relationship follows the parent relationship, which represents a containment tree, there can be no dependency cycles.

In general, VIATRA Query returns correct results for positively recursive queries that are well-founded, i.e. individual matches never support each other cyclically. This is typically found to be the case if the recursion traverses along a containment tree (in either direction), or any graph structure that is known to be a DAG (directed acyclic graph).

Behaviour in the ill-founded case

As an aside, one can draw parallels with imperative programs, where the well-founded property of a recursive subroutine would warrant that the recursion terminates. If a recursive program is not well-founded, the subroutine may not terminate. VIATRA Query, however, is guaranteed to terminate even for recursive queries that are not well-founded; the problem lies elsewhere.

Suppose that we have a bunch of people on Earth, and we know that people called Jane are happy; furthermore, everyone else is happy who knows someone that is happy. Suppose now that there is also a society of Martians, who are Persons as well. There are no Janes on Mars, and no Martians know people on Earth.

  pattern happy(x : Person) = { 
    Person.name(x, "Jane");
  } or { 
    Person.knows(x, y);
    find happy(y);  // ill-founded recursion
  }

Since it is possible to have several people that cyclically know each other (in fact, two people are enough that mutually know each other), the recursion in the above query is not well-founded. Initially, though, the results returned will be correct: everyone on Earth is happy, as everyone knows a Jane transitively, while no Martian will be happy. Errors only pop up after incremental maintenance of results. If, by accident, we set the knows reference of a Martian to point to an Earthling, then suddenly all Martians will become happy as well. Later we realize our mistake and delete this reference - but surprisingly, VIATRA Query will still report that Martians are happy, even though the model was returned into its original state!

The key to the issue is that the final result set, where everyone is happy, is not actually contradicted by the query definition (since everyone knows somebody who is happy). It is said that this incorrect result is still a fixpoint, i.e. a solution to the query; however, it is not the least fixpoint, which would be the actually desirable result. In this case, the least fixpoint would be the original, correct result: everyone on Earth is happy, while nobody on Mars is.

Therefore VIATRA Query can return incorrect results even for positively recursive queries, if the recursion is not well-founded. Fortunately, the error is known not manifest as long as the initial model is unchanged, or there are only additions. However, if there is deletion, movement of elements, or changing attribute or reference values, then it is possible that VIATRA Query will yield a non-minimal fixpoint as result, which is typically not desired.

Summary and suggestions

In summary, VIATRA Query supports positive (stratified) recursion only. Even for positive recursion, correct (minimal fixpoint) results are only guaranteed if the recursion is well-founded (e.g. moves along a containment hierarchy or acyclic graph); otherwise, the results are OK only if the model is guaranteed to only ever change by monotonously inserting new stuff, never deleting, moving or changing.

Note that in many typical cases, the transitive closure operator (e.g. find knows+(x,y);) is sufficient to expressed the desired query, without having to resort to recursions. Transitive closures are successfully evaluated and incrementally maintained by VIATRA Query even in cases where recursion would be ill-founded and fail (e.g. reachability along relationships that may contain cycles). Even in case the recursion is well-founded, the transitive closure operator may or may not lead to better performance. Therefore our primary recommendation is to use transitive closure instead of recursion if possible.


Functional dependencies

The performance of query evaluation may benefit in various ways from knowing the functional dependencies among pattern variables. We say that variables x1, x2, x3, ... xn determine variable y if there can't be more than one value of y given a combination of values for x1, x2, x3, ... xn. In other words, x1, x2, x3, ... xn together uniquely determine y. Yet another way to put it: if two matches of the pattern agree on the values of variables x1, x2, x3, ... xn, they must also agree on the value of y.

In many cases, the recognition of functional dependencies can drastically improve the performance of the evaluation process. It is therefore important to have the dependencies known in case of performance-critical queries.

Automatic inference

Viatra Query has two ways to determine the functional dependencies of your queries: it does its best to automatically infer such dependencies, and you can also help by manually specifying some dependencies (see below). Automatic inference covers cases such as the source of a many-to-one reference uniquely determining its target; or the result of an eval() expression being determined by the variables used in the expression. Since version 1.5, dependencies among parameters for called patterns are also taken into account, though this kind of inference has its limits.

In particular, there are the following two main cases where Viatra is unable to automatically determine functional dependencies:

  • Domain-specific knowledge: such as relative keys, or any other relationship that is not expressed in the metamodel (ecore). Say that Streets contain Houses that have their integer house numbers; in that case it is automatically known that a House determines the Street is resides in (as the containment reference is one-to-many) as well as its own house number; but it requires domain knowledge to understand that a Street and a house number together uniquely determine a House.
  • Disjunctive patterns: as of version 1.5, there is no automatic inference of functional dependencies among parameters of patterns that have multiple pattern bodies in an or relationship.


Manually specifying dependencies (since v1.5)

The @FunctionalDependency annotation can be used inform the query engine about additional functional dependencies that it would be unable to automatically recognize. The annotation is placed on a pattern, and expresses a functional dependency among pattern parameters. Annotation parameters indicate which query parameters determine which other ones. Note that is is not the evaluation of the annotated pattern, but rather other patterns calling it, that can take advantage of the supplied information.

A single occurrence of the annotation expresses a single dependency rule; it is possible to decorate a single pattern with multiple such annotations. Each parameter listed with forEach is taken to appear on the left-hand-side of the dependency (see x1 etc. in the above definition), and parameters listed with unique are on the right-hand-side (like y in the above definition), so that for each combination of values assigned to the "forEach" variables, the value of each "unique" variable has to be unique. See below for examples:

// Here the first annotation is superfluous, as it is inferred automatically anyway
// The second annotation expresses valuable domain knowledge though
@FunctionalDependency(forEach = house, unique = street, unique = houseNumber)
@FunctionalDependency(forEach = street, forEach = houseNumber, unique = house)
pattern address(house: House, street: Street, houseNumber: java Integer) {
	Street.houses(street, house);
	House.number(house, houseNumber); 
}
 
// Houses are either on a Street or on a Road, but not both at the same time;
//  however Viatra is not smart enough (yet) to figure that out.
// In disjunctive patterns, all dependencies have to be specified manually!
@FunctionalDependency(forEach = house, unique = location)
pattern locatedOnThoroughfare(house: House, location: Thoroughfare) {
	Street.houses(location, house);
} or {
	Road.houses(location, house);
}

Back to the top