Skip to main content

Notice: this Wiki will be going read only early in 2024 and edits will no longer be possible. Please see: https://gitlab.eclipse.org/eclipsefdn/helpdesk/-/wikis/Wiki-shutdown-plan for the plan.

Jump to: navigation, search

Difference between revisions of "VIATRA/Query/UserDocumentation/AdvancedPatterns"

< VIATRA‎ | Query
 
(11 intermediate revisions by 2 users not shown)
Line 1: Line 1:
= Working with EMaps in EMF-IncQuery  =
+
{{caution|Old information|This page is not updated anymore; for more up-to-date details look at the language specification at https://www.eclipse.org/viatra/documentation/query-language.html instead.}}
 +
= Advanced query language topics in VIATRA Query =
  
The eclipse.org EMF wiki gives a proper FAQ about the various modeling related issues, including the usage of EMap-s in your metamodel. With EMF-IncQuery you can even write your own queries to extract the key-value pairs from your instance model. <br>
+
== Working with EMaps in VIATRA Query  ==
  
== EMaps in your metamodel ==
+
The eclipse.org EMF wiki gives a proper FAQ about the various modeling related issues, including the usage of EMaps in your metamodel. With VIATRA Query you can even write your own queries to extract the key-value pairs from your instance model. <br>
  
#Creating the actual EMap type Create an EClass with the name [Type1]To[Type2]Map where [Type1] represents the key's type and the [Type2] represents the value's type. <br>  
+
=== EMaps in your metamodel  ===
 +
 
 +
#Creating the actual EMap type: Create an EClass with the name [Type1]To[Type2]Map where [Type1] represents the key's type and the [Type2] represents the value's type. <br>  
 
#Set the Instance Type Name property of the newly created EClass to java.util.Map$Entry. <br>  
 
#Set the Instance Type Name property of the newly created EClass to java.util.Map$Entry. <br>  
 
#Create an EAttribute or EReference named "key" and set the EDataType or EClass for it. <br>  
 
#Create an EAttribute or EReference named "key" and set the EDataType or EClass for it. <br>  
Line 20: Line 23:
 
For example an &lt;EString, EString&gt; map would result an EClass named EStringToEStringMap, if you follow the mentioned scheme.
 
For example an &lt;EString, EString&gt; map would result an EClass named EStringToEStringMap, if you follow the mentioned scheme.
  
== EMaps in your instance model<br>  ==
+
=== EMaps in your instance model<br>  ===
  
 
The contents of the EMap instances can be modified like in every other instance model. One EStringToEstringMap instance will be used an a map entry (key-value pair).  
 
The contents of the EMap instances can be modified like in every other instance model. One EStringToEstringMap instance will be used an a map entry (key-value pair).  
  
== Querying EMaps from EMF-IncQuery patterns ==
+
=== Querying EMaps from VIATRA Query patterns ===
  
Here is an example query to extract the key-value pairs from an EMap<br>  
+
Here is an example query to extract the key-value pairs from an EMap:
 +
<source lang="java">
 +
  pattern emapPattern(K : EString, V : EString) {
 +
    EMapTestElement(M);
 +
    EMapTestElement.map(M, Map);
 +
    EStringToEStringMap.key(Map, K);
 +
    EStringToEStringMap.value(Map, V);
 +
  }
 +
</source>
 +
Parts of this overview are based on the http://wiki.eclipse.org/index.php/EMF-FAQ#How_do_i_create_a_Map_in_EMF.3F page. <br>
  
pattern emapPattern(K&nbsp;: EString, V&nbsp;: EString) {<br>&nbsp;&nbsp; EMapTestElement(M); <br>&nbsp;&nbsp; EMapTestElement.map(M, Map); <br>&nbsp;&nbsp; EStringToEStringMap.key(Map, K);<br>&nbsp;&nbsp; EStringToEStringMap.value(Map, V);<br>}<br>
 
  
Parts of this overview are based on the http://wiki.eclipse.org/index.php/EMF-FAQ#How_do_i_create_a_Map_in_EMF.3F page. <br>
+
== Recursive queries in VIATRA Query  ==
 +
 
 +
As explained on the [[EMFIncQuery/UserDocumentation/QueryLanguage | query language introduction page]], VIATRA Query supports pattern composition via the <code>find</code> keyword. Does it support recursive composition, i.e. a pattern calling itself, or multiple patterns cyclically referencing each other? Yes, it does, albeit with limits. The situation is complicated, as described below; see [[#Summary and suggestions]] for an executive summary.
 +
 
 +
=== Motivating positive recursion ===
 +
 
 +
First of all, there are cases where recursion is plain nonsense, such as this query:
 +
 
 +
<source lang="java">pattern meaningless(x) = { neg find meaningless(x); } </source>
 +
 
 +
For every choice of value of the variable <code>x</code>, it is a match of pattern <code>meaningless</code> if and only if it is not a match of the same pattern. It is easy to see that this is a contradiction - do not expect VIATRA Query to be useful for evaluating such queries.
 +
 
 +
To avoid such contradictions, VIATRA Query supports '''positive recursion''' only, i.e. patterns referencing themselves or each other cyclically, solely by positive <code>find</code> pattern calls, never by negation (<code>neg find</code>) or aggregation (<code>count find</code>). (In mathematics, this property is called [https://en.wikipedia.org/wiki/Stratification_%28mathematics%29 stratification].) Positively recursive queries are always meaningful - unfortunately, they still will not work in all cases, as explained below.
 +
 
 +
From this point onward, the discussion will be restricted to stratified / positive recursion.
 +
 
 +
=== Well-founded recursion ===
 +
 
 +
Suppose that we have elements of type <code>Node</code> forming a containment hierarchy of parents and children, and we want to assign them qualified names composed from their simple names and and the name of their parent. Let's see the following recursive pattern:
 +
 
 +
<source lang="java">
 +
pattern qualifiedName(node : Node, name) = {
 +
    // for a child element, compose from parent's qualified name
 +
    find parent(node, parentNode);
 +
    Node.simpleName(node, simpleName);
 +
    find qualifiedName(parent, parentName); // recursive call
 +
    name == eval (parentName + "." + simpleName);
 +
  } or {
 +
    // for a root element, just use the simple name
 +
    neg find parent(node, _anyParent);  // has no parents
 +
    Node.simpleName(node, name);
 +
  }
 +
</source>
 +
 
 +
This is an example of correct usage of recursion in VIATRA Query.
 +
 
 +
Take a moment to observe how recursion works here. The pattern <code>qualifiedName</code> recursively calls itself in one of its bodies. This means that the result of this query depends on itself, which is seemingly problematic - however, if we look carefully, we discover that on the level of individual pattern matches (i.e. tuples of nodes and their qualified name), there are no dependency cycles. To elaborate, the match <code>(node, name)</code> does not recursively depend on whether <code>(node, name)</code> is a match; it only depends on whether \code{(parent, name)} is match; which, in turn, will depend on the parent of the parent node, etc. As this dependency relationship follows the <code>parent</code> relationship, which represents a containment tree, there can be no dependency cycles.
 +
 
 +
In general, VIATRA Query returns correct results for positively recursive queries that are '''well-founded''', i.e. individual matches never support each other cyclically. This is typically found to be the case if the recursion traverses along a containment tree (in either direction), or any graph structure that is known to be a DAG (directed acyclic graph).
 +
 
 +
==== Optional reading: problems in the ill-founded case ====
 +
 
 +
As an aside, one can draw parallels with imperative programs, where the well-founded property of a recursive subroutine would warrant that the recursion terminates. If a recursive program is not well-founded, the subroutine may not terminate. VIATRA Query, however, is guaranteed to terminate even for recursive queries that are not well-founded; the problem lies elsewhere.
 +
 
 +
Suppose that we have a bunch of people on Earth, and we know that people called ''Jane'' are happy; furthermore, everyone else is happy who knows someone that is happy. Suppose now that there is also a society of Martians, who are Persons as well. There are no Janes on Mars, and no Martians know people on Earth.
 +
<source lang="java">
 +
  pattern happy(x : Person) = {
 +
    Person.name(x, "Jane");
 +
  } or {
 +
    Person.knows(x, y);
 +
    find happy(y);  // ill-founded recursion
 +
  }
 +
</source>
 +
Since it is possible to have several people that cyclically know each other (in fact, two people are enough that mutually know each other), the recursion in the above query is not well-founded. Initially, though, the results returned will be correct: everyone on Earth is happy, as everyone knows a Jane transitively, while no Martian will be happy. Errors only pop up after incremental maintenance of results. If, by accident, we set the <code>knows</code> reference of a Martian to point to an Earthling, then suddenly all Martians will become happy as well. Later we realize our mistake and delete this reference - but surprisingly, VIATRA Query will still report that Martians are happy, even though the model was returned into its original state!
 +
 
 +
The key to the issue is that the final result set, where everyone is happy, is not actually contradicted by the query definition (since everyone knows somebody who is happy). It is said that this incorrect result is still a fixpoint, i.e. a solution to the query; however, it is not the '''least fixpoint''', which would be the actually desirable result. In this case, the least fixpoint would be the original, correct result: everyone on Earth is happy, while nobody on Mars is.
 +
 
 +
Therefore VIATRA Query, in its default mode of operation, can return incorrect results even for positively recursive queries, if the recursion is not well-founded. Fortunately, the error is known not manifest as long as the initial model is unchanged, or there are only additions. However, if there is deletion, movement of elements, or changing attribute or reference values, then it is possible that VIATRA Query will yield a non-minimal fixpoint as result, which is typically not desired.
 +
 
 +
Fortunately, there is a solution!
 +
 
 +
=== Delete and REDerive: conquering the ill-founded case ===
 +
 
 +
Since the 1.6 version, VIATRA Queries supports ''Delete and REDerive'' evaluation in the query engine. This evaluation strategy makes it possible to correctly compute the results of ''recursive graph patterns'' on ''instance models that contain cycles'' (i.e. when the recursion is ill-founded). Prior versions of VIATRA Queries supported only scenarios where at least one of the cycles was missing, that is, either the patterns were not recursive or the instance models were acyclic.
 +
 
 +
As of now, the Delete and REDerive evaluation can be manually enabled using the query evaluation hint <code>ReteHintOptions.deleteRederiveEvaluation</code>.
 +
 
 +
==== Optional reading: under the hood ====
 +
We demonstrate the problems of the old execution mode and the DRED solution by a concrete example.
 +
 +
Suppose that once in a while, people share secrets with each other. For the sake of the example, imagine that if a person is in a "talks to" relationship with another, then that person will also share his/her secret with the other person. The other person will eventually also share the previous person's secrets with others, that is, the sharing of secrets is transitive. In our example, it is also possible that a person revokes a secret, and, by that, the secret will be/should be also forgotten by all people that heard about that secret.
 +
 +
Given these assumptions, let’s model some real people and their secrets. Assume that we have four people Ann (A), Bill (B), Jane (J), and Mike (M), and we have the following talks to relationships: A -> B, B -> J, J -> M, J -> B. The four people also have some secrets, four numbers, these are respectively A - 1, B - 2, J - 3, and M - 4. In this initial setup, Ann does not know any secrets, but the others know everybody's secrets (including Ann's).
 +
 +
We can encode the secret sharing with VIATRA Queries graph patterns as follows:
 +
<source lang="java">
 +
// Directly known secrets by the given person through the talks to relationship
 +
pattern directSecrets(person : Person, secret : EString) {
 +
Person(other);
 +
Person.talksTo(other, person);
 +
Person.secret(other, secret);
 +
}
 +
 
 +
// Directly or transitively known secrets by the given person
 +
pattern allSecrets(person : Student, secret : EString) {
 +
find directSecrets(person, secret);
 +
} or {
 +
Person(other);
 +
Person.talksTo(other, person);
 +
find allSecrets(other, secret);
 +
}
 +
</source>
 +
 +
We can observe that the allSecrets pattern is recursive, and that the input model has a cycle through the "talks to" relationship. We encourage you to actually model this scenario in VIATRA Queries, and observe what happens if you DELETE the A -> B edge, that is, the scenario when Ann does not want to share her secret anymore. We would expect that the VIATRA Queries evaluation will derive that Ann's secret will be forgotten by the others (as it should be according to our example). However, this is not the case, Ann's secret is still known by everybody else. What has just happened?
 +
 +
In order to better understand what is going on under the hood, we need to introduce the notion of an ''alternative derivation'' of a tuple. Lets focus on the [Bill, 1] tuple which represents that Bill knows Ann's secret. Before the deletion of the A -> B edge, this tuple had two alternative derivations. One of them directly came from Ann because she shared her secret with Bill by directly talking to him. Bill then shared this secret with Jane, Jane with Mike, and Mike with Bill again, that is, Bill got to know Ann's secret through another alternative source, specifically, through Mike. Intuitively this means that two people shared Ann's secret with Bill, even though Mike got to know that secret through Bill himself. More formally, one of the derivations of the [Bill, 1] tuple is derived from the path A -> B, while the other is from A -> B -> J -> M -> B. Now, if we delete the A -> B edge, Ann's secret only loses one alternative derivation, but another one still remains because Bill relies on the information what Mike told him, while Mike relies on Jane, and, finally, Jane relies on Bill. What has happened is that the people in the cyclic "talks to" relationship are reinforcing each other in some false information (what is actually not true anymore). Because one alternative derivation remained, Bill is not forgetting Ann's secret, even though, he should (!), any, by that, all the others also keep that secret to themselves.
 +
 +
The Delete and REDerive evaluation mode helps in correctly computing the results in scenarios like this. The difference in the evaluation is as follows. When the A -> B edge is deleted, we decrement the counter of alternative derivations at Bill for Ann's secret from 2 to 1, ''but'' instead of concluding that Ann's secret is still known because of the remaining derivation, we kind of put the remaining derivation onto the side and temporarily forget about it. We do that because we want to see if that alternative still holds, and we do not want to falsely reinforce anybody by using that alternative. First, we let all the deletions to purge whatever needs to be purged, and only then start re-deriving information from what has survived the delete phase. What this means is that upon the deletion of the A -> B edge, Bill will say that he also does not know Ann's secret anymore (even though he has put aside the fact that he heard it from Mike). In response to that, Jane will also say that she does not know the secret, and, finally, Mike will also revoke his knowledge about that. The last bit is crucial because that one invalidates Bill's alternative information that was put aside before. The deletion phase has ended, and no tuples remained in the temporary store, which also means that we cannot re-derive anything. The evaluation has correctly derived that nobody knows Ann's secret once she is not talking to Bill anymore.
 +
 +
There are some important things to note:
 +
* The first one is related to the non-DRED evaluation. The VIATRA Queries engine propagates tuples as long as the results of some pattern(s) change, that is, until a fixpoint is reached. When we concluded that after the deletion of A -> B everybody still knows Ann's secret, the engine has reached a fixpoint, but it was not the LEAST (or minimal) fixpoint. Intuitively, we associated the non-minimal fixpoint to a wrong pattern result.
 +
* Another important aspect is that Delete and REDerive evaluation is not required if the model is changed only through insertions even if we have both kinds of cycles (patterns + instance models). This is because insertions are just expanding the results of patterns, and the previously explained cyclic reinforcement is not an issue in this case. 
 +
* Note that for the very common special case of transitive closures, the dedicated language element (transitive pattern call) is still likely to be more efficient than the DRED-based recursive solutions.
 +
* With a small penalty in execution time, DRed guarantees correct result maintenance for ''arbitrary'' recursive pattern structures as long as all recursive calls are positive (i.e. no negation or aggregation along cycles of recursion occur).
 +
 
 +
 
 +
=== Summary and suggestions ===
 +
In summary, VIATRA Query supports positive (stratified) recursion only. Even for positive recursion, correct (minimal fixpoint) results are only guaranteed if either (i) we enable the new DRED mode, at a performance cost, or (ii) the recursion is well-founded (e.g. moves along a containment hierarchy or acyclic graph). Otherwise (in default mode, with ill-founded recursion), the results are OK only if the model is guaranteed to only ever change by monotonously inserting new stuff, never deleting, moving or changing.
 +
 
 +
Note that in many typical cases, the [[EMFIncQuery/UserDocumentation/QueryLanguage | transitive closure operator]] (e.g. <code>find knows+(x,y);</code>) is sufficient to expressed the desired query, without having to resort to recursions. Transitive closures are successfully evaluated and incrementally maintained by VIATRA Query even in cases where recursion would be ill-founded and fail (e.g. reachability along relationships that may contain cycles). Even in case the recursion is well-founded, the transitive closure operator may or may not lead to better performance. Therefore our primary recommendation is to '''use transitive closure instead of recursion if possible'''.
 +
 
 +
== Functional dependencies ==
 +
 
 +
The performance of query evaluation may benefit in various ways from knowing the '''functional dependencies''' among pattern variables. We say that variables ''x1'', ''x2'', ''x3'', ... ''xn'' determine variable ''y'' if there can't be more than one value of ''y'' given a combination of values for ''x1'', ''x2'', ''x3'', ... ''xn''. In other words, ''x1'', ''x2'', ''x3'', ... ''xn'' together uniquely determine ''y''. Yet another way to put it: if two matches of the pattern agree on the values of variables ''x1'', ''x2'', ''x3'', ... ''xn'', they must also agree on the value of ''y''.  <br>
 +
 
 +
In many cases, the recognition of functional dependencies can drastically improve the performance of the evaluation process. It is therefore important to have the dependencies known in case of performance-critical queries.
 +
 
 +
=== Automatic inference ===
 +
Viatra Query has two ways to determine the functional dependencies of your queries: it does its best to automatically infer such dependencies, and you can also help by manually specifying some dependencies (see below). Automatic inference covers cases such as the source of a many-to-one reference uniquely determining its target; or the result of an <code>eval()</code> expression being determined by the variables used in the expression. Since version 1.5, dependencies among parameters for called patterns are also taken into account, though this kind of inference has its limits.
 +
 
 +
In particular, there are the following two main cases where Viatra is unable to automatically determine functional dependencies:
 +
<ul>
 +
<li>'''Domain-specific knowledge''': such as relative keys, or any other relationship that is not expressed in the metamodel (ecore). Say that Streets contain Houses that have their integer house numbers; in that case it is automatically known that a House determines the Street is resides in (as the containment reference is one-to-many) as well as its own house number; but it requires domain knowledge to understand that a Street and a house number together uniquely determine a House.  </li>
 +
<li>'''Disjunctive patterns''': as of version 1.5, there is no automatic inference of functional dependencies among parameters of patterns that have multiple pattern bodies in an <code>or</code> relationship.  </li>
 +
</ul>
 +
 
 +
 
 +
=== Manually specifying dependencies (since v1.5) ===
 +
 
 +
The <code>@FunctionalDependency</code> annotation can be used inform the query engine about additional functional dependencies that it would be unable to automatically recognize. The annotation is placed on a pattern, and expresses a functional dependency among pattern parameters. Annotation parameters indicate which query parameters determine which other ones. Note that is is not the evaluation of the annotated pattern, but rather other patterns calling it, that can take advantage of the supplied information. 
 +
 
 +
A single occurrence of the annotation expresses a single dependency rule; it is possible to decorate a single pattern with multiple such annotations. Each parameter listed with <code>forEach</code> is taken to appear on the left-hand-side of the dependency (see ''x1'' etc. in the above definition), and parameters listed with <code>unique</code> are on the right-hand-side (like ''y'' in the above definition), so that for each combination of values assigned to the "forEach" variables, the value of each "unique" variable has to be unique. See below for examples:
 +
 
 +
<source lang="java">
 +
// Here the first annotation is superfluous, as it is inferred automatically anyway
 +
// The second annotation expresses valuable domain knowledge though
 +
@FunctionalDependency(forEach = house, unique = street, unique = houseNumber)
 +
@FunctionalDependency(forEach = street, forEach = houseNumber, unique = house)
 +
pattern address(house: House, street: Street, houseNumber: java Integer) {
 +
Street.houses(street, house);
 +
House.number(house, houseNumber);
 +
}
 +
 
 +
// Houses are either on a Street or on a Road, but not both at the same time;
 +
//  however Viatra is not smart enough (yet) to figure that out.
 +
// In disjunctive patterns, all dependencies have to be specified manually!
 +
@FunctionalDependency(forEach = house, unique = location)
 +
pattern locatedOnThoroughfare(house: House, location: Thoroughfare) {
 +
Street.houses(location, house);
 +
} or {
 +
Road.houses(location, house);
 +
}
 +
</source>

Latest revision as of 15:15, 14 November 2017

Stop.png
Old information
This page is not updated anymore; for more up-to-date details look at the language specification at https://www.eclipse.org/viatra/documentation/query-language.html instead.

Advanced query language topics in VIATRA Query

Working with EMaps in VIATRA Query

The eclipse.org EMF wiki gives a proper FAQ about the various modeling related issues, including the usage of EMaps in your metamodel. With VIATRA Query you can even write your own queries to extract the key-value pairs from your instance model.

EMaps in your metamodel

  1. Creating the actual EMap type: Create an EClass with the name [Type1]To[Type2]Map where [Type1] represents the key's type and the [Type2] represents the value's type.
  2. Set the Instance Type Name property of the newly created EClass to java.util.Map$Entry.
  3. Create an EAttribute or EReference named "key" and set the EDataType or EClass for it.
  4. Create an EAttribute or EReference called "value" and set the EDataType or EClass for it.

For example for an <EString, EString> EMap you would have an EClass named EStringToEStringMap if you follow the mentioned scheme.

To actually use this newly created type follow these steps:

  1. Create an EReference with its EClass set to be the map-entry class you created above.
  2. Set the Containment property of your EReference to be true.
  3. Set the upper-bound of your EReference to be -1 (unbounded).

For example an <EString, EString> map would result an EClass named EStringToEStringMap, if you follow the mentioned scheme.

EMaps in your instance model

The contents of the EMap instances can be modified like in every other instance model. One EStringToEstringMap instance will be used an a map entry (key-value pair).

Querying EMaps from VIATRA Query patterns

Here is an example query to extract the key-value pairs from an EMap:

  pattern emapPattern(K : EString, V : EString) {
    EMapTestElement(M); 
    EMapTestElement.map(M, Map); 
    EStringToEStringMap.key(Map, K);
    EStringToEStringMap.value(Map, V);
  }

Parts of this overview are based on the http://wiki.eclipse.org/index.php/EMF-FAQ#How_do_i_create_a_Map_in_EMF.3F page.


Recursive queries in VIATRA Query

As explained on the query language introduction page, VIATRA Query supports pattern composition via the find keyword. Does it support recursive composition, i.e. a pattern calling itself, or multiple patterns cyclically referencing each other? Yes, it does, albeit with limits. The situation is complicated, as described below; see #Summary and suggestions for an executive summary.

Motivating positive recursion

First of all, there are cases where recursion is plain nonsense, such as this query:

pattern meaningless(x) = { neg find meaningless(x); }

For every choice of value of the variable x, it is a match of pattern meaningless if and only if it is not a match of the same pattern. It is easy to see that this is a contradiction - do not expect VIATRA Query to be useful for evaluating such queries.

To avoid such contradictions, VIATRA Query supports positive recursion only, i.e. patterns referencing themselves or each other cyclically, solely by positive find pattern calls, never by negation (neg find) or aggregation (count find). (In mathematics, this property is called stratification.) Positively recursive queries are always meaningful - unfortunately, they still will not work in all cases, as explained below.

From this point onward, the discussion will be restricted to stratified / positive recursion.

Well-founded recursion

Suppose that we have elements of type Node forming a containment hierarchy of parents and children, and we want to assign them qualified names composed from their simple names and and the name of their parent. Let's see the following recursive pattern:

pattern qualifiedName(node : Node, name) = { 
    // for a child element, compose from parent's qualified name
    find parent(node, parentNode);
    Node.simpleName(node, simpleName);
    find qualifiedName(parent, parentName); // recursive call
    name == eval (parentName + "." + simpleName);
  } or { 
    // for a root element, just use the simple name
    neg find parent(node, _anyParent);  // has no parents
    Node.simpleName(node, name);
  }

This is an example of correct usage of recursion in VIATRA Query.

Take a moment to observe how recursion works here. The pattern qualifiedName recursively calls itself in one of its bodies. This means that the result of this query depends on itself, which is seemingly problematic - however, if we look carefully, we discover that on the level of individual pattern matches (i.e. tuples of nodes and their qualified name), there are no dependency cycles. To elaborate, the match (node, name) does not recursively depend on whether (node, name) is a match; it only depends on whether \code{(parent, name)} is match; which, in turn, will depend on the parent of the parent node, etc. As this dependency relationship follows the parent relationship, which represents a containment tree, there can be no dependency cycles.

In general, VIATRA Query returns correct results for positively recursive queries that are well-founded, i.e. individual matches never support each other cyclically. This is typically found to be the case if the recursion traverses along a containment tree (in either direction), or any graph structure that is known to be a DAG (directed acyclic graph).

Optional reading: problems in the ill-founded case

As an aside, one can draw parallels with imperative programs, where the well-founded property of a recursive subroutine would warrant that the recursion terminates. If a recursive program is not well-founded, the subroutine may not terminate. VIATRA Query, however, is guaranteed to terminate even for recursive queries that are not well-founded; the problem lies elsewhere.

Suppose that we have a bunch of people on Earth, and we know that people called Jane are happy; furthermore, everyone else is happy who knows someone that is happy. Suppose now that there is also a society of Martians, who are Persons as well. There are no Janes on Mars, and no Martians know people on Earth.

  pattern happy(x : Person) = { 
    Person.name(x, "Jane");
  } or { 
    Person.knows(x, y);
    find happy(y);  // ill-founded recursion
  }

Since it is possible to have several people that cyclically know each other (in fact, two people are enough that mutually know each other), the recursion in the above query is not well-founded. Initially, though, the results returned will be correct: everyone on Earth is happy, as everyone knows a Jane transitively, while no Martian will be happy. Errors only pop up after incremental maintenance of results. If, by accident, we set the knows reference of a Martian to point to an Earthling, then suddenly all Martians will become happy as well. Later we realize our mistake and delete this reference - but surprisingly, VIATRA Query will still report that Martians are happy, even though the model was returned into its original state!

The key to the issue is that the final result set, where everyone is happy, is not actually contradicted by the query definition (since everyone knows somebody who is happy). It is said that this incorrect result is still a fixpoint, i.e. a solution to the query; however, it is not the least fixpoint, which would be the actually desirable result. In this case, the least fixpoint would be the original, correct result: everyone on Earth is happy, while nobody on Mars is.

Therefore VIATRA Query, in its default mode of operation, can return incorrect results even for positively recursive queries, if the recursion is not well-founded. Fortunately, the error is known not manifest as long as the initial model is unchanged, or there are only additions. However, if there is deletion, movement of elements, or changing attribute or reference values, then it is possible that VIATRA Query will yield a non-minimal fixpoint as result, which is typically not desired.

Fortunately, there is a solution!

Delete and REDerive: conquering the ill-founded case

Since the 1.6 version, VIATRA Queries supports Delete and REDerive evaluation in the query engine. This evaluation strategy makes it possible to correctly compute the results of recursive graph patterns on instance models that contain cycles (i.e. when the recursion is ill-founded). Prior versions of VIATRA Queries supported only scenarios where at least one of the cycles was missing, that is, either the patterns were not recursive or the instance models were acyclic.

As of now, the Delete and REDerive evaluation can be manually enabled using the query evaluation hint ReteHintOptions.deleteRederiveEvaluation.

Optional reading: under the hood

We demonstrate the problems of the old execution mode and the DRED solution by a concrete example.

Suppose that once in a while, people share secrets with each other. For the sake of the example, imagine that if a person is in a "talks to" relationship with another, then that person will also share his/her secret with the other person. The other person will eventually also share the previous person's secrets with others, that is, the sharing of secrets is transitive. In our example, it is also possible that a person revokes a secret, and, by that, the secret will be/should be also forgotten by all people that heard about that secret.

Given these assumptions, let’s model some real people and their secrets. Assume that we have four people Ann (A), Bill (B), Jane (J), and Mike (M), and we have the following talks to relationships: A -> B, B -> J, J -> M, J -> B. The four people also have some secrets, four numbers, these are respectively A - 1, B - 2, J - 3, and M - 4. In this initial setup, Ann does not know any secrets, but the others know everybody's secrets (including Ann's).

We can encode the secret sharing with VIATRA Queries graph patterns as follows:

// Directly known secrets by the given person through the talks to relationship
pattern directSecrets(person : Person, secret : EString) {
	Person(other);
	Person.talksTo(other, person);
	Person.secret(other, secret);
}
 
// Directly or transitively known secrets by the given person
pattern allSecrets(person : Student, secret : EString) {
	find directSecrets(person, secret);
} or {
	Person(other);
	Person.talksTo(other, person);
	find allSecrets(other, secret);
}

We can observe that the allSecrets pattern is recursive, and that the input model has a cycle through the "talks to" relationship. We encourage you to actually model this scenario in VIATRA Queries, and observe what happens if you DELETE the A -> B edge, that is, the scenario when Ann does not want to share her secret anymore. We would expect that the VIATRA Queries evaluation will derive that Ann's secret will be forgotten by the others (as it should be according to our example). However, this is not the case, Ann's secret is still known by everybody else. What has just happened?

In order to better understand what is going on under the hood, we need to introduce the notion of an alternative derivation of a tuple. Lets focus on the [Bill, 1] tuple which represents that Bill knows Ann's secret. Before the deletion of the A -> B edge, this tuple had two alternative derivations. One of them directly came from Ann because she shared her secret with Bill by directly talking to him. Bill then shared this secret with Jane, Jane with Mike, and Mike with Bill again, that is, Bill got to know Ann's secret through another alternative source, specifically, through Mike. Intuitively this means that two people shared Ann's secret with Bill, even though Mike got to know that secret through Bill himself. More formally, one of the derivations of the [Bill, 1] tuple is derived from the path A -> B, while the other is from A -> B -> J -> M -> B. Now, if we delete the A -> B edge, Ann's secret only loses one alternative derivation, but another one still remains because Bill relies on the information what Mike told him, while Mike relies on Jane, and, finally, Jane relies on Bill. What has happened is that the people in the cyclic "talks to" relationship are reinforcing each other in some false information (what is actually not true anymore). Because one alternative derivation remained, Bill is not forgetting Ann's secret, even though, he should (!), any, by that, all the others also keep that secret to themselves.

The Delete and REDerive evaluation mode helps in correctly computing the results in scenarios like this. The difference in the evaluation is as follows. When the A -> B edge is deleted, we decrement the counter of alternative derivations at Bill for Ann's secret from 2 to 1, but instead of concluding that Ann's secret is still known because of the remaining derivation, we kind of put the remaining derivation onto the side and temporarily forget about it. We do that because we want to see if that alternative still holds, and we do not want to falsely reinforce anybody by using that alternative. First, we let all the deletions to purge whatever needs to be purged, and only then start re-deriving information from what has survived the delete phase. What this means is that upon the deletion of the A -> B edge, Bill will say that he also does not know Ann's secret anymore (even though he has put aside the fact that he heard it from Mike). In response to that, Jane will also say that she does not know the secret, and, finally, Mike will also revoke his knowledge about that. The last bit is crucial because that one invalidates Bill's alternative information that was put aside before. The deletion phase has ended, and no tuples remained in the temporary store, which also means that we cannot re-derive anything. The evaluation has correctly derived that nobody knows Ann's secret once she is not talking to Bill anymore.

There are some important things to note:

  • The first one is related to the non-DRED evaluation. The VIATRA Queries engine propagates tuples as long as the results of some pattern(s) change, that is, until a fixpoint is reached. When we concluded that after the deletion of A -> B everybody still knows Ann's secret, the engine has reached a fixpoint, but it was not the LEAST (or minimal) fixpoint. Intuitively, we associated the non-minimal fixpoint to a wrong pattern result.
  • Another important aspect is that Delete and REDerive evaluation is not required if the model is changed only through insertions even if we have both kinds of cycles (patterns + instance models). This is because insertions are just expanding the results of patterns, and the previously explained cyclic reinforcement is not an issue in this case.
  • Note that for the very common special case of transitive closures, the dedicated language element (transitive pattern call) is still likely to be more efficient than the DRED-based recursive solutions.
  • With a small penalty in execution time, DRed guarantees correct result maintenance for arbitrary recursive pattern structures as long as all recursive calls are positive (i.e. no negation or aggregation along cycles of recursion occur).


Summary and suggestions

In summary, VIATRA Query supports positive (stratified) recursion only. Even for positive recursion, correct (minimal fixpoint) results are only guaranteed if either (i) we enable the new DRED mode, at a performance cost, or (ii) the recursion is well-founded (e.g. moves along a containment hierarchy or acyclic graph). Otherwise (in default mode, with ill-founded recursion), the results are OK only if the model is guaranteed to only ever change by monotonously inserting new stuff, never deleting, moving or changing.

Note that in many typical cases, the transitive closure operator (e.g. find knows+(x,y);) is sufficient to expressed the desired query, without having to resort to recursions. Transitive closures are successfully evaluated and incrementally maintained by VIATRA Query even in cases where recursion would be ill-founded and fail (e.g. reachability along relationships that may contain cycles). Even in case the recursion is well-founded, the transitive closure operator may or may not lead to better performance. Therefore our primary recommendation is to use transitive closure instead of recursion if possible.

Functional dependencies

The performance of query evaluation may benefit in various ways from knowing the functional dependencies among pattern variables. We say that variables x1, x2, x3, ... xn determine variable y if there can't be more than one value of y given a combination of values for x1, x2, x3, ... xn. In other words, x1, x2, x3, ... xn together uniquely determine y. Yet another way to put it: if two matches of the pattern agree on the values of variables x1, x2, x3, ... xn, they must also agree on the value of y.

In many cases, the recognition of functional dependencies can drastically improve the performance of the evaluation process. It is therefore important to have the dependencies known in case of performance-critical queries.

Automatic inference

Viatra Query has two ways to determine the functional dependencies of your queries: it does its best to automatically infer such dependencies, and you can also help by manually specifying some dependencies (see below). Automatic inference covers cases such as the source of a many-to-one reference uniquely determining its target; or the result of an eval() expression being determined by the variables used in the expression. Since version 1.5, dependencies among parameters for called patterns are also taken into account, though this kind of inference has its limits.

In particular, there are the following two main cases where Viatra is unable to automatically determine functional dependencies:

  • Domain-specific knowledge: such as relative keys, or any other relationship that is not expressed in the metamodel (ecore). Say that Streets contain Houses that have their integer house numbers; in that case it is automatically known that a House determines the Street is resides in (as the containment reference is one-to-many) as well as its own house number; but it requires domain knowledge to understand that a Street and a house number together uniquely determine a House.
  • Disjunctive patterns: as of version 1.5, there is no automatic inference of functional dependencies among parameters of patterns that have multiple pattern bodies in an or relationship.


Manually specifying dependencies (since v1.5)

The @FunctionalDependency annotation can be used inform the query engine about additional functional dependencies that it would be unable to automatically recognize. The annotation is placed on a pattern, and expresses a functional dependency among pattern parameters. Annotation parameters indicate which query parameters determine which other ones. Note that is is not the evaluation of the annotated pattern, but rather other patterns calling it, that can take advantage of the supplied information.

A single occurrence of the annotation expresses a single dependency rule; it is possible to decorate a single pattern with multiple such annotations. Each parameter listed with forEach is taken to appear on the left-hand-side of the dependency (see x1 etc. in the above definition), and parameters listed with unique are on the right-hand-side (like y in the above definition), so that for each combination of values assigned to the "forEach" variables, the value of each "unique" variable has to be unique. See below for examples:

// Here the first annotation is superfluous, as it is inferred automatically anyway
// The second annotation expresses valuable domain knowledge though
@FunctionalDependency(forEach = house, unique = street, unique = houseNumber)
@FunctionalDependency(forEach = street, forEach = houseNumber, unique = house)
pattern address(house: House, street: Street, houseNumber: java Integer) {
	Street.houses(street, house);
	House.number(house, houseNumber); 
}
 
// Houses are either on a Street or on a Road, but not both at the same time;
//  however Viatra is not smart enough (yet) to figure that out.
// In disjunctive patterns, all dependencies have to be specified manually!
@FunctionalDependency(forEach = house, unique = location)
pattern locatedOnThoroughfare(house: House, location: Thoroughfare) {
	Street.houses(location, house);
} or {
	Road.houses(location, house);
}

Back to the top