Skip to main content

Notice: this Wiki will be going read only early in 2024 and edits will no longer be possible. Please see: https://gitlab.eclipse.org/eclipsefdn/helpdesk/-/wikis/Wiki-shutdown-plan for the plan.

Jump to: navigation, search

Difference between revisions of "VIATRA/Query/UserDocumentation/AdvancedPatterns"

< VIATRA‎ | Query
(Recursive queries in EMF-IncQuery)
(Recursive queries in EMF-IncQuery)
Line 69: Line 69:
 
This is an example of correct usage of recursion in EMF-IncQuery.  
 
This is an example of correct usage of recursion in EMF-IncQuery.  
  
Take a minute to observe how recursion is achieved here. The pattern <code>qualifiedName</code> recursively calls itself in one of its bodies. This means that the result of this query depends on itself, which is seemingly problematic - however, if we look carefully, we discover that on the level of individual pattern matches (i.e. tuples of nodes and their qualified name), there are no dependency cycles. To elaborate, the match \code{(node, name)} does not recursively depend on whether \code{(node, name)} is a match; it only depends on whether \code{(parent, name)} is match; which, in turn, will depend on the parent of the parent node, etc.
+
Take a moment to observe how recursion works here. The pattern <code>qualifiedName</code> recursively calls itself in one of its bodies. This means that the result of this query depends on itself, which is seemingly problematic - however, if we look carefully, we discover that on the level of individual pattern matches (i.e. tuples of nodes and their qualified name), there are no dependency cycles. To elaborate, the match \code{(node, name)} does not recursively depend on whether \code{(node, name)} is a match; it only depends on whether \code{(parent, name)} is match; which, in turn, will depend on the parent of the parent node, etc. As this dependency relationship follows the <code>parent</code> relationship, which represents a containment tree, there can be no dependency cycles.
  
This is actually a pretty good observation.
+
In general, EMF-IncQuery returns correct results for positively recursive queries that are '''well-founded''', i.e. individual matches never support each other cyclically. This is typically found to be the case if the recursion traverses along a containment tree (in either direction), or any graph structure that is known to be a DAG (directed acyclic graph).
  
 
=== Behaviour in the ill-founded case ===
 
=== Behaviour in the ill-founded case ===
 +
 +
As an aside, one can draw parallels with imperative programs, where the well-founded property of a recursive subroutine would warrant that the recursion terminates. If a recursive program is not well-founded, the subroutine may not terminate. EMF-IncQuery, however, is guaranteed to terminate even for recursive queries that are not well-founded; the problem lies elsewhere.
 +
 +
Suppose that we have a bunch of people on Earth, and we know that people called ''Jane'' are happy; furthermore, everyone else is happy who knows someone that is happy. Suppose now that there is also a society of Martians, who are Persons as well. There are no Janes on Mars, and no Martians know people on Earth.
 +
<code>
 +
  pattern happy(x : Person) = {
 +
    Person.name(x, "Jane");
 +
  } or {
 +
    Person.knows(x, y);
 +
    find happy(y);  // ill-founded recursion
 +
  }
 +
</code>
 +
Since it is possible to have several people that cyclically know each other (in fact, two people are enough that mutually know each other), the recursion in the above query is not well-founded. Initially, though, the results returned will be correct: everyone on Earth is happy, as everyone knows a Jane transitively, while no Martian will be happy. Errors only pop up after incremental maintenance of results. If, by accident, we set the <code>knows</code> reference of a Martian to point to an Earthling, then suddenly all Martians will become happy as well. Later we realize our mistake and delete this reference - but surprisingly, EMF-IncQuery will still report that Martians are happy, even though the model was returned into its original state!
 +
 +
The key to the issue is that the final result set, where everyone is happy, is not actually contradicted by the query definition (since everyone knows somebody who is happy). It is said that this incorrect result is still a fixpoint, i.e. a solution to the query; however, it is not the '''least fixpoint''', which would be the actually desirable result. In this case, the least fixpoint would be the original, correct result: everyone on Earth is happy, while nobody on Mars is.
 +
 +
Therefore EMF-IncQuery can return incorrect results even for positively recursive queries, if the recursion is not well-founded. Fortunately, the error is known not manifest as long as the initial model is unchanged, or there are only additions. However, if there is deletion, movement of elements, or changing attribute or reference values, then it is possible that EMF-IncQuery will yield a non-minimal fixpoint as result, which is typically not desired.
  
 
=== Summary and suggestions ===
 
=== Summary and suggestions ===
 
In summary, EMF-IncQuery supports positive (stratified) recursion only. Even for positive recursion, correct (minimal fixpoint) results are only guaranteed if the recursion is well-founded (e.g. moves along a containment hierarchy or acyclic graph); otherwise, the results are OK only if the model is guaranteed to only ever change by monotonously inserting new stuff, never deleting, moving or changing.  
 
In summary, EMF-IncQuery supports positive (stratified) recursion only. Even for positive recursion, correct (minimal fixpoint) results are only guaranteed if the recursion is well-founded (e.g. moves along a containment hierarchy or acyclic graph); otherwise, the results are OK only if the model is guaranteed to only ever change by monotonously inserting new stuff, never deleting, moving or changing.  
  
Note that in many typical cases, the [[EMFIncQuery/UserDocumentation/QueryLanguage | transitive closure operator]] is sufficient to expressed the desired query, without having to resort to recursions. Transitive closures are successfully evaluated and incrementally maintained by EMF-IncQuery even in cases where recursion would be ill-founded and fail (e.g. reachability along relationships that may contain cycles). Even in case the recursion is well-founded, the transitive closure operator may or may not lead to better performance. Therefore our primary recommendation is to '''use transitive closure instead of recursion if possible'''.
+
Note that in many typical cases, the [[EMFIncQuery/UserDocumentation/QueryLanguage | transitive closure operator]] (e.g. <code>find knows+(x,y);</code>) is sufficient to expressed the desired query, without having to resort to recursions. Transitive closures are successfully evaluated and incrementally maintained by EMF-IncQuery even in cases where recursion would be ill-founded and fail (e.g. reachability along relationships that may contain cycles). Even in case the recursion is well-founded, the transitive closure operator may or may not lead to better performance. Therefore our primary recommendation is to '''use transitive closure instead of recursion if possible'''.

Revision as of 14:21, 29 June 2015

Advanced query language topics in EMF-IncQuery

Working with EMaps in EMF-IncQuery

The eclipse.org EMF wiki gives a proper FAQ about the various modeling related issues, including the usage of EMap-s in your metamodel. With EMF-IncQuery you can even write your own queries to extract the key-value pairs from your instance model.

EMaps in your metamodel

  1. Creating the actual EMap type Create an EClass with the name [Type1]To[Type2]Map where [Type1] represents the key's type and the [Type2] represents the value's type.
  2. Set the Instance Type Name property of the newly created EClass to java.util.Map$Entry.
  3. Create an EAttribute or EReference named "key" and set the EDataType or EClass for it.
  4. Create an EAttribute or EReference called "value" and set the EDataType or EClass for it.

For example for an <EString, EString> EMap you would have an EClass named EStringToEStringMap if you follow the mentioned scheme.

To actually use this newly created type follow these steps:

  1. Create an EReference with its EClass set to be the map-entry class you created above.
  2. Set the Containment property of your EReference to be true.
  3. Set the upper-bound of your EReference to be -1 (unbounded).

For example an <EString, EString> map would result an EClass named EStringToEStringMap, if you follow the mentioned scheme.

EMaps in your instance model

The contents of the EMap instances can be modified like in every other instance model. One EStringToEstringMap instance will be used an a map entry (key-value pair).

Querying EMaps from EMF-IncQuery patterns

Here is an example query to extract the key-value pairs from an EMap

pattern emapPattern(K : EString, V : EString) {
   EMapTestElement(M);
   EMapTestElement.map(M, Map);
   EStringToEStringMap.key(Map, K);
   EStringToEStringMap.value(Map, V);
}

Parts of this overview are based on the http://wiki.eclipse.org/index.php/EMF-FAQ#How_do_i_create_a_Map_in_EMF.3F page.


Recursive queries in EMF-IncQuery

As explained on the query language introduction page, EMF-IncQuery supports pattern composition via the find keyword. Does it support recursive composition, i.e. a pattern calling itself, or multiple patterns cyclically referencing each other? Yes, it does, albeit with limits. The situation is complicated, as described below; see #Summary and suggestions for an executive summary.

Motivating positive recursion

First of all, there are cases where recursion is plain nonsense, such as this query:

pattern meaningless(x) = { neg find meaningless(x); } 

For every choice of value of the variable x, it is a match of pattern meaningless if and only if it is not a match of the same pattern. It is easy to see that this is a contradiction - do not expect EMF-IncQuery to be useful for evaluating such queries.

To avoid such contradictions, EMF-IncQuery supports positive recursion only, i.e. patterns referencing themselves or each other cyclically, solely by positive find pattern calls, never by negation (neg find) or aggregation (count find). (In mathematics, this property is called stratification.) Positively recursive queries are always meaningful - unfortunately, they still will not work in all cases, as explained below.

From this point onward, the discussion will be restricted to stratified / positive recursion.

Well-founded recursion

Suppose that we have elements of type Node forming a containment hierarchy of parents and children, and we want to assign them qualified names composed from their simple names and and the name of their parent. Let's see the following recursive pattern:

 pattern qualifiedName(node : Node, name) = { 
   // for a child element, compose from parent's qualified name
   find parent(node, parentNode);
   Node.simpleName(node, simpleName);
   find qualifiedName(parent, partentName); // recursive call
   name == eval (partentName + "." + simpleName);
 } or { 
   // for a root element, just use the simple name
   neg find parent(node, _anyParent);  // has no parents
   Node.simpleName(node, name);
 } 

This is an example of correct usage of recursion in EMF-IncQuery.

Take a moment to observe how recursion works here. The pattern qualifiedName recursively calls itself in one of its bodies. This means that the result of this query depends on itself, which is seemingly problematic - however, if we look carefully, we discover that on the level of individual pattern matches (i.e. tuples of nodes and their qualified name), there are no dependency cycles. To elaborate, the match \code{(node, name)} does not recursively depend on whether \code{(node, name)} is a match; it only depends on whether \code{(parent, name)} is match; which, in turn, will depend on the parent of the parent node, etc. As this dependency relationship follows the parent relationship, which represents a containment tree, there can be no dependency cycles.

In general, EMF-IncQuery returns correct results for positively recursive queries that are well-founded, i.e. individual matches never support each other cyclically. This is typically found to be the case if the recursion traverses along a containment tree (in either direction), or any graph structure that is known to be a DAG (directed acyclic graph).

Behaviour in the ill-founded case

As an aside, one can draw parallels with imperative programs, where the well-founded property of a recursive subroutine would warrant that the recursion terminates. If a recursive program is not well-founded, the subroutine may not terminate. EMF-IncQuery, however, is guaranteed to terminate even for recursive queries that are not well-founded; the problem lies elsewhere.

Suppose that we have a bunch of people on Earth, and we know that people called Jane are happy; furthermore, everyone else is happy who knows someone that is happy. Suppose now that there is also a society of Martians, who are Persons as well. There are no Janes on Mars, and no Martians know people on Earth.

 pattern happy(x : Person) = { 
   Person.name(x, "Jane");
 } or { 
   Person.knows(x, y);
   find happy(y);  // ill-founded recursion
 } 

Since it is possible to have several people that cyclically know each other (in fact, two people are enough that mutually know each other), the recursion in the above query is not well-founded. Initially, though, the results returned will be correct: everyone on Earth is happy, as everyone knows a Jane transitively, while no Martian will be happy. Errors only pop up after incremental maintenance of results. If, by accident, we set the knows reference of a Martian to point to an Earthling, then suddenly all Martians will become happy as well. Later we realize our mistake and delete this reference - but surprisingly, EMF-IncQuery will still report that Martians are happy, even though the model was returned into its original state!

The key to the issue is that the final result set, where everyone is happy, is not actually contradicted by the query definition (since everyone knows somebody who is happy). It is said that this incorrect result is still a fixpoint, i.e. a solution to the query; however, it is not the least fixpoint, which would be the actually desirable result. In this case, the least fixpoint would be the original, correct result: everyone on Earth is happy, while nobody on Mars is.

Therefore EMF-IncQuery can return incorrect results even for positively recursive queries, if the recursion is not well-founded. Fortunately, the error is known not manifest as long as the initial model is unchanged, or there are only additions. However, if there is deletion, movement of elements, or changing attribute or reference values, then it is possible that EMF-IncQuery will yield a non-minimal fixpoint as result, which is typically not desired.

Summary and suggestions

In summary, EMF-IncQuery supports positive (stratified) recursion only. Even for positive recursion, correct (minimal fixpoint) results are only guaranteed if the recursion is well-founded (e.g. moves along a containment hierarchy or acyclic graph); otherwise, the results are OK only if the model is guaranteed to only ever change by monotonously inserting new stuff, never deleting, moving or changing.

Note that in many typical cases, the transitive closure operator (e.g. find knows+(x,y);) is sufficient to expressed the desired query, without having to resort to recursions. Transitive closures are successfully evaluated and incrementally maintained by EMF-IncQuery even in cases where recursion would be ill-founded and fail (e.g. reachability along relationships that may contain cycles). Even in case the recursion is well-founded, the transitive closure operator may or may not lead to better performance. Therefore our primary recommendation is to use transitive closure instead of recursion if possible.

Back to the top