EMFIncQuery/UserDocumentation/QueryLanguage

Language concepts

For the query language, we reuse the concept of graph patterns (a key concept in many graph transformation tools) as a concise and easy way to specify complex structural model queries. These graph-based queries can capture interrelated constellations of EMF objects, with the following benefits:

  • the language is expressive and provides powerful features such as negation or counting,
  • graph patterns are composable and reusable,
  • queries can be evaluated with great freedom, i.e. input and output parameters can be selected at run-time,
  • some frequently encountered shortcomings of EMF’s interfaces are addressed:
    • easy and efficient enumeration of all instances of a class regardless of location,
    • simple backwards navigation along all kinds of references (even without eOpposite)
    • finding objects based on attribute value.

The current version of the IncQuery Graph Pattern language (IQPL) owes much of its syntax to the VTCL language of the model transformation framework VIATRA2. If you would like to read more on the foundations of the new language, we kindly point you to our [ICMT 2011] paper (important note: the most up-to-date IncQuery language syntax differs slightly from the examples of the ICMT paper).

References to Ecore metamodels

The IQPL language is statically bound to one or more Ecore metamodels, providing type inference and advanced validation of the implemented queries. Additionally, the tooling (especially the code generator) needs access to the corresponding EMF Generator models as well.

Three different mechanisms are used to match the required EPackages (declared by nsUri) to their definitions (and generator models):

  1. EPackages registered in the EMF EPackage Registry are always available.
  2. The Eclipse plug-ins of the target platform and the ones currently under development might also contribute EPackages. For that, their plugin.xml file should contain an org.eclipse.emf.ecore.generated_package extension.
  3. If neither of the previous mechanisms works, you can put an IncQuery Generator model into your EMF-IncQuery project, and add a mapping from the EPackage nsUri to the URI of the corresponding genmodel.

It is highly recommended to stick with the first two approaches whenever possible, and to rely on IncQuery Generator Models only if you cannot make the EMF model available in one of those ways.

Warning: because of some shortcomings of the EMF generator, the org.eclipse.emf.ecore.generated_package extension of the Ecore model project might contain an incorrect EPackage nsUri (e.g. if the package was renamed), might lack a generator model reference, or might be missing entirely (e.g. if a new EPackage was introduced after the code generator was executed). In such cases, try to manually repair the EMF model projects, as this makes the integration of EMF models into most applications easier.

Short syntax guide

See also the language tutorial and the School example.

File Header

  1. Import declarations are required to indicate which EMF packages are referenced in the query definitions.
  2. Enclose pattern definitions in a package:
    • package my.own.patterns
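
As a minimal sketch, a file header could look like the following. It assumes that an import is written as the import keyword followed by the quoted nsUri of the EPackage; the family metamodel and its nsUri are hypothetical and serve only as an illustration, while the Ecore nsUri is the real one.

```
// Package that encloses all pattern definitions of this file
package my.own.patterns

// Ecore metamodel, needed for built-in datatypes such as EString
import "http://www.eclipse.org/emf/2002/Ecore"

// Hypothetical domain metamodel (assumed for illustration only)
import "http://example.org/family"
```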

Pattern Structure

  1. Introduce a pattern by the pattern keyword, a pattern name, and a list of parameter variables. Then enclose in curly braces a list of constraints that define when the pattern should match.
    • pattern myPattern(a,b,c) = {...pattern constraints...}
  2. Pattern parameters can be suffixed by a type declaration that will be valid in each pattern body. Here is an alternative way to express the type of variable b:
    • pattern myPattern(a,b : MyClass,c) = {...pattern constraints...};
    • In the language, these parameter types are considered the same as type constraints in the pattern body.
  3. Disjunction ("or") can be expressed by linking several pattern bodies with the or keyword:
    • pattern myPattern(a,b,c) = {... pattern constraints ...} or {... pattern constraints ...}
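
The pieces above can be put together as in the following sketch. It assumes a hypothetical EClass Person with a many-valued children reference to Person; neither is part of any real metamodel referenced by this page.

```
// Matches pairs of persons that are in a direct parent-child
// relationship, in either direction.
pattern directRelative(a : Person, b : Person) = {
  // first body: a is a parent of b
  Person.children(a, b);
} or {
  // second body: b is a parent of a
  Person.children(b, a);
}
```

Both parameters are typed in the header, so the type constraint Person applies in both bodies.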

Basic Pattern Constraints

The most basic pattern constraints are type declarations: use EClasses, EReferences and EAttributes. EDataTypes can also be used as type constraints.

  1. An EClass constraint expressing that the variable myEntityVariable must take a value that is an EObject of the class MyClass (from EPackage my.own.ePackage, imported in the file header) looks like this:
    • MyClass(myEntityVariable);
  2. A relation constraint for the EReference MyReference expresses that myEntityVariable is of EClass MyClass and that its MyReference EReference points to theReferencedEntity (or, if MyReference is many-valued, that theReferencedEntity is one of the target objects contained in the EList):
    • MyClass.MyReference(myEntityVariable, theReferencedEntity);
  3. A relation constraint for an EAttribute, asserting that theAttributeVariable is the String/Integer/etc. object that is the MyAttribute value of myEntityVariable, looks exactly the same as the EReference constraint:
    • MyClass.MyAttribute(myEntityVariable, theAttributeVariable);
  4. Such reference navigations can be chained; the last step may either be a reference or attribute traversal:
    • MyClass.MyReference.ReferenceFromThere.AnotherReference.MyAttribute(myEntityVariable, myString);
  5. (You will probably not need this, but EDatatype type constraints can be applied on attribute values, with a syntax similar to that used for EObjects, and with the additional semantics that the attribute value must come from the model, not just any int/String/etc. computed e.g. by counting):
    • MyDatatype(myAttributeVariable);
    • or for the built-in datatypes (import the Ecore metamodel):
    • EString(myAttributeVariable);
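
A small sketch of the basic constraints in context, again assuming a hypothetical EClass Person with an EString attribute name and a many-valued reference children pointing to Person:

```
// Persons that have at least one child, together with their name
pattern namedParent(parent, parentName) = {
  Person(parent);                     // EClass constraint
  Person.children(parent, _child);    // EReference constraint
  Person.name(parent, parentName);    // EAttribute constraint
}

// Names of grandchildren, using a chained navigation whose
// last step is an attribute traversal
pattern grandchildName(grandparent : Person, childName) = {
  Person.children.children.name(grandparent, childName);
}
```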

Advanced Pattern Constraints

  1. By default, each variable you define may be equal to each other variable in a query. This is especially important to know when using attributes or multiple variables with the same type (or supertype).
    1. For example, if you have two variables declared as MyClass(someObj1), MyClass(someObj2), then someObj1 and someObj2 may match the same EObject.
    2. If you want to declare that two variables mustn't be equal, you can write:
      • someObj1 != someObj2;
    3. If you want to declare that two variables must be the same, you can write:
      • someObj1 == someObj2;
  2. Pattern composition: you can reuse a previously defined pattern by calling it in a different pattern's body:
    • find otherPattern(oneParameter, otherParameter, thirdParameter);
  3. You can express negation with the neg keyword. Those actual parameters of the negative pattern call that are not used elsewhere in the calling body will be quantified; this means that the calling pattern only matches if no substitution of these variables can be found that satisfies the called pattern. For example, the constraint below asserts that for the given value of the (elsewhere defined) variable myEntityVariable, the pattern neighborPattern does not match for any value of otherParameter (which is not mentioned elsewhere).
    • neg find neighborPattern(myEntityVariable, otherParameter);
  4. In the above constraints, wherever you could use an (attribute) variable in a pattern body, you can also use:
    1. Constant literals of primitive types, such as 10, or "Hello World".
    2. Constant literals of enumeration types, such as MyEEnum::MY_LITERAL
    3. Aggregation of multiple matches of a called pattern into a single value. Currently match counting is supported, in a syntax analogous to negative pattern calls:
      • HowManyNeighbors == count find neighborPattern(myEntityVariable, _);
    4. Attribute expression evaluation: the eval() construct lets you compute values by Java (actually Xbase) expressions using the value of attribute variables:
      • qualifiedName == eval(parentName + "." + simpleName);
      • The Java types of variables are inferred based on the EMF Ecore specification (using the generated Java classes).
  5. Additional attribute constraints using the check() construct, similarly to eval():
    • check(aNumberVariable > aStringVariable.length());
    • Semantically equivalent to true == eval(aNumberVariable > aStringVariable.length());
    • The Java types of variables are inferred based on the EMF Ecore specification (using the generated Java classes).
  6. One can also use the transitive closure of binary patterns in a pattern call, such as the transitive closure of pattern friend:
    • find friend+(myGuy, friendOfAFriendOfAFriend);
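
The following sketch shows each of the advanced constraints in a small pattern of its own. The Person class, its children reference, and its name (EString) and age (EInt) attributes are hypothetical and only assumed for illustration.

```
pattern hasChild(parent : Person, child : Person) = {
  Person.children(parent, child);
}

// Inequality: two different children of the same parent
pattern sibling(a : Person, b : Person) = {
  find hasChild(parent, a);
  find hasChild(parent, b);
  a != b;
}

// Negation: persons for whom hasChild has no match at all
pattern childless(person : Person) = {
  neg find hasChild(person, _);
}

// Aggregation: number of children of each person
pattern childCount(person : Person, n) = {
  n == count find hasChild(person, _);
}

// check() and eval() on attribute values only
pattern adultLabel(person : Person, label) = {
  Person.name(person, name);
  Person.age(person, age);
  check(age >= 18);
  label == eval(name.toUpperCase());
}

// Transitive closure of the binary pattern hasChild
pattern descendant(ancestor : Person, desc : Person) = {
  find hasChild+(ancestor, desc);
}
```

Without the a != b constraint, sibling would also match every child paired with itself, since distinct variables may take the same value by default.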

Good to know about pattern variables

Pattern variables, regardless of whether they represent EObjects or attribute/computed values, adhere to the following rules.

  • Pattern parameters are the variables declared in the pattern header, which will be visible for the "outside world" (i.e. other patterns via pattern composition, or client code / GUI using EMF-IncQuery to find the matches of the pattern). This imposes the following rules of usage on them:
    1. Each parameter must occur in each body of the pattern, since the query engine must be able to come up with a value for them, which is only possible if you say something about them :)
    2. Moreover, some constraints are not sufficient on their own to come up with possible values for the parameter variable; these include aggregation (counting), check() and eval(), inequality and negative pattern calls. Thus, to restate the rule: there must be a positive usage of each pattern parameter in each pattern body, where positive usage may mean node and edge constraints, positive pattern calls, as well as being equated with other variables (that have a positive usage), literal values or the result of a count or eval().
    3. As shown before, type declarations can be directly suffixed to pattern parameters in the pattern header. These count as a positive variable usage in every single body.
  • The rest of the variables are local variables of a pattern body. Multiple bodies may have local variables with the same name. Local variables also have some rules of usage:
    1. Most of the time a pattern body will use a variable at least twice, since the role of local variables is to connect two pattern constraints. For example, if the query selects items whose status is UNFINISHED, then the status variable is used twice (in a constraint making sure that it belongs to the item, and in another constraint comparing it against UNFINISHED); while the item variable is the single pattern parameter.
      • pattern unfinishedItem(item) = { Item.status(item, status); status == Status::UNFINISHED; };
      • Since this is the case, the editor will underline any local variables that occur only once, assuming that you have simply mistyped their name or omitted a constraint. This handy feature may save a lot of debugging!
    2. As an exception, there are rare cases where you deliberately want a variable to be used only once. For example, you might look for a parent, i.e. a Person with at least one child, regardless of any properties of this child. In this case, you have to prefix the name of the single-use variable with the underscore character '_' to indicate that it will not be used anywhere else, and this is intentional.
      • pattern parent(person) = { find hasChild(person, _anyChild); };
    3. There is a more concise form of the latter: if you only use a variable once, it is OK not to name it; just use a single underscore instead of the variable reference. In fact, each occurrence of this anonymous variable will be treated as a separate, single-use variable that is distinguished from any other anonymous variable. (This should look self-evident to those who are familiar with Prolog.)
      • pattern parent(person) = { find hasChild(person, _); };
    4. In case of negative pattern calls and aggregation, anonymous and single-use variables are quantified, since they would make no sense otherwise. See the following examples:
      • neg find hasChild(_, _); means that currently there are no parent-child relationships in the model at all.
      • neg find hasChild(person, _); means that this specific person has no children at all; the person variable must be used elsewhere by other (positive) pattern constraints.
      • neg find hasChild(person, child); means that this specific person is not the parent of this specific child; both variables must be used elsewhere by other (positive) pattern constraints.
      • count find hasChild(_, _) is the number of parent-child relationships in the model.
      • count find hasChild(person, _) is the number of children of this specific person; the person variable must be used elsewhere by other (positive) pattern constraints.
      • count find hasChild(person, child) is not very useful: it evaluates to 1 if this specific person is the parent of this specific child, 0 otherwise; both variables must be used elsewhere by other (positive) pattern constraints.
  • The pattern will have a match if its variables can be substituted with values so that all constraints are satisfied. However, two matches are considered different only if they differ in parameter variables. So, more precisely, a match of the pattern is a value substitution for the pattern parameters such that there is at least one way to substitute values for the local variables of at least one of the pattern bodies so that the parameter and local variables together satisfy all constraints of that pattern body (plus the type declarations suffixed directly to the parameter declarations).
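
To illustrate the rules above, here is a sketch over a hypothetical Person class with many-valued children and friends references (all assumed for illustration).

```
pattern hasChild(parent : Person, child : Person) = {
  Person.children(parent, child);
}

pattern friend(a : Person, b : Person) = {
  Person.friends(a, b);
}

// One match per person with at least one child, no matter how
// many children there are: the child is not a parameter.
pattern parent(person) = {
  find hasChild(person, _);
}

// Not acceptable: `other` occurs only in an inequality and in a
// negative pattern call, so it has no positive usage.
// pattern stranger(person : Person, other) = {
//   person != other;
//   neg find friend(person, other);
// }

// Acceptable: the type suffixed to `other` in the header counts
// as a positive usage in every body.
pattern stranger(person : Person, other : Person) = {
  person != other;
  neg find friend(person, other);
}
```

For instance, in a model where Alice has two children, Bob and Carol, hasChild has two matches, (Alice, Bob) and (Alice, Carol), while parent has a single match, (Alice), because the two substitutions differ only in a local variable.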

Limitations (as of IncQuery 0.8)

  • Meta-level queries (instanceOf etc.) will not currently work (although Ecore models can be processed as any other model).
  • Make sure that the result of the check()/eval() expressions can change ONLY IF one of the variables defined in the query changes.
    • In practice, a good rule of thumb is to only use attribute variables or other scalar values in a check()/eval() expression, no EObjects. This is currently enforced.
    • In particular, do not call non-constant methods of EObjects in a check(). Use attribute values instead, if necessary converted to the native type using SomeInt and co, so as to help the type inference.
      • For example, you CAN use check(name.contains(someString)).
      • But you MUSTN'T use check(someObject.name.contains(someString)), as the name of someObject can change without the Java reference someObject changing!
      • For these reasons, we have implemented a validator that only allows referring to EDataTypes in attribute expressions.
  • Use only well-behaving derived references or attributes. Better yet, reimplement the derived feature using queries. Regular derived features are not supported in patterns (except the ones in the Ecore metamodel, which are well-behaving by default) as they can have arbitrary Java implementations and EMF-IncQuery is unable to predict when their value will change.
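
As a sketch of how to stay within these limitations, the patterns below bind attribute values to variables before using them in check(), and express a would-be derived reference as a query. The Person class with its name attribute and children reference is hypothetical.

```
// The attribute value is bound to the variable `name` first,
// so check() only sees a scalar value, never an EObject.
pattern shortName(person : Person) = {
  Person.name(person, name);
  check(name.length() < 4);
}

// Instead of a derived "grandchildren" reference implemented in
// Java, the same relation is defined as a pattern.
pattern grandchildOf(person : Person, gc : Person) = {
  Person.children.children(person, gc);
}
```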

See advanced issues for additional topics, such as attribute handling.