Skip to main content
Jump to: navigation, search

Difference between revisions of "Equinox/p2/Query Language for p2"

< Equinox‎ | p2
(IQuery examples)
 
(115 intermediate revisions by 2 users not shown)
Line 8: Line 8:
 
There is no uniform way to translate these queries into an expression that can be understood by a database or by something running in another process.
 
There is no uniform way to translate these queries into an expression that can be understood by a database or by something running in another process.
  
Some attempts has been made to rectify this and a repository implementation could potentially make some intelligent decisions based on information derived from looking at the query (what class it is, and in some cases, what properties that are exposed). It is an improvement but the implementation still needs to be very specialized and the general problem remains; A large amount of the queries are still more or less completely opaque and it's enough with one single such query to force a full selection of everything from the database/remote client.
+
Some attempts have been made to rectify this and a repository implementation could potentially make some intelligent decisions based on information derived from looking at the query (what class it is, and in some cases, what properties that are exposed). While this is an improvement, the general problem remains; A large amount of the queries are still more or less completely black-boxed and it's enough with one single such query to force a full selection of everything from the database/remote server.
  
The p2ql expression language discussed here is an attempt to rectify this problem. It can be used in two ways:
+
The p2QL expression language discussed here is an attempt to rectify this problem. It can be used in two ways:
 
* Coexisting with current IQuery / IQueryable
 
* Coexisting with current IQuery / IQueryable
*: Using the ExpressionQuery (a ContextQuery subclass), it can coexist with todays IQuery/IQueryable. The ExpressionQuery is created based on an expression and a list of parameters. The expression can be in two forms. Its either an Expression instance or a String which then gets parsed into an Expression. See section [[#IQuery examples]].
+
*: Using the ExpressionContextQuery (implements IQuery), or the ExpressionQuery (implements IMatchQuery), it can coexist with todays IQuery/IQueryable. The queries are created based on an expression and an array of parameters. See section [[#IQuery examples]].
 
* With more explicit separation of concerns
 
* With more explicit separation of concerns
*: It can also be used in an alternative solution where an IQueryable simply extracts a result based on a predicate expression, and a ICollect instance that deals with further processing of that result (involving collectors, visitors, and the like). More about this in section [[#An alternative way of doing things]]
+
*: It can also be used in an alternative solution where an expression returns an Iterator. This usage scenario is particularly useful when implementing remote repositories or implementations on top of a database. It is also a very efficient way of querying for things that you don't want to collect in memory.
  
 
====Bugzilla====
 
====Bugzilla====
 
We discussed the use of a query language at length on the p2 meeting on November 9 and November 16. The issue is captured in bugzilla [https://bugs.eclipse.org/bugs/show_bug.cgi?id=294691 Create a QueryLanguage for p2].
 
We discussed the use of a query language at length on the p2 meeting on November 9 and November 16. The issue is captured in bugzilla [https://bugs.eclipse.org/bugs/show_bug.cgi?id=294691 Create a QueryLanguage for p2].
  
===Considerations===
+
===Language Design===
When designing the language I tried to consider two things. Who is the primary user and which are the most common queries?
+
  
====Java style operators====
+
TODO: Add some text here to explain where the ideas stem from (xtend, xtext, Scala, Java) and other motivation behind the choice of syntax.
The primary user is typically well aqquinted with Java so I opted for using java operators such as '&&' and '||' rather then SQL'ish keywords like OR and AND. I also use '.' for member access and '[' ']' to get indexed access. The latter can also be used to get keyed access.
+
  
====Matches operator====
+
====There are two major ingredients in p2ql:====
A large amount of queries involve versions, version ranges, and capability matching. So managing that is important. Another thing that is important is to be able to support filtering based on ID's. All of this is now also built in to the language through the matches operator '~='. It can be though of as the 'satisfies' method on an installable unit or provided capability, as 'is included in' when used with a version and version range, as a plain matches when used with a string and a pattern, or as 'instanceof' when comparing with a class.
+
  
The following things can be matched using the operator.
+
* The language itself
 +
* How the language integrates with and interacts with Java objects
  
 +
=====The language itself=====
 +
 +
A query language at its heart is simply an expression evaluation language for two major kinds of expressions:
 +
 +
* Iterating over a collection and subsetting it in some way
 +
* Supplying a boolean expression that is used as a callback when some other object iterates over a collection and computes a subset.
 +
 +
That, then is what p2ql is: a simple language for describing subsets of collections.
 +
 +
=====How the language integrates with Java objects=====
 +
 +
Since p2ql must execute inside a Java runtime, it must interact with Java objects. In the context of P2, there are two kinds of Java objects that a query language must interact with:
 +
 +
* A P2 IQueryable
 +
* Arbitrary Java objects that are passed in as parameters to the query
 +
 +
====Implementation====
 +
 +
p2ql itself is a small dynamically-typed language in the Smalltalk tradition. It has receivers (objects) and messages (methods), and whether an object can receive a message is determined dynamically at runtime.  Even so, because it automatically takes advantage of various opportunities for concurrency, it is very fast.  It is also a pure functional language, which means that all variables are final. As was noted before, it is an expression language--which means that while you can use predefined objects and functions, you cannot define your own objects nor your own named functions.
 +
 +
However, you can define anonymous functions (lambdas, closures, anonymous runnables) in the same way that a SQL "where" clause is really an anonymous function or a nested SQL query is also an anonymous function.
 +
 +
====Java bindings====
 +
 +
When you define a p2ql query, you are really dynamically defining and running a new method on a P2 IQueryable object, usually a metadata repository.
 +
 +
<pre>IQuery&lt;IInstallableUnit&gt; q = QueryUtil.createMatchQuery("this.id == $0", id);
 +
metadataRepository.query(q);</pre>
 +
 +
=====Types of queries=====
 +
 +
There are two main kinds of queries.
 +
 +
* Queries that define a boolean expression to evaluate against every element in the IQueryable. This is similar to just defining a "where" clause in SQL.
 +
 +
<pre>IQuery&lt;IInstallableUnit&gt; query = QueryUtil.createMatchQuery("this.id == $0", id);</pre>
 +
 +
* Queries that subset the IQueryable's "collection contents" and return a new collection
 +
 +
<pre>IQuery&lt;IInstallableUnit&gt; query = QueryUtil.createQuery("everything.select(x | x.id == $0)", id);</pre>
 +
 +
=====Variable binding=====
 +
 +
Within these types of queries, variables are bound in the following ways:
 +
 +
* In a match query, "this" refers to the "current row object" in the IQueryable.
 +
 +
* In a generic query, "everything" refers to the IQueryable collection itself.
 +
 +
* Dot notation follows the following rules:
 +
** If the receiver of the message is an Iterable or an IQueryable, the message must be a function that iterates across the collection's elements.  (These predefined functions are described below.)
 +
** If the receiver of the message is a POJO, then the property name denoted by the message is converted into a "getRegurlarProperty" or "isBooleanProperty" Java call and the property value is returned.
 +
 +
So, in the examples above, this.id is the same as writing "rowObject.getId()" in Java.
 +
 +
If you prefer, "this" or "everything" can be left out. In other words, unqualified property names or function calls always refer to the "this" or the "everything" object. So, the following queries are equivalent to the first examples.
 +
 +
<pre>IQuery&lt;IInstallableUnit&gt; query = QueryUtil.createMatchQuery("id == $0", id);</pre>
 +
 +
<pre>IQuery&lt;IInstallableUnit&gt; query = QueryUtil.createQuery("select(x | x.id == $0)", id);</pre>
 +
 +
The $0, $1, ..., $n variables simply refer to the Java parameter objects that were passed into the query. The same dot operator rules apply to these variables as apply to the "everything" and "this" variables: If the variable is an Iterable, you have to call a function on it that processes its contents. Otherwise, you can treat it like a Java Bean and the dot operator accesses its properties.
 +
 +
With that introduction, let's take a look at this material again, but expressed more formally.
 +
 +
====Boolean Queries====
 +
Queries can be written as simple predicates such as "id == $0". When executed, the predicate will be evaluated once for each row in the queried material. This form is used for the IMatchQuery implementation. The current value is available in a predefined variable named ''this''. Like in Java, ''this'' is normally implicit so that:
 +
<pre>id == $0</pre>
 +
is actually the same as:
 +
<pre>this.id == $0</pre>
 +
 +
====Collection Queries====
 +
A query can also be declared as something that acts on "everything". This is particularly useful when things like search for latest or doing requirements traversal. These queries make use of a predefined variable named ''everything''. When accessed, that variable returns an iterator over all queried material. As with ''this'' in a predicate query, ''everything'' is something that you don't normally need to specify, i.e.:
 +
<pre>select(x | x.id == $0)</pre>
 +
is the same as:
 +
<pre>everything.select(x | x.id == $0)</pre>
 +
A boolean query can always be written as a collection query. These two queries are fully equivalent:
 +
<pre>Boolean: id == $0
 +
Collection:  select(x | x.id == $0)</pre>
 +
 +
====Special operators====
 +
<ul><li><b>Matches operator '~='</b><br/>
 +
<p>A large amount of queries involve versions, version ranges, and capability matching. So managing that is important. Another thing that is important is to be able to support filtering based on ID's. All of this is now also built in to the language through the matches operator '''~='''. It can be though of as the 'satisfies' method on an installable unit or provided capability, as 'is included in' when used with a version and version range, as a plain matches when used with a string and a pattern, or as 'instanceof' when comparing with a class.</p>
 +
<p>The following things can be matched using the operator</p>
 
{| border="1" cellpadding="3"
 
{| border="1" cellpadding="3"
 
!LHS
 
!LHS
Line 35: Line 117:
 
!Implemented as
 
!Implemented as
 
|-
 
|-
|IProvidedCapability
+
|IInstallableUnit
|IRequiredCapability
+
|IRequirement
 
|lhs.satisfies(rhs)
 
|lhs.satisfies(rhs)
 
|-
 
|-
 
|IInstallableUnit
 
|IInstallableUnit
|IRequiredCapability
+
|Filter
|lhs.satisfies(rhs)
+
|rhs.matches(lhs.properties)
 +
|-
 +
|IInstallableUnit
 +
|IUpdateDescriptor
 +
|rhs.isUpdateOf(lhs)
 
|-
 
|-
 
|Version
 
|Version
Line 47: Line 133:
 
|rhs.isIncluded(lhs)
 
|rhs.isIncluded(lhs)
 
|-
 
|-
|IInstallableUnit
+
|Map
 
|Filter
 
|Filter
|rhs.matches(lhs.properties)
+
|rhs.match(lhs)
 
|-
 
|-
|Map
+
|Dictionary
 
|Filter
 
|Filter
 
|rhs.match(lhs)
 
|rhs.match(lhs)
 
|-
 
|-
 
|String
 
|String
|Pattern
+
|SimplePattern
|rhs.matcher(lhs).matches()
+
|rhs.isMatch(lhs)
 +
|-
 +
|String
 +
|VersionRange
 +
|rhs.isIncluded(version(lhs))
 
|-
 
|-
 
|<any>
 
|<any>
Line 66: Line 156:
 
|Class
 
|Class
 
|rhs.isAssignableFrom(lhs)
 
|rhs.isAssignableFrom(lhs)
 +
|}</li>
 +
<li><b>And operator '&&'</b><br/>
 +
This operator checks if the first operand evaluates to a boolean. If it is, the it is assumed that all other operands also are booleans and the full evaluation is ''true'' if all its operands evaluate to ''true''. If the first result was not a boolean, then it is assumed that it is a collection and that all other operands also evaluates collections. The operator will then function as a '''intersect''' operator and the result consists those elements that could be found in all collections.
 +
</li>
 +
<li><b>Or operator '||'</b><br/>
 +
This operator checks if the first operand evaluates to a boolean. If it is, the it is assumed that all other operands also are booleans and the full evaluation is ''false'' if none of its operands evaluate to ''true''. If the first result was not a boolean, then it is assumed that it is a collection and that all other operands also evaluates collections. The operator will then function as a '''union''' operator and the result is the unique sum of all elements from all collections.
 +
</li>
 +
</ul>
 +
 +
====Functions====
 +
A small number of functions are available to assist with some common problems. Example:
 +
An IRequiredCapability.getFilter() returns a String. I only want the capabilities that has a filter that match certain properties. Consequently, I need an instance of an OSGi Filter. Not a string. I can do this using the ''filter'' function. Like this:
 +
<pre>requirements.exists(rc | rc.filter == null || $1 ~= filter(rc.filter))</pre>
 +
The currently available constructors are:
 +
{| border="1" cellpadding="3"
 +
!name
 +
!arguments
 +
!creates
 +
|-
 +
|filter
 +
|string
 +
|an instance of org.osgi.framework.Filter
 +
|-
 +
|version
 +
|string
 +
|a p2 Version
 +
|-
 +
|range
 +
|string
 +
|a p2 VersionRange
 +
|-
 +
|class
 +
|string
 +
|a Class
 +
|-
 +
|boolean
 +
|string
 +
|a boolean
 +
|-
 +
|set
 +
|comma separated list of expressions
 +
|an instance of java.util.HashSet
 +
|-
 +
|iquery
 +
|IQuery [, variable [, collector]]
 +
|The result of evaluating the query
 
|}
 
|}
 +
<p>A note about the <b>iquery</b> constructor.<br/>
 +
The constructor operates in one of two modes depending on its IQuery argument.</p>
 +
<p>If the argument implements IMatchQuery, the the constructor will return the boolean result of invoking the isMatch(value) method on that query. The value is picked from the variable and the default variable is 'item'. It is considered an error to pass a third argument in combination with an IMatchQuery.</p>
 +
<p>When the query does not implement the IMatchQuery interface, it is considered a context query and its method perform(iterator, collector) will be called. The iterator is picked from the variable and the default variable is 'everything'. The collector can be passed in as the third argument. If no third argument is present, a new collector will be created.</p>
  
====Latest====
+
====Collection functions====
Some queries must make sure that the result only contains the latest version of each found IU. I added a special keyword 'latest' to make that possible.
+
A number of functions was added to enable various manipulations of collections. A common way of writing collection functions is:
 
+
<pre>elements.someFunction(element | <expression>)</pre>
====Limit====
+
Here are the functions that are available in the p2QL language:
It is sometimes desirable to limit the number of rows returned by a query. I added a special keyword 'limit' to cover that.
+
<ul>
 
+
<li>'''Select'''<br/>
====Combined queries====
+
The '''select''' function will evaluate its expression, once for each element, and return a new collection containing all elements for which the evaluation result was ''true''. Example:
It must also be possible to combine queries (hence '''&&''' and '''||''').
+
<pre>select(x | x.id == 'org.eclipse.equinox.p2.ql')</pre>
 +
</li>
 +
<li>'''Collect'''<br/>
 +
The '''collect''' collects the result of evaluating each element in a new collection. Example:
 +
<pre>collect(x | x.id)</pre>
 +
returns a collection consisting of the value of the id property of each element in the original collection.
 +
</li>
 +
<li>'''Exists'''<br/>
 +
The '''exists''' function will evaluate an expression for each element in the collection until an evaluation yields ''true''. If that happens, the result of the whole evaluation yields ''true''. If no such element is found, the whole evaluation yields ''false''. Example:
 +
<pre>providedCapabilities.exists(p | p.namespace == 'osgi.bundle')</pre></li>
 +
<li>'''All'''<br/>
 +
The '''all''' function will evaluate an expression for each element until an evaluation yields ''false''. If that happens, the whole evaluation yields ''false''. If no such element is found, the whole evaluation yields ''true''. Example:
 +
<pre>$0.all(rc | this ~= rc)</pre>
 +
Assuming $0 is a list of required capabilities, this query asks for items that fulfill all requirements.</li>
 +
<li>'''First'''<br/>
 +
The '''first''' function will evaluate an expression for each element in the collection until an evaluation yields ''true''. If that happens, the result of the whole evaluation yields that element. If no such element is found, the whole evaluation yields ''null''. Example:
 +
<pre>providedCapabilities.first(p | p.namespace == 'java.package')</pre>
 +
Returns the first provided capability with namespace 'java.package'.</li>
 +
<li>'''Flatten'''<br/>
 +
Intended to be applied on collections of collections. Yields a single collection with all elements from the source collections, in the order they are evaluated.Example:
 +
<pre>collect(x | x.requirements).flatten()</pre>
 +
Yields a collection with all required capabilities from all iu's that the collect was applied on.</li>
 +
<li>'''Latest'''<br/>
 +
Some queries must make sure that the result only contains the latest version of each found IU. The special function ''latest'' to makes that possible. The function will require that the collection elements implement the IVersionedId interface. Here is an example of how to use it:
 +
<pre>select(x | x.id == $0).latest()</pre>
 +
As a convenience, it is also possible to write:
 +
<pre>latest(x | x.id == $0)</pre></li>
 +
<li>'''Limit'''<br/>
 +
It is sometimes desirable to limit the number of rows returned by a query. The function 'limit' can be used for that:
 +
<pre>select(...).limit(100)</pre></li>
 +
<li>'''Unique'''<br/>
 +
This function will ensure that the resulting collection contains unique elements. The function can operate in two modes, with or without a cache argument. I.e.
 +
<pre>x.unique()</pre>
 +
or
 +
<pre>x.unique(cache)</pre>
 +
The latter expects the argument to be a ''java.util.Set'' that it can use to enforce the uniqueness of the element. This enables the result to be unique in a larger scope then the collection itself.</li>
 +
<li>'''Traverse'''<br/>
 +
A common scenario in p2 is that you want to start with a set of roots and then find all items that fulfill the root requirements. Those items in turn introduce new requirements so you want to find them too. The process continues until no more requirements can be satisfied. This type of query can be performed using the '''traverse''' function. The function will evaluate an expression, once for each element, collect elements for which the evaluation returned ''true'', then then re-evaluate using the collected result as source of elements. No element is evaluated twice. This continues until no more elements are found:
 +
<pre>$0.traverse(parent | parent.requirements.collect(rc | select(iu | ~= rc)).flatten())</pre>
 +
This is of course a fairly naive slicing mechanism. See [[#Currying]] below for more advanced examples.</li>
 +
</ul>
  
 
====Query parameters====
 
====Query parameters====
The query must be parameterised so that expression parsing can be done once and then executed multiple times. A parameter is expressed as ''':&lt;''n''&gt;''' where ''n'' is the parameter index, originating from 1.
+
The query must be parameterised so that expression parsing can be done once and then executed multiple times. A parameter is expressed as $''n'' where ''n'' is either the parameter index, originating from 0.
 
+
====Java API====
+
The expression tree created by the parser must be well documented and easy to use so that queries can be created programmatically.
+
 
+
====Performance====
+
Both the parser and the evaluator must be very fast. Performance must be comparable with the current solution.
+
 
+
===IQuery examples===
+
Think of the query string as something that would be entered after the WHERE keyword in an SQL query, i.e.
+
 
+
SELECT self FROM repository WHERE ...
+
 
+
The query language will only cover the '...' part since the rest is implicit. So here are some examples of query expressions:
+
  
 +
===Basic IQuery examples===
 +
Here are some examples of how to use the expressions with IQuery:
 
Query for all IU's that has an id:
 
Query for all IU's that has an id:
<pre>IQuery query = new ExpressionQuery("id = :1", id);</pre>
+
<pre>IQuery&lt;IInstallableUnit&gt; query = QueryUtil.createMatchQuery("id == $0", id);</pre>
 
Query for the latest IU of some specific id:
 
Query for the latest IU of some specific id:
<pre>IQuery query = new ExpressionQuery("latest id = :1", id);</pre>
+
<pre>IQuery&lt;IInstallableUnit&gt; query = QueryUtil.createQuery("latest(x | x.id == $0)", id);</pre>
 
Query an artifact repository for all keys with a specific classifier:
 
Query an artifact repository for all keys with a specific classifier:
<pre>IQuery query = new ExpressionQuery("class ~= :1 && classifier = :2", IArtifactKey.class, classifier);</pre>
+
<pre>IQuery&lt;IArtifactKey&gt; query = QueryUtil.createMatchQuery(IArtifactKey.class, "classifier == $0", classifier);</pre>
Query for the latest IU that matches a specific version range (note the matches operator):
+
Query for the latest IU that matches a specific version range. Since the second parameter is a VersionRange, the ~= operator is interpreted as ''isIncluded'':
<pre>IQuery query = new ExpressionQuery("latest id = :1 && version ~= :2", id, range);</pre>
+
<pre>IQuery&lt;IInstallableUnit&gt; query = QueryUtil.createQuery("latest(x | x.id == $0 && x.version ~= $1)", id, range);</pre>
 
Query for an IU that has a specific property set:
 
Query for an IU that has a specific property set:
<pre>IQuery query = new ExpressionQuery("properties[:1] = :2", key, value);</pre>
+
<pre>IQuery&lt;IInstallableUnit&gt; query = QueryUtil.createMatchQuery("properties[$0] == $1", key, value);</pre>
 
The same query, but this time for multiple possible values:
 
The same query, but this time for multiple possible values:
<pre>IQuery query = new ExpressionQuery("properties[:1] = ANY :2", key, new Object[] { v1, v2, v3 });</pre>
+
<pre>IQuery&lt;IInstallableUnit&gt; query = QueryUtil.createMatchQuery("$1.exists(v | properties[$0] == v)", key, new Object[] { v1, v2, v3 });</pre>
 
Query for all categories found in the repository:
 
Query for all categories found in the repository:
<pre>IQuery query = new ExpressionQuery("properties[:1] = 'true'", IInstallableUnit.PROP_TYPE_CATEGORY);</pre>
+
<pre>IQuery&lt;IInstallableUnit&gt; query = QueryUtil.createMatchQueryx("properties[$0] == true", IInstallableUnit.PROP_TYPE_CATEGORY);</pre>
Query for all IU's that fulfil at least one of the requirements from another IU.
+
Query for all IU's that fulfil at least one of the requirements from another IU. Since the first parameter is a list of IRequirements, the ~= applied to each each IU using ''satisfies''.
<pre>IQuery query = new ExpressionQuery("ANY providedCapabilities ~= ANY :1", iu.getRequiredCapabilities());
+
<pre>IQuery&lt;IInstallableUnit&gt; query = QueryUtil.createMatchQuery("$0.exists(rc | this ~= rc)", iu.getRequirements());</pre>
OR
+
IQuery query = new ExpressionQuery("self ~= ANY :1", iu.getRequiredCapabilities());</pre>
+
 
Query for the latest version of all patches:
 
Query for the latest version of all patches:
<pre>IQuery query = new ExpressionQuery("latest self ~= :1", IInstallableUnitPatch.class);</pre>
+
<pre>IQuery&lt;IInstallableUnit&gt; query = QueryUtil.createQuery("latest(x | x ~= $0)", IInstallableUnitPatch.class);</pre>
 
Query for all IU's affected by a patch:
 
Query for all IU's affected by a patch:
<pre>IQuery query = new ExpressionQUery("self ~= ANY ALL :1", patch.getApplicabilityScope());</pre>
+
<pre>IQuery&lt;IInstallableUnit&gt; query = QueryUtil.createMatchQuery("$0.exists(rcs | rcs.all(rc | this ~= rc))", patch.getApplicabilityScope());</pre>
  
I think it would be valuable if users of p2 that have the need for specific queries could list them here. It is not unlikely that some queries might motivate some extension to the current syntax or to how the evaluator works.
+
===Localization===
 +
An installable unit may have locale specific properties. Such properties may be stored in the IU itself or in fragments that provide localization for the IU in the 'org.eclipse.equinox.p2.localization' namespace.
 +
p2QL will interpret the member ''translations'' applied on an IU as a map of translated properties, hiding all the details. As an example, this query:
  
===An alternative way of doing things===
+
<pre>translations['license'] ~= /*kommersiellt bruk*/</pre>
Passing instances of the IQuery to an IQueriable is problematic since it is very difficult for the IQueryable implementer to figure out what the IQuery does. When the implementation is based on a database or a remote server, the query needs to be converted into some kind of generic expression. Some special code would need to be written that would make a best effort. It might be able to recognize some queries and then process others locally. Often with very varying performance. Executing a query locally means passing all data over the net or extracting all tuples from a database. For huge repositories, that will be a very resource intensive operation.
+
  
Todays IQuery does have one big advantage though. It can contain java code that does virtually anything to filter the result or even perform actions based on the result. Needless to say, such filtering, or such actions, cannot be performed by a database or a remote server. So clearly, the concept currently encapsulated by the IQuery/IQueryable is needed.
+
when used with a default Locale("sv_SE") would yield all IU's that has a license, translated into Swedish, that contains the exact phrase 'kommersiellt bruk'.
  
Here is a proposal on an alternative solution that would get the best of both worlds. In essence, each query would start with the evaluation of a query expression. Then, if needed, the query result is further processed by a construct very similar to the current IQuery/IQueryable.
+
A special class named '''KeyWithLocale''' was added to allow queries for entries that does not match the current locale. So the above query can be written as:
  
Simplified, the solution would be based on the following two interfaces:
+
<pre>QueryUtil.createMatchQuery("this.translations[$0] ~= /*kommersiellt bruk*/", new KeyWithLocale("license", new Locale("sv", "SE")))</pre>
<pre>public interface IQueryable {
+
  Iterator query(String expression, Object[] args, IProgressMonitor monitor);
+
  Iterator query(Expression expression, Object[] args, IProgressMonitor monitor);
+
}
+
  
public interface ICollector {
+
===Currying===
  Collector collect(Iterator iterator, Collector collector);
+
Want some add some spice? Probably...
}</pre>
+
  
The ICollector is basically the same thing as the current IQuery. The key thing here is the separation of concern. An IQueriable can return a result based on a predicate. The ICollector does something with that result.
+
Consider the traversal function and the example we had above:
 +
<pre>$0.traverse(parent | parent.requirements.collect(rc | select(iu | iu ~= rc)).flatten())</pre>
 +
<p>Now let us assume that we want to perform this traversal and filter the requirements based on platform. Our first attempt could look something like this.
 +
<pre>$0.traverse(parent | parent.requirements.select(rc | rc.filter == null || $1 ~= rc.filter).collect(rc | select(iu | iu ~= rc)).flatten())</pre>
 +
This would however be very inefficient since many requirements exists in several IU's. During the traversal there is no point traversing a requirement more then once so it would be good to "remember" the perused requirements.
 +
</p><p>The p2QL syntax permits something generally referred to as ''currying''. Currying means that you can provide more parameters then the single one that represents the current element to the expression of a collection function. So far, we've seen examples using the syntax:
 +
<pre>select(x | <do something with x>)</pre>
 +
This is actually a short form for the longer:
 +
<pre>select(_, {x | <do something with x>})</pre>
 +
The select provides one parameter to each iteration. It's value is always provided reachable using the special operator '_'. In this case, the variable x maps to the parameter _ since they have the same positional offset. I can add more parameters by declaring:
 +
<pre>select(a, b, _, {x, y, z, <do something with x, y, z>})</pre>
 +
Variable x will now have the value of a, y the value of b, and z the value of _.</p>
 +
<p>So why is this important? Well, the initializers are just evaluated once for one call to select. The expression however, is evaluated once for each new value of _.<p>
 +
<p>Let us now return to the traversal again. Because with this syntax, we can actually specify a global set that we can pass to the unique function so that no requirement is perused twice:
 +
<pre>$0.traverse(set(), _, { rqCache, parent | parent.requirements.unique(rqCache).select(rc | rc.filter == null || $1 ~= rc.filter).collect(rc | select(iu | iu ~= rc)).flatten()})</pre>
 +
</p>
  
Today, you would write something like:
+
===Performance comparison: p2QL traversal versus the p2 Slicer===
<pre>MyQuery query = new MyQuery(collectorArgs, queryArgs);
+
<p>A test was performed using the query example above on the Galileo repository, using a specific version of the org.eclipse.sdk.feature.group IU as the root. The test produced a result with 411 IU's in < 12 ms average. I compared this with another test that used the p2 Slicer. The number of IU's produced was exactly the same. The slicer however, took > 22 ms (it also uses a capability cache internally). Not very scientific perhaps but repeated tests produce similar results.</p>
Collector coll = repo.query(query, new Collector(), monitor);
+
<p>I'm sure the Slicer can be optimized and achieve results that might be even slightly better then the traversal but I still think this proves a point. p2QL Performance is very good at this point.</p>
</pre>
+
If the query does things that must be done locally and is not covered by the query expression language, this would now be replaced by:
+
<pre>MyCollector collector = new MyCollector(collectorArgs);
+
Collector coll = collector.collect(repo.query("some expression", queryArgs, monitor), new Collector());
+
</pre>
+
Note though, that any code that just wishes to iterate over the result, would not need the collector at all. It would still be possible to just write:
+
<pre>Iterator itr = repo.query("some expression", queryArgs, monitor);
+
while(itr.hasNext()) {
+
  ...
+
}</pre>
+
and judging from the current usage of the query, that would be the most common way of doing things. The places where special handling by a collector is needed are very rare.
+
  
We could also allow a more generic Iterator filter, i.e. an Iterator implementation that uses an inner Iterator. With two basic implementations, ContextIteratorFilter and MatchIteratorFilter (corresponding to the current ContextQuery and MatchQuery) such filters would assist in writing general purpose filters to execute locally.
+
===Java API===
 +
The expression tree created by the parser must be well documented and easy to use so that queries can be created programmatically. Since all expressions are immutable and without context, they can be combined freely. Hence, code like this is fully possible:
 +
<pre>
 +
// Create some expressions. Note the use of identifiers instead of
 +
// indexes for the parameters
 +
 
 +
IExpressionFactory factory = ExpressionUtil.getFactory();
 +
IExpression item = factory.variable("this");
 +
IExpression cmp1 = factory.equals(factory.member(item, "id"), factory.parameter(0));
 +
IExpression cmp2 = factory.equals(
 +
factory.at(factory.member(item, "properties"), factory.parameter(1)),
 +
factory.parameter(2));
 +
 
 +
IExpression everything = factory.variable("everything");
 +
IExpression lambda = factory.lambda(item, factory.and(new IExpression[] {cmp1, cmp2}));
 +
IExpression latest = factory.latest(factory.select(everything, lambda));
 +
 
 +
// Create the query
 +
IQuery&lt;IInstallableUnit&gt; query = QueryUtil.createQuery&lt;IInstallableUnit&gt;(latest "test.bundle", "org.eclipse.equinox.p2.type.group", Boolean.TRUE);
 +
</pre>
  
 
===The [http://en.wikipedia.org/wiki/Backus%E2%80%93Naur_Form BNF]===
 
===The [http://en.wikipedia.org/wiki/Backus%E2%80%93Naur_Form BNF]===
 
<pre>
 
<pre>
p2query : limit? 'latest'? orExpression ;
+
condition
 
+
: orExpression ( '?' orExpression ':' orExpression )?
limit: 'limit' INT | parameter ; # limit the number of returned rows
+
;
  
 
orExpression : andExpression ( '||' andExpression )* ;
 
orExpression : andExpression ( '||' andExpression )* ;
  
andExpression : notExpression ( '&&' notExpression )* ;
+
andExpression : binaryExpression ( '&&' binaryExpression )* ;
 +
 
 +
binaryExpression : notExpression ( op notExpression )?;
 +
 
 +
op : '==' | '!=' | '>' | '>=' | '<' | '<=' | '~=' ;
  
 
notExpression
 
notExpression
  : '!' notExpression
+
: '!' notExpression
  | binaryExpression
+
| collectionExpression
  ;
+
;
  
op : '=' | '!=' | '>' | '>=' | '<' | '<=' | '~=' ;
+
collectionExpression
 +
: memberExpression ( '.' collectionFunction )*
 +
;
  
binaryExpression : sideExpression op sideExpression ;
+
memberExpression : function ( ( '.' ID ) | ( '[' memberExpression ']' ) )* ;
  
sideExpression : ('any' | 'all')* memberExpression ;
+
function
 +
: ( filter | version | range | class) '(' condition ')'
 +
| set '(' ( condition ( ',' condition )* )? ')'
 +
| unaryExpression
 +
;
  
memberExpression : unaryExpression ( ( '.' ID ) | ( '[' memberExpression ']' ) )* ;
+
collectionFunction
 +
: ( select | reject | collect | exists | all | traverse | first ) '(' lambdaDefinition ')'
 +
| limit '(' memberExpression ')'
 +
| unique '(' memberExpression? ')'
 +
| latest '(' lambdaDefinition? ')'
 +
| flatten '(' lambdaDefinition? ')'
 +
;
 +
 
 +
lambdaDefinition
 +
: initializer ( ',' initializer )* ( ',' '{' lambda '}' )?
 +
| '{' lambda '}'
 +
| lambda
 +
;
 +
 
 +
initializer
 +
: '_'
 +
| condition
 +
;
 +
 
 +
lambda
 +
: ( ID ( ',' ID )* )? '|' condition
 +
;
  
 
unaryExpression
 
unaryExpression
  : '(' orExpression ')'
+
: '(' condition ')'
  | '[' memberExpression ( ',' memberExpression )* ']' // #array construct
+
| '[' condition ( ',' condition )* ']' // #array construct
  | '/' regexpPattern '/'
+
| STRING
  | STRING
+
| INT
  | INT
+
| parameter
  | parameter
+
| 'null'
  | 'self'
+
| 'true'
  | 'null'
+
| 'false'
  | 'true'
+
| ID
  | 'false'
+
;
  | ID // # implies self.identifier
+
  ;
+
  
  parameter
+
parameter
  : ':' INT
+
: '$' INT | ID
  ;
+
;
 
</pre>
 
</pre>
 
===The expression tree model===
 
[[Image:P2ql.png]]
 

Latest revision as of 07:10, 27 February 2013

Background

As p2 gets more widely used, we predict that repositories will grow larger and larger over time. The Galileo repository alone contains 3866 IU's (in one single service release). Starting with Helios, we plan to keep everything over time which will put the IU count close to 10.000. And that's just one year of Eclipse IP approved material. OSGi is gaining in popularity and p2 is getting increasingly popular. Some companies plan to map the central maven repository as p2. We can soon expect to see p2 repositories with over a 100.000 IU's in them.

The problem

p2 has a query mechanism today that makes it hard to create a repository implementation that is based on a database. It is also diffiult to create an efficient client/server solution. The reason for this is that most of the queries are written as implementations of the IQuery interface, either as ContextQuery derivates that need access to all rows in the database, or as MatchQuery derivates with a java method to which all rows need to be passed.

There is no uniform way to translate these queries into an expression that can be understood by a database or by something running in another process.

Some attempts have been made to rectify this and a repository implementation could potentially make some intelligent decisions based on information derived from looking at the query (what class it is, and in some cases, what properties that are exposed). While this is an improvement, the general problem remains; A large amount of the queries are still more or less completely black-boxed and it's enough with one single such query to force a full selection of everything from the database/remote server.

The p2QL expression language discussed here is an attempt to rectify this problem. It can be used in two ways:

  • Coexisting with current IQuery / IQueryable
    Using the ExpressionContextQuery (implements IQuery), or the ExpressionQuery (implements IMatchQuery), it can coexist with todays IQuery/IQueryable. The queries are created based on an expression and an array of parameters. See section #IQuery examples.
  • With more explicit separation of concerns
    It can also be used in an alternative solution where an expression returns an Iterator. This usage scenario is particularly useful when implementing remote repositories or implementations on top of a database. It is also a very efficient way of querying for things that you don't want to collect in memory.

Bugzilla

We discussed the use of a query language at length on the p2 meeting on November 9 and November 16. The issue is captured in bugzilla Create a QueryLanguage for p2.

Language Design

TODO: Add some text here to explain where the ideas stem from (xtend, xtext, Scala, Java) and other motivation behind the choice of syntax.

There are two major ingredients in p2ql:

  • The language itself
  • How the language integrates with and interacts with Java objects
The language itself

A query language at its heart is simply an expression evaluation language for two major kinds of expressions:

  • Iterating over a collection and subsetting it in some way
  • Supplying a boolean expression that is used as a callback when some other object iterates over a collection and computes a subset.

That, then is what p2ql is: a simple language for describing subsets of collections.

How the language integrates with Java objects

Since p2ql must execute inside a Java runtime, it must interact with Java objects. In the context of P2, there are two kinds of Java objects that a query language must interact with:

  • A P2 IQueryable
  • Arbitrary Java objects that are passed in as parameters to the query

Implementation

p2ql itself is a small dynamically-typed language in the Smalltalk tradition. It has receivers (objects) and messages (methods), and whether an object can receive a message is determined dynamically at runtime. Even so, because it automatically takes advantage of various opportunities for concurrency, it is very fast. It is also a pure functional language, which means that all variables are final. As was noted before, it is an expression language--which means that while you can use predefined objects and functions, you cannot define your own objects nor your own named functions.

However, you can define anonymous functions (lambdas, closures, anonymous runnables) in the same way that a SQL "where" clause is really an anonymous function or a nested SQL query is also an anonymous function.

Java bindings

When you define a p2ql query, you are really dynamically defining and running a new method on a P2 IQueryable object, usually a metadata repository.

IQuery<IInstallableUnit> q = QueryUtil.createMatchQuery("this.id == $0", id);
metadataRepository.query(q);
Types of queries

There are two main kinds of queries.

  • Queries that define a boolean expression to evaluate against every element in the IQueryable. This is similar to just defining a "where" clause in SQL.
IQuery<IInstallableUnit> query = QueryUtil.createMatchQuery("this.id == $0", id);
  • Queries that subset the IQueryable's "collection contents" and return a new collection
IQuery<IInstallableUnit> query = QueryUtil.createQuery("everything.select(x | x.id == $0)", id);
Variable binding

Within these types of queries, variables are bound in the following ways:

  • In a match query, "this" refers to the "current row object" in the IQueryable.
  • In a generic query, "everything" refers to the IQueryable collection itself.
  • Dot notation follows the following rules:
    • If the receiver of the message is an Iterable or an IQueryable, the message must be a function that iterates across the collection's elements. (These predefined functions are described below.)
    • If the receiver of the message is a POJO, then the property name denoted by the message is converted into a "getRegurlarProperty" or "isBooleanProperty" Java call and the property value is returned.

So, in the examples above, this.id is the same as writing "rowObject.getId()" in Java.

If you prefer, "this" or "everything" can be left out. In other words, unqualified property names or function calls always refer to the "this" or the "everything" object. So, the following queries are equivalent to the first examples.

IQuery<IInstallableUnit> query = QueryUtil.createMatchQuery("id == $0", id);
IQuery<IInstallableUnit> query = QueryUtil.createQuery("select(x | x.id == $0)", id);

The $0, $1, ..., $n variables simply refer to the Java parameter objects that were passed into the query. The same dot operator rules apply to these variables as apply to the "everything" and "this" variables: If the variable is an Iterable, you have to call a function on it that processes its contents. Otherwise, you can treat it like a Java Bean and the dot operator accesses its properties.

With that introduction, let's take a look at this material again, but expressed more formally.

Boolean Queries

Queries can be written as simple predicates such as "id == $0". When executed, the predicate will be evaluated once for each row in the queried material. This form is used for the IMatchQuery implementation. The current value is available in a predefined variable named this. Like in Java, this is normally implicit so that:

id == $0

is actually the same as:

this.id == $0

Collection Queries

A query can also be declared as something that acts on "everything". This is particularly useful when things like search for latest or doing requirements traversal. These queries make use of a predefined variable named everything. When accessed, that variable returns an iterator over all queried material. As with this in a predicate query, everything is something that you don't normally need to specify, i.e.:

select(x | x.id == $0)

is the same as:

everything.select(x | x.id == $0)

A boolean query can always be written as a collection query. These two queries are fully equivalent:

Boolean: id == $0
Collection:   select(x | x.id == $0)

Special operators

  • Matches operator '~='

    A large amount of queries involve versions, version ranges, and capability matching. So managing that is important. Another thing that is important is to be able to support filtering based on ID's. All of this is now also built in to the language through the matches operator ~=. It can be though of as the 'satisfies' method on an installable unit or provided capability, as 'is included in' when used with a version and version range, as a plain matches when used with a string and a pattern, or as 'instanceof' when comparing with a class.

    The following things can be matched using the operator

    LHS RHS Implemented as
    IInstallableUnit IRequirement lhs.satisfies(rhs)
    IInstallableUnit Filter rhs.matches(lhs.properties)
    IInstallableUnit IUpdateDescriptor rhs.isUpdateOf(lhs)
    Version VersionRange rhs.isIncluded(lhs)
    Map Filter rhs.match(lhs)
    Dictionary Filter rhs.match(lhs)
    String SimplePattern rhs.isMatch(lhs)
    String VersionRange rhs.isIncluded(version(lhs))
    <any> Class rhs.isInstance(lhs)
    Class Class rhs.isAssignableFrom(lhs)
  • And operator '&&'
    This operator checks if the first operand evaluates to a boolean. If it is, the it is assumed that all other operands also are booleans and the full evaluation is true if all its operands evaluate to true. If the first result was not a boolean, then it is assumed that it is a collection and that all other operands also evaluates collections. The operator will then function as a intersect operator and the result consists those elements that could be found in all collections.
  • Or operator '||'
    This operator checks if the first operand evaluates to a boolean. If it is, the it is assumed that all other operands also are booleans and the full evaluation is false if none of its operands evaluate to true. If the first result was not a boolean, then it is assumed that it is a collection and that all other operands also evaluates collections. The operator will then function as a union operator and the result is the unique sum of all elements from all collections.

Functions

A small number of functions are available to assist with some common problems. Example: An IRequiredCapability.getFilter() returns a String. I only want the capabilities that has a filter that match certain properties. Consequently, I need an instance of an OSGi Filter. Not a string. I can do this using the filter function. Like this:

requirements.exists(rc | rc.filter == null || $1 ~= filter(rc.filter))

The currently available constructors are:

name arguments creates
filter string an instance of org.osgi.framework.Filter
version string a p2 Version
range string a p2 VersionRange
class string a Class
boolean string a boolean
set comma separated list of expressions an instance of java.util.HashSet
iquery IQuery [, variable [, collector]] The result of evaluating the query

A note about the iquery constructor.
The constructor operates in one of two modes depending on its IQuery argument.

If the argument implements IMatchQuery, the the constructor will return the boolean result of invoking the isMatch(value) method on that query. The value is picked from the variable and the default variable is 'item'. It is considered an error to pass a third argument in combination with an IMatchQuery.

When the query does not implement the IMatchQuery interface, it is considered a context query and its method perform(iterator, collector) will be called. The iterator is picked from the variable and the default variable is 'everything'. The collector can be passed in as the third argument. If no third argument is present, a new collector will be created.

Collection functions

A number of functions was added to enable various manipulations of collections. A common way of writing collection functions is:

elements.someFunction(element | <expression>)

Here are the functions that are available in the p2QL language:

  • Select
    The select function will evaluate its expression, once for each element, and return a new collection containing all elements for which the evaluation result was true. Example:
    select(x | x.id == 'org.eclipse.equinox.p2.ql')
  • Collect
    The collect collects the result of evaluating each element in a new collection. Example:
    collect(x | x.id)

    returns a collection consisting of the value of the id property of each element in the original collection.

  • Exists
    The exists function will evaluate an expression for each element in the collection until an evaluation yields true. If that happens, the result of the whole evaluation yields true. If no such element is found, the whole evaluation yields false. Example:
    providedCapabilities.exists(p | p.namespace == 'osgi.bundle')
  • All
    The all function will evaluate an expression for each element until an evaluation yields false. If that happens, the whole evaluation yields false. If no such element is found, the whole evaluation yields true. Example:
    $0.all(rc | this ~= rc)
    Assuming $0 is a list of required capabilities, this query asks for items that fulfill all requirements.
  • First
    The first function will evaluate an expression for each element in the collection until an evaluation yields true. If that happens, the result of the whole evaluation yields that element. If no such element is found, the whole evaluation yields null. Example:
    providedCapabilities.first(p | p.namespace == 'java.package')
    Returns the first provided capability with namespace 'java.package'.
  • Flatten
    Intended to be applied on collections of collections. Yields a single collection with all elements from the source collections, in the order they are evaluated.Example:
    collect(x | x.requirements).flatten()
    Yields a collection with all required capabilities from all iu's that the collect was applied on.
  • Latest
    Some queries must make sure that the result only contains the latest version of each found IU. The special function latest to makes that possible. The function will require that the collection elements implement the IVersionedId interface. Here is an example of how to use it:
    select(x | x.id == $0).latest()

    As a convenience, it is also possible to write:

    latest(x | x.id == $0)
  • Limit
    It is sometimes desirable to limit the number of rows returned by a query. The function 'limit' can be used for that:
    select(...).limit(100)
  • Unique
    This function will ensure that the resulting collection contains unique elements. The function can operate in two modes, with or without a cache argument. I.e.
    x.unique()

    or

    x.unique(cache)
    The latter expects the argument to be a java.util.Set that it can use to enforce the uniqueness of the element. This enables the result to be unique in a larger scope then the collection itself.
  • Traverse
    A common scenario in p2 is that you want to start with a set of roots and then find all items that fulfill the root requirements. Those items in turn introduce new requirements so you want to find them too. The process continues until no more requirements can be satisfied. This type of query can be performed using the traverse function. The function will evaluate an expression, once for each element, collect elements for which the evaluation returned true, then then re-evaluate using the collected result as source of elements. No element is evaluated twice. This continues until no more elements are found:
    $0.traverse(parent | parent.requirements.collect(rc | select(iu | ~= rc)).flatten())
    This is of course a fairly naive slicing mechanism. See #Currying below for more advanced examples.

Query parameters

The query must be parameterised so that expression parsing can be done once and then executed multiple times. A parameter is expressed as $n where n is either the parameter index, originating from 0.

Basic IQuery examples

Here are some examples of how to use the expressions with IQuery: Query for all IU's that has an id:

IQuery<IInstallableUnit> query = QueryUtil.createMatchQuery("id == $0", id);

Query for the latest IU of some specific id:

IQuery<IInstallableUnit> query = QueryUtil.createQuery("latest(x | x.id == $0)", id);

Query an artifact repository for all keys with a specific classifier:

IQuery<IArtifactKey> query = QueryUtil.createMatchQuery(IArtifactKey.class, "classifier == $0", classifier);

Query for the latest IU that matches a specific version range. Since the second parameter is a VersionRange, the ~= operator is interpreted as isIncluded:

IQuery<IInstallableUnit> query = QueryUtil.createQuery("latest(x | x.id == $0 && x.version ~= $1)", id, range);

Query for an IU that has a specific property set:

IQuery<IInstallableUnit> query = QueryUtil.createMatchQuery("properties[$0] == $1", key, value);

The same query, but this time for multiple possible values:

IQuery<IInstallableUnit> query = QueryUtil.createMatchQuery("$1.exists(v | properties[$0] == v)", key, new Object[] { v1, v2, v3 });

Query for all categories found in the repository:

IQuery<IInstallableUnit> query = QueryUtil.createMatchQueryx("properties[$0] == true", IInstallableUnit.PROP_TYPE_CATEGORY);

Query for all IU's that fulfil at least one of the requirements from another IU. Since the first parameter is a list of IRequirements, the ~= applied to each each IU using satisfies.

IQuery<IInstallableUnit> query = QueryUtil.createMatchQuery("$0.exists(rc | this ~= rc)", iu.getRequirements());

Query for the latest version of all patches:

IQuery<IInstallableUnit> query = QueryUtil.createQuery("latest(x | x ~= $0)", IInstallableUnitPatch.class);

Query for all IU's affected by a patch:

IQuery<IInstallableUnit> query = QueryUtil.createMatchQuery("$0.exists(rcs | rcs.all(rc | this ~= rc))", patch.getApplicabilityScope());

Localization

An installable unit may have locale specific properties. Such properties may be stored in the IU itself or in fragments that provide localization for the IU in the 'org.eclipse.equinox.p2.localization' namespace. p2QL will interpret the member translations applied on an IU as a map of translated properties, hiding all the details. As an example, this query:

translations['license'] ~= /*kommersiellt bruk*/

when used with a default Locale("sv_SE") would yield all IU's that has a license, translated into Swedish, that contains the exact phrase 'kommersiellt bruk'.

A special class named KeyWithLocale was added to allow queries for entries that does not match the current locale. So the above query can be written as:

QueryUtil.createMatchQuery("this.translations[$0] ~= /*kommersiellt bruk*/", new KeyWithLocale("license", new Locale("sv", "SE")))

Currying

Want some add some spice? Probably...

Consider the traversal function and the example we had above:

$0.traverse(parent | parent.requirements.collect(rc | select(iu | iu ~= rc)).flatten())

Now let us assume that we want to perform this traversal and filter the requirements based on platform. Our first attempt could look something like this.

$0.traverse(parent | parent.requirements.select(rc | rc.filter == null || $1 ~= rc.filter).collect(rc | select(iu | iu ~= rc)).flatten())

This would however be very inefficient since many requirements exists in several IU's. During the traversal there is no point traversing a requirement more then once so it would be good to "remember" the perused requirements.

The p2QL syntax permits something generally referred to as currying. Currying means that you can provide more parameters then the single one that represents the current element to the expression of a collection function. So far, we've seen examples using the syntax:

select(x | <do something with x>)

This is actually a short form for the longer:

select(_, {x | <do something with x>})

The select provides one parameter to each iteration. It's value is always provided reachable using the special operator '_'. In this case, the variable x maps to the parameter _ since they have the same positional offset. I can add more parameters by declaring:

select(a, b, _, {x, y, z, <do something with x, y, z>})
Variable x will now have the value of a, y the value of b, and z the value of _.

So why is this important? Well, the initializers are just evaluated once for one call to select. The expression however, is evaluated once for each new value of _.<p> <p>Let us now return to the traversal again. Because with this syntax, we can actually specify a global set that we can pass to the unique function so that no requirement is perused twice:

$0.traverse(set(), _, { rqCache, parent | parent.requirements.unique(rqCache).select(rc | rc.filter == null || $1 ~= rc.filter).collect(rc | select(iu | iu ~= rc)).flatten()})

Performance comparison: p2QL traversal versus the p2 Slicer

A test was performed using the query example above on the Galileo repository, using a specific version of the org.eclipse.sdk.feature.group IU as the root. The test produced a result with 411 IU's in < 12 ms average. I compared this with another test that used the p2 Slicer. The number of IU's produced was exactly the same. The slicer however, took > 22 ms (it also uses a capability cache internally). Not very scientific perhaps but repeated tests produce similar results.

I'm sure the Slicer can be optimized and achieve results that might be even slightly better then the traversal but I still think this proves a point. p2QL Performance is very good at this point.

Java API

The expression tree created by the parser must be well documented and easy to use so that queries can be created programmatically. Since all expressions are immutable and without context, they can be combined freely. Hence, code like this is fully possible:

// Create some expressions. Note the use of identifiers instead of
// indexes for the parameters

IExpressionFactory factory = ExpressionUtil.getFactory();
IExpression item = factory.variable("this");
IExpression cmp1 = factory.equals(factory.member(item, "id"), factory.parameter(0));
IExpression cmp2 = factory.equals(
 factory.at(factory.member(item, "properties"), factory.parameter(1)),
 factory.parameter(2));

IExpression everything = factory.variable("everything");
IExpression lambda = factory.lambda(item, factory.and(new IExpression[] {cmp1, cmp2}));
IExpression latest = factory.latest(factory.select(everything, lambda));

// Create the query
IQuery<IInstallableUnit> query = QueryUtil.createQuery<IInstallableUnit>(latest "test.bundle", "org.eclipse.equinox.p2.type.group", Boolean.TRUE);

The BNF

condition
	: orExpression ( '?' orExpression ':' orExpression )?
	;

orExpression : andExpression ( '||' andExpression )* ;

andExpression : binaryExpression ( '&&' binaryExpression )* ;

binaryExpression : notExpression ( op notExpression )?;

op : '==' | '!=' | '>' | '>=' | '<' | '<=' | '~=' ;

notExpression
	: '!' notExpression
	| collectionExpression
	;

collectionExpression
	: memberExpression ( '.' collectionFunction )*
	;

memberExpression : function ( ( '.' ID ) | ( '[' memberExpression ']' ) )* ;

function
	: ( filter | version | range | class) '(' condition ')'
	| set '(' ( condition ( ',' condition )* )? ')'
	| unaryExpression
	;

collectionFunction
	: ( select | reject | collect | exists | all | traverse | first ) '(' lambdaDefinition ')'
	| limit '(' memberExpression ')'
	| unique '(' memberExpression? ')'
	| latest '(' lambdaDefinition? ')'
	| flatten '(' lambdaDefinition? ')'
	;

lambdaDefinition
	: initializer ( ',' initializer )* ( ',' '{' lambda '}' )?
	| '{' lambda '}'
	| lambda
	;

initializer
	: '_'
	| condition
	;

lambda
	: ( ID ( ',' ID )* )? '|' condition
	;

unaryExpression
	: '(' condition ')'
	| '[' condition ( ',' condition )* ']' // #array construct
	| STRING
	| INT
	| parameter
	| 'null'
	| 'true'
	| 'false'
	| ID
	;

parameter
	: '$' INT | ID
	;

Back to the top