Difference between revisions of "Equinox/p2/Proposals/Query Management and Optimization"

Revision as of 02:38, 7 January 2009


Problem: Queries and result sets are too tightly coupled.

Objective: Decouple the query from the results it returns.

Goals:

  1. Allow those implementing IQueryable to implement optimized versions of the query
  2. Return results in a lazy / incremental way

While p2 is designed to support alternate storage for things like IUs, providing optimized mechanisms for querying these locations is not currently possible. This (working draft) proposal will help identify the current limitations with custom implementations of IQueryable and present concrete suggestions on how to address these issues.

There are currently five (5) requirements for this proposal:

  1. Allow for composite queries. This includes both querying the results of a previous query and composing several queries up front.
  2. Remove the complexities surrounding collectors. Many collectors act as queries, hampering the repository's ability to create custom implementations.
  3. Remove the requirement that the collector passed in is the collector returned (possibly even remove the collector argument to IQueryable#query()).
  4. Define standard queries as "non-opaque", meaning repository implementers could supply custom implementations using SQL or some other query language.
  5. Design QueryResults (Collectors) as a Future, that is, something that continually gathers results while the caller can check for available results, listen for results, and get the results. This also includes QueryStatus.

Support Composite Queries

While p2 does have a compound query, this query does not work if the compounded queries are "complex", that is, they have overridden the perform() method. To solve this, there are two things that should be done:

  1. Make the collectors themselves Queryable (Bug 260112)
  2. Change CompoundQuery so the perform logic properly calls perform() on each compounded query (Bug 260012)
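
The second point can be illustrated with a minimal sketch. The types below (SimpleCollector, SimpleQuery, LimitQuery, PipedCompoundQuery) are simplified, hypothetical stand-ins and not the actual p2 classes. The sketch assumes an AND-style compound query that feeds the results of each compounded query into the next one by calling perform(), so a "complex" query that overrides perform() keeps its behaviour.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a collector: it only stores what it is given.
class SimpleCollector {
    private final List<Object> items = new ArrayList<>();

    boolean accept(Object item) {
        items.add(item);
        return true;
    }

    List<Object> toList() {
        return new ArrayList<>(items);
    }

    // Makes the collected results usable as input for another query.
    Iterator<Object> iterator() {
        return toList().iterator();
    }
}

// Hypothetical stand-in for a query.
abstract class SimpleQuery {
    abstract boolean isMatch(Object candidate);

    // Default behaviour: filter the input with isMatch. "Complex" queries override this.
    SimpleCollector perform(Iterator<?> input, SimpleCollector result) {
        while (input.hasNext()) {
            Object candidate = input.next();
            if (isMatch(candidate)) {
                result.accept(candidate);
            }
        }
        return result;
    }
}

// A "complex" query: keeping only the first N items cannot be expressed through isMatch alone.
class LimitQuery extends SimpleQuery {
    private final int limit;

    LimitQuery(int limit) {
        this.limit = limit;
    }

    @Override
    boolean isMatch(Object candidate) {
        return true;
    }

    @Override
    SimpleCollector perform(Iterator<?> input, SimpleCollector result) {
        int taken = 0;
        while (taken < limit && input.hasNext()) {
            result.accept(input.next());
            taken++;
        }
        return result;
    }
}

// Compound query that calls perform() on each compounded query in turn.
class PipedCompoundQuery extends SimpleQuery {
    private final List<SimpleQuery> queries;

    PipedCompoundQuery(SimpleQuery... queries) {
        this.queries = Arrays.asList(queries);
    }

    @Override
    boolean isMatch(Object candidate) {
        for (SimpleQuery query : queries) {
            if (!query.isMatch(candidate)) {
                return false;
            }
        }
        return true;
    }

    @Override
    SimpleCollector perform(Iterator<?> input, SimpleCollector result) {
        Iterator<?> current = input;
        for (SimpleQuery query : queries) {
            // The intermediate results become the input of the next query.
            current = query.perform(current, new SimpleCollector()).iterator();
        }
        while (current.hasNext()) {
            result.accept(current.next());
        }
        return result;
    }
}
```

In this sketch the order of the compounded queries matters: filtering and then limiting is not the same as limiting and then filtering. A real CompoundQuery would have to define which of the two it means.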

Simplify the Collectors

Currently, many collectors have been implemented as "Queries". That is, the collectors have the logic of whether or not something should be included in the result set. This implementation allows collectors to essentially be used as "composite queries". This has a number of drawbacks, including:

  1. Query implementers cannot account for this when creating alternative query implementations.
  2. It adds complexity when trying to design asynchronous query results.
  3. It limits reuse, as some queries use the Query class's isMatch method while others use the collector's accept method.

To address this issue, we propose rewriting complex collectors (those that override the accept method) in terms of a Query. In particular, this affects:

  • LatestIUVersionCollector
  • AvailableIUCollector
  • LatestIUVersionElementCollector (Bug 260105)
  • CategoryElementCollector
  • InstalledIUCollector
  • ProductQuery.Collector
  • IUPropertyUtils.localeFragmentCollector
  • IUPropertyUtils.hostLocalizationCollector

There are also a few other queries that provide a custom mechanism for storing the results (e.g. grouping IUs into categories). These should also be reviewed.

In order to support the migration of Collectors to Queries, Collectors themselves will likely need to be Queryable.
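
As a rough illustration of such a rewrite, the sketch below moves a "keep only the latest version" decision out of a collector and into a query. Unit, PlainCollector and LatestVersionQuery are hypothetical, simplified types (versions are plain integers here), not the existing p2 classes, and the sketch is not the actual replacement for LatestIUVersionCollector.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, minimal stand-in for an installable unit.
final class Unit {
    final String id;
    final int version;

    Unit(String id, int version) {
        this.id = id;
        this.version = version;
    }
}

// A plain result holder: it stores items and contains no filtering logic.
final class PlainCollector {
    private final List<Unit> items = new ArrayList<>();

    void accept(Unit unit) {
        items.add(unit);
    }

    List<Unit> toList() {
        return new ArrayList<>(items);
    }
}

// The "latest version only" decision lives in the query, not in the collector.
final class LatestVersionQuery {
    PlainCollector perform(Iterator<Unit> input, PlainCollector result) {
        // Remember the highest version seen for each id, in first-seen order.
        Map<String, Unit> latest = new LinkedHashMap<>();
        while (input.hasNext()) {
            Unit unit = input.next();
            Unit seen = latest.get(unit.id);
            if (seen == null || unit.version > seen.version) {
                latest.put(unit.id, unit);
            }
        }
        for (Unit unit : latest.values()) {
            result.accept(unit);
        }
        return result;
    }
}
```

With the logic in the query, a repository that can compute "latest version per id" natively could recognize the query and answer it directly, which is not possible when the logic is hidden in accept().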

Remove the collector argument to IQueryable#query()

Once the collectors have been simplified, there will be no need for the client to force a collector on the receiver, and the receiver will be free to construct a collector however they see fit. (Bug 256355)
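
One possible shape for this, sketched with hypothetical names (SketchQueryable, SketchResult, InMemoryRepository) and with the query reduced to a java.util.function.Predicate for brevity: the receiver builds the result object itself, and the result can be queried again. The progress monitor argument is left out of the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical replacement for IQueryable#query(): no collector argument.
interface SketchQueryable<T> {
    SketchResult<T> query(Predicate<T> query);
}

// The result is created by the receiver and is itself queryable.
final class SketchResult<T> implements SketchQueryable<T> {
    private final List<T> items;

    SketchResult(List<T> items) {
        this.items = new ArrayList<>(items);
    }

    List<T> toList() {
        return new ArrayList<>(items);
    }

    @Override
    public SketchResult<T> query(Predicate<T> query) {
        List<T> matches = new ArrayList<>();
        for (T item : items) {
            if (query.test(item)) {
                matches.add(item);
            }
        }
        return new SketchResult<>(matches);
    }
}

// A receiver that is free to construct its result however it sees fit.
final class InMemoryRepository<T> implements SketchQueryable<T> {
    private final List<T> contents;

    InMemoryRepository(List<T> contents) {
        this.contents = new ArrayList<>(contents);
    }

    @Override
    public SketchResult<T> query(Predicate<T> query) {
        return new SketchResult<>(contents).query(query);
    }
}
```

A client could then chain calls such as repository.query(first).query(second) without ever constructing a collector.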

Non Opaque Queries

To help repository implementers craft custom queries to represent many of the "standard" p2 queries, a set of non-opaque (i.e. transparent) queries should be provided. These queries should have well-documented semantics and allow query implementers to access the properties of the query without explicitly depending on the Query itself.

To support alternative implementations of Queries, we propose adding the following two methods to Query.java:

/**
 * Indicates whether or not this Query is transparent.  The properties of transparent queries can be
 * accessed via the {@link #getProperty(String)} method.  Each transparent query should provide a
 * get<PropertyName>() method for each parameter.  In addition, each property should be described as a
 * constant at the top of the class.  For example, the CapabilityQuery should define:
 *
 * public static final String REQUIRED_CAPABILITIES = "RequiredCapabilities";
 *
 * public IRequiredCapability[] getRequiredCapabilities();
 *
 * This query should also override isTransparent to return "true".
 */
public boolean isTransparent() {
    return false;
}

/**
 * Returns a particular property of a given Query, or null if the property cannot be read.
 */
public Object getProperty(String property) {
    // Reflectively look up the getter (get<PropertyName>) and call it.
    try {
        return getClass().getMethod("get" + property).invoke(this);
    } catch (Exception e) {
        // No such getter, or the getter failed: report the property as unavailable.
        return null;
    }
}
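
To make the convention concrete, here is a small self-contained sketch of a transparent query. TransparentSketchQuery and NameSketchQuery are hypothetical names used only for illustration; they are not part of p2, and the reflective lookup is one possible way to implement getProperty.

```java
import java.lang.reflect.Method;

// Hypothetical base class carrying the two proposed methods.
abstract class TransparentSketchQuery {
    boolean isTransparent() {
        return false;
    }

    // Looks up the public getter get<PropertyName>() and calls it; returns null on any failure.
    Object getProperty(String property) {
        try {
            Method getter = getClass().getMethod("get" + property);
            return getter.invoke(this);
        } catch (Exception e) {
            return null;
        }
    }
}

// A transparent query: one constant and one public getter per parameter.
final class NameSketchQuery extends TransparentSketchQuery {
    public static final String NAME = "Name";

    private final String name;

    NameSketchQuery(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }

    @Override
    boolean isTransparent() {
        return true;
    }
}
```

Because getProperty relies on Class#getMethod, the getters must be public; a property without a matching getter is simply reported as null.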

Implementers of IQueryable can use these methods to construct alternative mechanisms for querying their data. For example, a database-backed IQueryable may use these methods to construct an SQL statement.

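class MyRepository implements IQueryable {
  public Collector query(Query query, Collector collector, IProgressMonitor monitor) {
    // getSimpleName() is used because getName() returns the fully qualified class name
    if (query.isTransparent() && query.getClass().getSimpleName().equals("CapabilityQuery")) {
        Object o = query.getProperty("RequiredCapabilities");
        if (o != null) {
           IRequiredCapability[] capabilities = (IRequiredCapability[]) o;
           SQLStatement statement = constructCapabilityQuery(capabilities);
           // executeSQL is assumed to return the matching items in a Collector
           Collector results = executeSQL(statement);
           return results;
        }
    }
    // Fall back to the query's own implementation for opaque or unrecognized queries
    return query.perform(getIterator(), collector);
  }
}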


Design Query Results (Collectors) as a Future

This needs to be filled in, but here are some basic requirements:

  • Support asynchronous data collection, that is, don't block when the query perform happens
  • Collect a number of items at one time
    • Support Polling (query.isDataAvailable(int numberOfDatum))
    • Support blocking (query.waitUntilDataIsAvailable(int numberOfDatum))
  • Restart query (Possibly add a new query to the collector, and restart the query)
  • End Query
  • Get the status of the Query (working, done, broken, etc.)
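
As a starting point for discussion, the polling, blocking and status requirements above could be sketched as follows. AsyncQueryResult and all of its members are hypothetical; the sketch uses plain monitor synchronization and does not cover restarting or ending a query.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical future-like result that fills up while the query is still running.
final class AsyncQueryResult<T> {
    enum Status { WORKING, DONE, BROKEN }

    private final List<T> items = new ArrayList<>();
    private Status status = Status.WORKING;

    // Producer side: called by the query as results are found.
    synchronized void add(T item) {
        items.add(item);
        notifyAll();
    }

    // Producer side: marks the query as finished (DONE) or failed (BROKEN).
    synchronized void finish(Status finalStatus) {
        status = finalStatus;
        notifyAll();
    }

    // Polling: true if at least 'count' items have been collected so far.
    synchronized boolean isDataAvailable(int count) {
        return items.size() >= count;
    }

    // Blocking: waits until 'count' items are available or the query has ended.
    // Returns false if the query ended (or the thread was interrupted) first.
    synchronized boolean waitUntilDataIsAvailable(int count) {
        while (items.size() < count && status == Status.WORKING) {
            try {
                wait();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return items.size() >= count;
    }

    synchronized Status getStatus() {
        return status;
    }

    // Copy of the results collected so far.
    synchronized List<T> snapshot() {
        return new ArrayList<>(items);
    }
}
```

A query would typically run on its own thread and call add() and finish(), while the caller polls with isDataAvailable() or blocks in waitUntilDataIsAvailable().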
