Equinox/p2/Proposals/Query Management and Optimization
Revision as of 02:38, 7 January 2009
Problem: Query and result sets are too tightly coupled.
Objective: Decouple the query from the results it returns.
Goals:
- Allow those implementing IQueryable to implement optimized versions of the query
- Return results in a lazy / incremental way
While p2 is designed to support alternate storage for things like IUs, providing optimized mechanisms for querying these locations is not currently possible. This (working draft) proposal will help identify the current limitations with custom implementations of IQueryable and present concrete suggestions on how to address these issues.
This proposal currently has five requirements:
- Allow for composite queries. This includes both querying the results of a previous query and composing several queries up front.
- Remove the complexities surrounding collectors. Many collectors act as queries, which hampers a repository's ability to provide custom implementations.
- Remove the requirement that the collector passed in is the collector returned (possibly even remove the collector argument to IQueryable).
- Define standard queries as "non-opaque", meaning repository implementors could supply custom implementations using SQL or some other query language.
- Design QueryResults (Collectors) as a Future, that is, something that can continually gather results while the caller checks for available results, listens for results and gets the results. This also includes QueryStatus.
Support Composite Queries
While p2 does have a compound query, this query does not work if the compounded queries are "complex", that is, they have overridden the perform() method. To solve this, there are two things that should be done:
- Make the collectors themselves Queryable (Bug 260112)
- Change CompoundQuery so that its perform logic properly calls perform() on each compounded query (Bug 260012)
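The second point can be illustrated with a minimal sketch. The types below (SketchQuery, FirstMatchQuery, PipedQuery) are hypothetical stand-ins invented for this example, not the real p2 Query, Collector or CompoundQuery classes. The sketch assumes an "and"-style composition in which each query's result is fed to the next query's perform(), so a query that overrides perform() still behaves correctly when compounded.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Sketch only: simplified stand-ins, not the real p2 Query/Collector types. */
final class CompositeQuerySketch {

    /** Stand-in for a query: filters candidates into a result list. */
    static class SketchQuery {
        /** Simple queries override only this predicate. */
        boolean isMatch(Object candidate) {
            return true;
        }

        /** "Complex" queries override perform() itself. */
        List<Object> perform(Iterator<?> candidates) {
            List<Object> result = new ArrayList<Object>();
            while (candidates.hasNext()) {
                Object candidate = candidates.next();
                if (isMatch(candidate)) {
                    result.add(candidate);
                }
            }
            return result;
        }
    }

    /** A "complex" query: keeps only the first candidate, which isMatch() alone cannot express. */
    static class FirstMatchQuery extends SketchQuery {
        @Override
        List<Object> perform(Iterator<?> candidates) {
            List<Object> result = new ArrayList<Object>();
            if (candidates.hasNext()) {
                result.add(candidates.next());
            }
            return result;
        }
    }

    /** Composition that calls perform() on every member instead of only isMatch(). */
    static class PipedQuery extends SketchQuery {
        private final List<SketchQuery> queries;

        PipedQuery(SketchQuery... queries) {
            this.queries = Arrays.asList(queries);
        }

        @Override
        List<Object> perform(Iterator<?> candidates) {
            List<Object> current = new ArrayList<Object>();
            while (candidates.hasNext()) {
                current.add(candidates.next());
            }
            for (SketchQuery query : queries) {
                // Each result is queried again, so overridden perform() methods are honoured.
                current = query.perform(current.iterator());
            }
            return current;
        }
    }
}
```

Because every intermediate result is itself queried, this sketch also shows why making collectors Queryable helps: the output of one query becomes the input of the next.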
Simplify the Collectors
Currently, many collectors have been implemented as "Queries". That is, the collectors have the logic of whether or not something should be included in the result set. This implementation allows collectors to essentially be used as "composite queries". This has a number of drawbacks, including:
- Query implementors cannot account for this when creating alternative query implementations.
- It adds complexity when trying to design asynchronous query results.
- It limits reuse, as some queries use the Query class isMatch method while others use the collector accept method.
To address this issue, we propose rewriting complex collectors (those that override the accept method) in terms of a Query. In particular, this affects:
- LatestIUVersionCollector
- AvailableIUCollector
- LatestIUVersionElementCollector (Bug 260105)
- CategoryElementCollector
- InstalledIUCollector
- ProductQuery.Collector
- IUPropertyUtils.localeFragmentCollector
- IUPropertyUtils.hostLocalizationCollector
There are also a few other queries that provide a custom mechanism for storing the results (e.g. grouping IUs into categories). These should also be reviewed.
In order to support the migration of Collectors to Queries, Collectors themselves will likely need to be Queryable.
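As a rough illustration of such a migration, the sketch below expresses "keep only the latest version of each unit" as a query that overrides perform(), rather than as selection logic hidden in a collector's accept method. Everything here is an assumption made for the example: Unit is a hypothetical stand-in for an installable unit, versions are plain integers instead of p2 version objects, and LatestVersionQuery is not an existing p2 class.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch only: hypothetical types, not the real p2 classes. */
final class LatestVersionQuerySketch {

    /** Stand-in for an installable unit; the integer version is a simplification. */
    static final class Unit {
        final String id;
        final int version;

        Unit(String id, int version) {
            this.id = id;
            this.version = version;
        }
    }

    /** The selection logic lives in the query, so any result container can be used. */
    static final class LatestVersionQuery {
        List<Unit> perform(Iterator<Unit> candidates) {
            // Remember the highest version seen so far for each id.
            Map<String, Unit> latest = new LinkedHashMap<String, Unit>();
            while (candidates.hasNext()) {
                Unit unit = candidates.next();
                Unit seen = latest.get(unit.id);
                if (seen == null || unit.version > seen.version) {
                    latest.put(unit.id, unit);
                }
            }
            return new ArrayList<Unit>(latest.values());
        }
    }
}
```

With the logic in the query, a repository that can compute "latest version per id" natively (for example in a database) is free to substitute its own implementation.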
Remove the collector argument to IQueryable#query()
Once the collectors have been simplified, there will be no need for the client to force a collector on the receiver, and the receiver will be free to construct a collector however they see fit. (Bug 256355)
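One possible shape for such an interface is sketched below. The names Matcher, Queryable and ListRepository are hypothetical and chosen only for this example; the actual signature is left to the referenced bug. The point of the sketch is that the receiver, not the caller, decides which container holds the results.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: a query method without a collector argument. */
final class NoCollectorArgumentSketch {

    /** Stand-in for a query reduced to its match predicate. */
    interface Matcher {
        boolean isMatch(Object candidate);
    }

    /** Stand-in for IQueryable without the collector parameter. */
    interface Queryable {
        List<Object> query(Matcher query);
    }

    /** A simple in-memory receiver that builds its own result container. */
    static final class ListRepository implements Queryable {
        private final List<Object> contents;

        ListRepository(List<?> contents) {
            this.contents = new ArrayList<Object>(contents);
        }

        public List<Object> query(Matcher query) {
            // The receiver is free to pick any container here, e.g. a lazy or DB-backed one.
            List<Object> result = new ArrayList<Object>();
            for (Object candidate : contents) {
                if (query.isMatch(candidate)) {
                    result.add(candidate);
                }
            }
            return result;
        }
    }
}
```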
Non Opaque Queries
To help repository implementers craft custom queries that represent many of the "standard" p2 queries, a set of non-opaque (i.e. transparent) queries should be provided. These queries should have well-documented semantics and allow query implementers to access the properties of the query without explicitly depending on the Query class itself.
To support alternative implementations of Queries, we propose adding the following two methods to Query.java:
```java
/**
 * Indicates whether or not this Query is transparent. The properties of transparent queries can be
 * accessed via the {@link #getProperty(String)} method. Each transparent query should provide a
 * get<PropertyName>() method for each parameter. In addition, each property should be described as a
 * constant at the top of the class. For example, the CapabilityQuery should define:
 *
 *   public static final String REQUIRED_CAPABILITIES = "RequiredCapabilities";
 *
 *   public IRequiredCapability[] getRequiredCapabilities();
 *
 * This query should also override isTransparent() to return "true".
 */
public boolean isTransparent() {
    return false;
}

/**
 * Returns a particular property of a given Query, or null if it cannot be read.
 */
public Object getProperty(String property) {
    // Reflectively look up the getter for the property and call it.
    // If that fails, return null.
    try {
        return getClass().getMethod("get" + property).invoke(this);
    } catch (Exception e) {
        return null;
    }
}
```
Implementors of IQueryable can use these methods to construct alternative mechanisms of querying their data. For example, a DB backed IQueryable may use these methods to construct an SQL statement.
```java
class MyRepository implements IQueryable {
    public Collector query(Query query, Collector collector, IProgressMonitor monitor) {
        // getSimpleName() is used because getName() returns the fully qualified class name.
        if (query.isTransparent() && query.getClass().getSimpleName().equals("CapabilityQuery")) {
            Object o = query.getProperty("RequiredCapabilities");
            if (o != null) {
                IRequiredCapability[] capabilities = (IRequiredCapability[]) o;
                SQLStatement statement = constructCapabilityQuery(capabilities);
                // executeSQL is assumed to return the collected results.
                Collector results = executeSQL(statement);
                return results;
            }
        }
        // Fall back to the default in-memory evaluation.
        return query.perform(getIterator(), collector);
    }
}
```
Design Query Results (Collectors) as a Future
This needs to be filled in, but here are some basic requirements:
- Support asynchronous data collection, that is, don't block when the query perform happens
- Collect a number of items at one time
- Support Polling (query.isDataAvailable(int numberOfDatum))
- Support blocking (query.waitUntilDataIsAvailable(int numberOfDatum))
- Restart query (Possibly add a new query to the collector, and restart the query)
- End Query
- Get the status of the Query (working, done, broken, etc.)
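Several of these requirements can be sketched as a small thread-safe result holder. This is only one possible reading of the list above, not a proposed API: the class QueryResults, the Status values and the producer-side methods add(), finish() and cancel() are assumptions made for the example, and the polling and blocking methods are placed on the result object rather than on the query. Restarting a query and listening for results are not covered.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: a Future-like result holder built on intrinsic locks. */
final class QueryResultsSketch {

    enum Status { WORKING, DONE, CANCELLED }

    static final class QueryResults<T> {
        private final List<T> items = new ArrayList<T>();
        private Status status = Status.WORKING;

        /** Producer side: called by the repository as matches are found. */
        synchronized void add(T item) {
            if (status != Status.WORKING) {
                return; // ignore results that arrive after the query has ended
            }
            items.add(item);
            notifyAll();
        }

        /** Producer side: marks normal completion. */
        synchronized void finish() {
            if (status == Status.WORKING) {
                status = Status.DONE;
            }
            notifyAll();
        }

        /** Consumer side: ends the query early. */
        synchronized void cancel() {
            if (status == Status.WORKING) {
                status = Status.CANCELLED;
            }
            notifyAll();
        }

        /** Polling: true if at least the requested number of items has been collected. */
        synchronized boolean isDataAvailable(int numberOfDatum) {
            return items.size() >= numberOfDatum;
        }

        /** Blocking: waits until enough items exist or the query has ended. */
        synchronized boolean waitUntilDataIsAvailable(int numberOfDatum) throws InterruptedException {
            while (items.size() < numberOfDatum && status == Status.WORKING) {
                wait();
            }
            return items.size() >= numberOfDatum;
        }

        synchronized Status getStatus() {
            return status;
        }

        /** Returns a snapshot of the items collected so far. */
        synchronized List<T> getResults() {
            return new ArrayList<T>(items);
        }
    }
}
```

A caller would typically hand the QueryResults object to a background job that calls add() and finish(), then either poll with isDataAvailable() or block with waitUntilDataIsAvailable(), which returns false if the query ends before enough items arrive.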