This plan lays out the work areas required to re architect the current Orion Java server for very large scale deployment. This is not all work we will be able to scope within a single release, but is intended to provide a roadmap of the work required over the coming releases to make this happen. Work items are roughly sorted by order of priority.

Assessment of current state

Much of the current Orion Java server is designed to enable scalability. There are a number of problem areas detailed in the next section, but the current server has some existing strengths:

Nearly stateless. With the exception of Java session state for authentication, and some metadata caching for performance, the current server is stateless. The server can be killed and restarted at any time with no data loss. All data is persisted to disk at the end of each HTTP request, with the exception of asynchronous long running operations that occur in a background thread and span multiple HTTP requests.
Fast startup. The current server starts within 2-3 seconds, most of which is Java and OSGi startup. No data is loaded until client requests start to arrive. The exception is the background search indexer, which starts in a background thread immediately and starts crawling. If the search index has been deleted (common practice on server upgraded) it can take several minutes for it to complete indexing.
Externalized configuration. Most server configuration is centrally managed in the orion.conf file. There are a small number of properties specified in orion.ini as well. Generally server configuration can be managed without code changes.
Scalable core technologies. All of the core technologies we are using on the server are known to be capable of significant scaling: Equinox, Jetty, JGit, and Apache Solr. In some cases we will need to change how these services are configured but nothing we need to throw away and start from scratch. Solr doesn't handle search crawling but we could use complementary packages like Apache Nutch for crawling if needed.

Work areas for scalability

Metadata backing store

As of Orion 2.0, the server metadata is stored in Equinox preferences. There are four preference trees for the various types of metadata: Users, Workspaces, Projects, and Sites. These are currently persisted in four files, one per preference tree. There are a number of problems with this implementation:

Storing large amounts of metadata in a single file that needs to be read on most requests is a severe bottleneck.
Migration of metadata format across builds/releases is ad-hoc and error prone
Individual preference node access/storage is thread safe, but many Orion API calls require multiple read/writes to multiple nodes. There is no way to synchronize access/update across nodes.
Metadata file format does not lend itself to administrative tools for managing users, migrating, etc.
There is no clean API to the backing store and the interaction between the request handling services and the backing store is tangled.

We need a clean backing store API, and a scalable implementation of that API. The actual backing storage should be pluggable

Notice: this Wiki will be going read only early in 2024 and edits will no longer be possible. Please see: https://gitlab.eclipse.org/eclipsefdn/helpdesk/-/wikis/Wiki-shutdown-plan for the plan.

Orion/Server scalability

Contents

Assessment of current state

Work areas for scalability

Metadata backing store

Move search engine/indexer to separate process(es)

Move long running Git operations out of process

Move Java session state out of process

Externalize log configuration

Search for implicit assumptions about system tools (exec calls)

Breadcrumbs

Notice: this Wiki will be going read only early in 2024 and edits will no longer be possible. Please see: https://gitlab.eclipse.org/eclipsefdn/helpdesk/-/wikis/Wiki-shutdown-plan for the plan.

Orion/Server scalability

Contents

Assessment of current state

Work areas for scalability

Metadata backing store

Move search engine/indexer to separate process(es)

Move long running Git operations out of process

Move Java session state out of process

Externalize log configuration

Search for implicit assumptions about system tools (exec calls)