Orion/Search Plan

Revision as of 11:45, 27 August 2014 by John arthorne.ca.ibm.com

This page is a design document for the search component in the Orion project.

Current State

Orion currently has two search implementations, each with distinct advantages and drawbacks. They are used for different sets of end user search capabilities exposed in the user interface.

Crawler

The first implementation is a client-side search written in JavaScript. It does not use an index, but traverses the entire document space on each search. It is highly accurate, and supports both case-sensitive and case-insensitive search, as well as regular expression search. Since it runs on the client, it is agnostic of server implementation, so it works equally well with the Java and JavaScript servers. On the downside it is extremely slow, both because the content to be searched has to be transferred across the network to the client, and because it exhaustively examines every document on every search.
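The crawl described above can be sketched roughly as follows. This is a minimal illustration, not the actual Orion crawler: the `workspace` object is a hypothetical in-memory stand-in for files that a real client would have to fetch over the network, and the `buildMatcher` and `crawl` names and the `caseSensitive` and `regex` options are assumptions made for this example.

```javascript
// Hypothetical in-memory stand-in for a workspace. A real client would
// fetch each file's contents from the server, which is the costly part.
const workspace = {
  name: "root",
  children: [
    { name: "README.md", contents: "Orion Search Plan\nsearch is slow" },
    {
      name: "src",
      children: [
        { name: "crawler.js", contents: "function Search() {}\n// search files" },
      ],
    },
  ],
};

// Build a matcher from the query and options (option names are assumptions).
function buildMatcher(query, { caseSensitive = false, regex = false } = {}) {
  // Escape regex metacharacters when the query is a plain string.
  const source = regex ? query : query.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp(source, caseSensitive ? "" : "i");
}

// Visit every file on every search. There is no index, so the cost of a
// search grows with the size of the workspace.
function crawl(node, matcher, path = "", hits = []) {
  const fullPath = path ? `${path}/${node.name}` : node.name;
  if (node.children) {
    for (const child of node.children) crawl(child, matcher, fullPath, hits);
  } else {
    node.contents.split("\n").forEach((line, i) => {
      if (matcher.test(line)) hits.push({ path: fullPath, line: i + 1 });
    });
  }
  return hits;
}
```

For example, `crawl(workspace, buildMatcher("Search", { caseSensitive: true }))` reports only the lines containing the capitalized word, while the default options match regardless of case. Because the matcher sees the original text of each line, accuracy is not limited by how an index was built.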

Indexed Search

The second search implementation runs on the Java server using Apache Lucene/Solr. It is index based, making it very fast even over large document spaces. It supports case-sensitive search on filenames, but only case-insensitive search of file content. Lucene can support both, but doing so doubles the size of the search index. Currently the indexer runs periodically and has no smarts about which files have changed. This means there can be a significant lag between a change to a file and the corresponding update to the index (several minutes, or even an hour on a very large workspace such as OrionHub with 25,000 users).
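To illustrate why an index makes lookups fast, and why a single index supports only one case mode, here is a minimal inverted index in plain JavaScript. It is a sketch for illustration only: it does not use or mirror the Lucene/Solr API, and the `tokenize`, `buildIndex`, and `search` names and the sample `files` object are assumptions made for this example.

```javascript
// Split text into lower-cased word tokens. Lower-casing makes lookups
// case-insensitive, but it also discards the original case, so a
// case-sensitive search would need a second index over unmodified tokens.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9_]+/g) || [];
}

// Build a map from each token to the set of file paths that contain it.
function buildIndex(files) {
  const index = new Map();
  for (const [path, contents] of Object.entries(files)) {
    for (const token of tokenize(contents)) {
      if (!index.has(token)) index.set(token, new Set());
      index.get(token).add(path);
    }
  }
  return index;
}

// A lookup is a single map access; no file contents are read at search time.
function search(index, term) {
  return [...(index.get(term.toLowerCase()) || [])].sort();
}

// Hypothetical sample content.
const files = {
  "a.js": "var Search = require('search');",
  "b.js": "function render() {}",
};
const index = buildIndex(files);
```

In this sketch, changing `files` after `buildIndex` has run does not change what `search` returns until the index is rebuilt. That is the same kind of lag described above for a periodic indexer that does not know which files have changed.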
