Code Recommenders uses model archives to deliver its contents to the clients. A model archive is essentially a zip file that contains recommendation models for a certain library.
These model archives are organized in so-called model repositories. A model repository is essentially a Maven repository which stores the model archives with a specific classifier.
For example, all Code Recommenders' call completion models use the "call" classifier, i.e., a model for the SWT library is stored in a Maven repository under $M2_REPO/org/eclipse/swt/org.eclipse.swt/3.0.0/org.eclipse.swt-calls-3.0.0.zip.
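The mapping from Maven coordinates to the archive location can be sketched as follows. This is only an illustration: the class and method names are hypothetical, and the file-name order (artifactId, classifier, version) simply copies the SWT example above, whereas the standard Maven layout places the classifier after the version.

```java
final class ModelArchivePath {

    // Hypothetical helper: builds the repository-relative path of a model archive.
    // Assumption: the file name follows the SWT example in the text
    // (artifactId-classifier-version.zip), not the standard Maven order.
    static String relativePath(String groupId, String artifactId, String version, String classifier) {
        // Maven turns the dots of the groupId into directory separators.
        String groupDir = groupId.replace('.', '/');
        String fileName = artifactId + "-" + classifier + "-" + version + ".zip";
        return groupDir + "/" + artifactId + "/" + version + "/" + fileName;
    }

    public static void main(String[] args) {
        // Prints the path from the SWT example, relative to $M2_REPO.
        System.out.println(relativePath("org.eclipse.swt", "org.eclipse.swt", "3.0.0", "calls"));
    }
}
```

Because the path is derived purely from the coordinates, any static web server that exposes this directory tree can act as a model repository.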
Hosting a model repository is simple. Because Code Recommenders uses a Maven repository to distribute its model archives, these repositories can be served from any web server and can therefore be mirrored easily across multiple servers.
The repository can be searched using a local search index that gets downloaded on first startup and updated (if changed since last access) on every restart. This is similar to the way M2E assists its users by downloading the search index of Maven Central.
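The "download once, update only if changed" rule might look roughly like the check below. It is a sketch under assumptions: the class name is hypothetical, and how the remote timestamp is obtained (for instance from an HTTP Last-Modified header) is not specified in this document.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Instant;

final class SearchIndexCheck {

    // Returns true when the local index copy is missing or older than the remote one.
    // Assumption: the caller already knows the remote index's last-modified time.
    static boolean needsDownload(Path localIndex, Instant remoteLastModified) throws IOException {
        if (!Files.exists(localIndex)) {
            return true; // first startup: nothing has been downloaded yet
        }
        Instant localLastModified = Files.getLastModifiedTime(localIndex).toInstant();
        // Restart: only fetch again if the remote index changed after our copy was written.
        return remoteLastModified.isAfter(localLastModified);
    }
}
```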
Requirements for Hosting a Server
Model Archive Disk Size: An individual model archive is created for each library. A model archive may support a version range of a library (e.g., [3.0.0,4.0.0)). Thus, a single model archive may serve SWT 3.7 and 3.8 on any platform, such as OSX, Linux or Win32. The size of a model archive is not limited in any way but typically does not exceed 2MB. For example, when generating call completion models for Eclipse APIs from Eclipse M6 classic, the size of all generated model archives amounts to ~50MB. Generating models for the complete Juno release train may result in several hundred MBs up to a few GBs of model archives.
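To illustrate how a range such as [3.0.0,4.0.0) covers several library versions, here is a minimal sketch of a range check. The class is hypothetical and handles only purely numeric, fully bounded ranges; it is not the matching logic Code Recommenders itself uses.

```java
final class VersionRange {
    private final int[] lower;
    private final int[] upper;
    private final boolean lowerInclusive;
    private final boolean upperInclusive;

    private VersionRange(int[] lower, int[] upper, boolean lowerInclusive, boolean upperInclusive) {
        this.lower = lower;
        this.upper = upper;
        this.lowerInclusive = lowerInclusive;
        this.upperInclusive = upperInclusive;
    }

    // Parses ranges such as "[3.0.0,4.0.0)": '[' and ']' are inclusive, '(' and ')' exclusive.
    static VersionRange parse(String spec) {
        String s = spec.trim();
        boolean lowerInclusive = s.charAt(0) == '[';
        boolean upperInclusive = s.charAt(s.length() - 1) == ']';
        String[] bounds = s.substring(1, s.length() - 1).split(",");
        if (bounds.length != 2) {
            throw new IllegalArgumentException("Expected two bounds: " + spec);
        }
        return new VersionRange(parseVersion(bounds[0]), parseVersion(bounds[1]), lowerInclusive, upperInclusive);
    }

    // Splits "3.7" or "3.7.0" into its numeric segments.
    private static int[] parseVersion(String version) {
        String[] parts = version.trim().split("\\.");
        int[] numbers = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            numbers[i] = Integer.parseInt(parts[i]);
        }
        return numbers;
    }

    // Compares segment by segment; missing segments count as 0, so "3.7" equals "3.7.0".
    private static int compare(int[] a, int[] b) {
        int n = Math.max(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = i < a.length ? a[i] : 0;
            int y = i < b.length ? b[i] : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    boolean contains(String version) {
        int[] v = parseVersion(version);
        int lo = compare(v, lower);
        int hi = compare(v, upper);
        boolean aboveLower = lowerInclusive ? lo >= 0 : lo > 0;
        boolean belowUpper = upperInclusive ? hi <= 0 : hi < 0;
        return aboveLower && belowUpper;
    }
}
```

With this sketch, `VersionRange.parse("[3.0.0,4.0.0)")` contains 3.7 and 3.8 but not 4.0.0, which is why one archive can serve both SWT releases.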
Network Bandwidth: Model archives are downloaded on demand, i.e., as soon as a user triggers code completion on a variable of a supported type or selects a supported type with an open Extdoc View. If neither Code Recommenders' intelligent code completion nor its Extdoc View is used, no models will be downloaded. In addition, this auto-download feature can be disabled (planned; not implemented in M7 yet). Since all models are stored in a local Maven repository, each model is downloaded only once per workspace.
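The "download on demand, but only once" behaviour can be sketched as a lookup that consults the local repository before going to the network. All names here are hypothetical, and the `Downloader` callback stands in for whatever transport the client actually uses.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class LocalModelCache {

    // Hypothetical callback that fetches an archive from the remote model repository.
    interface Downloader {
        byte[] fetch(String relativePath) throws IOException;
    }

    private final Path localRepository;
    private final Downloader downloader;

    LocalModelCache(Path localRepository, Downloader downloader) {
        this.localRepository = localRepository;
        this.downloader = downloader;
    }

    // Returns the local archive file, downloading it only if it is not stored yet.
    Path resolve(String relativePath) throws IOException {
        Path target = localRepository.resolve(relativePath);
        if (Files.exists(target)) {
            return target; // already in the local repository: no network traffic
        }
        Files.createDirectories(target.getParent());
        Files.write(target, downloader.fetch(relativePath));
        return target;
    }
}
```

Calling `resolve` twice for the same path triggers the downloader only on the first call, so the load on the hosting server scales with the number of distinct models a workspace uses rather than with the number of completions triggered.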
Hosting Recommendation Models for Juno Release
In the long run, the Code Recommenders team would like to host a canonical model repository for any open source library, similar to Maven Central (http://repo1.maven.org/maven2/). As Eclipse statutes do not allow any software to refer to external non-Eclipse URLs without permission, we see two good solutions (there may be more, but at the moment I see two good ones):
1. eclipse.org solution: Recommenders delivers recommendation models from the Eclipse download server, e.g., from http://download.eclipse.org/recommenders/models/juno/. A more advanced solution may be to set up a virtual server for Code Recommenders where models are delivered from (e.g., http://recommenders.eclipse.org/repository/juno/) - but this is not needed for Juno.
2. coderecommenders.org solution: In the long run, we also want to enable users to share anonymized usage data with each other to reach the ultimate goal of IDE 2.0 as described here. As pointed out by Wayne and others, sharing private usage data with a central server (regardless of where it's hosted) is something that should not be taken lightly. Many privacy concerns may arise and, if done wrong, may have a negative impact on the Eclipse trademark and on Eclipse Code Recommenders as well.
Thus, we think that building and maintaining such advanced services that enable users to share usage data at Eclipse.org is not the way to go, given the current state of the project and the amount of work needed to convince users that it will be a good thing. Time will show whether people want to participate in such systems. We need not tackle the largest and hardest barrier first; what matters first is to prove that Code Recommenders in general is a great and effective concept.
However, we'd like to provide such services to learn how far we get with these advanced IDE 2.0 services. For the reasons given above, we think that the tools we develop(ed) should not be tightly coupled to the Eclipse project but should live in an external project, which may also give us the opportunity to build commercial services around them, as Tasktop successfully showed.
Summarizing the two solutions (hosted at eclipse.org vs. hosted at coderecommenders.org): for the Juno release we'd prefer to host a model repository on the Eclipse infrastructure at, e.g., http://download.eclipse.org/recommenders/models/juno/. In the long run, however, we'd like to move to a more central server like coderecommenders.org (as Sonatype did with Maven Central).
Data Privacy in Juno Release
To state it clearly somewhere: For Juno, Code Recommenders only downloads a model index and model archives; it does not upload any usage data.