PTP/designs/remote/sync final

Revision as of 11:40, 9 October 2012 by G.watson.computer.org (Talk | contribs) (Preprocessing & Indexing)

Introduction

This document describes the design of the Remote Project Synchronization framework. It is based on the original proposal by Roland Schulz and others.

Eclipse projects (including C/C++ projects) are traditionally stored on the local filesystem of the machine on which Eclipse is running. This is adequate for Java, since the bytecode is inherently portable, and for embedded development, since a cross-compiler environment is typically employed. However, it does not suit the development of HPC applications because it is generally very difficult to replicate the environment of the target system on the local machine. To overcome this limitation, the Remote Development Tools (RDT) component of PTP added support for remote projects. However, the approach taken has a number of serious limitations that preclude it from being used to develop many applications. Although the functionality of RDT has improved, the main issue that remains outstanding is that it only supports C/C++ projects and is not a general solution that can be used for other languages and project types.

Remote Project Synchronization is an alternative to RDT that overcomes many of RDT's inherent limitations. Synchronized projects work by maintaining both a local and a remote copy of the source code, and these two copies are kept in synchronization by Eclipse. The advantage of this approach is that Eclipse is able to operate on the local copy as normal, and does not need to be concerned with network delays or other issues. No special changes to the project infrastructure are required, since the project just looks like a normal local project. The remote copy of the code is maintained so that the application can be built in the environment present on the target system without needing to copy the entire source tree each time. This allows any native compilers or libraries that are present to be utilized without the need for this environment to be replicated on the local machine.

The three different project types are shown below.

Figure: Project types (Project types.png)

In the local project, the primary Eclipse services, such as editing, indexing, searching, navigating, and building, operate directly on the local project, which is located on the local filesystem of the machine running Eclipse. For remote projects, these services are proxied by an agent running on the remote machine, which must provide remote equivalents of the services Eclipse uses locally. Because the index, search, and navigation services tend to be very language specific, a different agent is required for each language that is to be supported. Currently an agent is only available for C and C++. In the case of synchronized projects, editing, indexing, etc. operate on the local copy of the project. Only the project build needs to be run on the target machine. Because this service is more generic than the language services, it is available for more languages (currently C, C++, and Fortran). To support other languages, such as Python or Java, a remote version of their builders would also need to be provided. This is a much simpler task than adding services for a different language.

Design

A number of principles underpin the design of the synchronization framework. These provide goals that, although they may not be met initially, the implementation can work towards in future revisions. These principles are as follows:

  • Synchronized projects should not interfere with existing project natures; that is, any kind of project should be synchronizable. While this goal is desirable, the current implementation only supports synchronization of CDT projects (C, C++, UPC, and Fortran) for a number of reasons that will be discussed in more detail below. Our plan, however, is to eventually support any project nature.
  • The use of synchronized projects should be transparent to the user. This is largely the case for most user interface operations from within Eclipse. However, the project is ultimately located on a remote machine, so there is some degree of configuration that is required to set this up. Also, building the project occurs remotely, so the user generally needs to be aware that this is occurring.
  • Synchronized projects should be independent of Team support. Although there is a temptation to use synchronization for sharing project code, synchronization support is really orthogonal to Team support since it does not provide the rich feature set of a revision control system. Synchronization is primarily for keeping two copies of code in sync. In the current implementation, synchronized projects can also be shared with any of the Team providers that are available in Eclipse.

The synchronization framework is designed to be extensible, so that different synchronization providers can be provided to suit different environments. This is achieved using the services framework provided by PTP. To add a new synchronization provider, a plugin must supply a class that implements the ISyncServiceProvider interface, then register this class using the org.eclipse.ptp.services.core.providers extension point. The services framework will then manage loading the class at the correct time, persisting data, etc.

Project Nature

When a synchronized project is created, or an existing project is converted to synchronized, a remoteSyncNature is added to the project configuration. Only projects with this nature will be recognized by the synchronization framework. Projects that have this nature are decorated in the Project Explorer view to indicate that they are synchronized.

Synchronization Initiation

The synchronization framework registers a resource change listener for all resources in the workspace. Synchronization will be initiated for any resource delta that is of type IResourceDelta.CHANGED and after an event of type IResourceChangeEvent.POST_BUILD. The first synchronization will ensure that any resources that have been added, modified, or deleted in the local workspace will be reflected on the target system. The second synchronization will ensure that any artifacts that are generated as a result of the build will be copied to the local workspace. Only builds that were started manually will result in a synchronization, however, since automatic builds will occur after any resource change.
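The trigger rules above can be sketched without any Eclipse dependencies. This is only an illustrative model: the Trigger and SyncAction enums and the SyncTrigger class are hypothetical stand-ins, not the Eclipse IResourceDelta or IResourceChangeEvent types, and the direction labels only reflect the purpose stated for each synchronization.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for the two triggers described above; not the Eclipse types. */
enum Trigger { RESOURCE_CHANGED, POST_BUILD }

/** Purpose of a synchronization run (an assumption made for illustration). */
enum SyncAction { LOCAL_TO_REMOTE, REMOTE_TO_LOCAL }

/** Hypothetical helper that decides whether a workspace event starts a sync. */
class SyncTrigger {
    private final List<SyncAction> actions = new ArrayList<>();

    /**
     * @param trigger     what happened in the workspace
     * @param manualBuild true if the user started the build
     *                    (only meaningful for POST_BUILD)
     */
    void onEvent(Trigger trigger, boolean manualBuild) {
        if (trigger == Trigger.RESOURCE_CHANGED) {
            // Local additions, edits, and deletions go to the target system.
            actions.add(SyncAction.LOCAL_TO_REMOTE);
        } else if (trigger == Trigger.POST_BUILD && manualBuild) {
            // Build artifacts come back to the workspace. Automatic builds
            // are skipped because they follow every resource change.
            actions.add(SyncAction.REMOTE_TO_LOCAL);
        }
    }

    /** Synchronizations requested so far, in order. */
    List<SyncAction> actions() {
        return actions;
    }
}
```

In this model a POST_BUILD event from an automatic build records nothing, which mirrors the rule that only manually started builds result in a synchronization.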

In addition to synchronizing on a resource change, projects will also be synchronized prior to a build. For CDT projects, the framework registers a remote builder using the org.eclipse.cdt.managedbuilder.core.buildDefinitions extension point, and adds this builder to the properties for the project. When the project is built, the ICommandLauncher#execute() method is invoked by the CDT managed build framework. PTP provides an implementation of ICommandLauncher called SyncCommandLauncher that will first initiate a synchronization, then issue the appropriate remote commands to build the project.
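The "synchronize first, then build remotely" ordering can be illustrated with a small wrapper. The Launcher interface below is a deliberately simplified stand-in and does not reproduce the signature of CDT's ICommandLauncher; SyncFirstLauncher is likewise a hypothetical name, not the actual SyncCommandLauncher.

```java
import java.util.List;

/** Simplified stand-in for a command launcher; not CDT's ICommandLauncher. */
interface Launcher {
    /** Runs a command and returns its exit code. */
    int execute(String command, List<String> args);
}

/** Hypothetical launcher that synchronizes before delegating the build. */
class SyncFirstLauncher implements Launcher {
    private final Runnable synchronizer;    // brings the remote copy up to date
    private final Launcher remoteLauncher;  // runs the command on the target

    SyncFirstLauncher(Runnable synchronizer, Launcher remoteLauncher) {
        this.synchronizer = synchronizer;
        this.remoteLauncher = remoteLauncher;
    }

    @Override
    public int execute(String command, List<String> args) {
        // Step 1: make sure the target sees the latest local sources.
        synchronizer.run();
        // Step 2: issue the build command on the remote system.
        return remoteLauncher.execute(command, args);
    }
}
```

Wrapping the remote launcher this way keeps the build command itself unchanged, so the same approach would apply to any command that must see up-to-date sources.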

Multiple Hosts

Synchronization to multiple hosts is supported using the BuildScenario class. An object of this class is passed to the ISyncServiceProvider#synchronize() method along with the project reference. A BuildScenario encapsulates the connection and location information for the project on the remote target system. By maintaining multiple BuildScenarios, it is possible to perform synchronization to multiple systems.

For CDT projects, the BuildScenario information is stored in the project build configuration (IConfiguration interface). The framework provides the BuildConfigurationManager class for managing the interface between the synchronization framework and CDT's build configurations.

CDT provides the notion of an "active" build configuration, so the synchronization framework uses this as the default configuration when synchronizing the project. The user interface also provides a means of synchronizing all build configurations, which will cause the project to be synchronized with each target system that has been configured.
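The relationship between build configurations, build scenarios, and the provider can be modeled as follows. This is a sketch under assumptions: Scenario, Provider, and ScenarioManager are hypothetical stand-ins for BuildScenario, ISyncServiceProvider, and BuildConfigurationManager, and their fields and method signatures are invented for illustration. Treating the first configuration added as the active one is also an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Stand-in for a build scenario: a connection plus a remote location. */
final class Scenario {
    final String connectionName;
    final String remoteLocation;

    Scenario(String connectionName, String remoteLocation) {
        this.connectionName = connectionName;
        this.remoteLocation = remoteLocation;
    }
}

/** Stand-in for a provider that synchronizes a project with one scenario. */
interface Provider {
    void synchronize(String projectName, Scenario scenario);
}

/** Hypothetical manager mapping build configuration names to scenarios. */
class ScenarioManager {
    private final Map<String, Scenario> byConfig = new LinkedHashMap<>();
    private String activeConfig;

    void put(String configName, Scenario scenario) {
        byConfig.put(configName, scenario);
        if (activeConfig == null) {
            activeConfig = configName; // assumption: first one added is active
        }
    }

    void setActive(String configName) {
        if (!byConfig.containsKey(configName)) {
            throw new IllegalArgumentException("Unknown configuration: " + configName);
        }
        activeConfig = configName;
    }

    /** Default behavior: synchronize with the active configuration only. */
    void syncActive(String projectName, Provider provider) {
        if (activeConfig == null) {
            throw new IllegalStateException("No build configuration defined");
        }
        provider.synchronize(projectName, byConfig.get(activeConfig));
    }

    /** Synchronize with every configured target system, in insertion order. */
    void syncAll(String projectName, Provider provider) {
        for (Scenario scenario : byConfig.values()) {
            provider.synchronize(projectName, scenario);
        }
    }
}
```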

Off-line Operation

Filtering

Synchronization Providers

Currently the only mature synchronization provider uses the Git protocol to manage synchronization. Some work has been undertaken to develop a provider based on Rsync, but to date this has not been completed. The Git provider uses the JGit feature to provide a Java implementation of Git for Eclipse. Git also has the advantage of being very fast, even for a large number of small commits.

Scanner Discovery & Include Files

Many of the advanced features of CDT and Photran require indexing/parsing the source code. When a project is being built remotely, the source code presented to the user should reflect the environment on the remote system rather than the local system. The main mechanisms that distinguish a remote environment from the local environment are the macros that are predefined by the compiler and the system include files that are included in the source code.

Note that this support is independent of remote synchronization (although it may use some remote synchronization functionality to achieve it). A remotely synchronized project can still use a purely local environment without requiring any additional functionality.

To present an accurate reflection of the remote environment, system include files need to be fetched from the remote system for the preprocessor and indexer. This requires changes to the preprocessor and other parts of CDT/Photran in order to obtain the header file from the correct remote system rather than from the local system. To avoid performance issues, header files should also be cached locally where possible, but this must be an option as licensing issues may prevent it on some systems.
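One way to picture the optional local caching of remote headers is the sketch below. RemoteHeaderCache and its constructor parameters are hypothetical; the fetch function stands in for whatever mechanism retrieves a header from the remote system, and nothing here reflects actual CDT or Photran code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical cache for system headers fetched from the target system. */
class RemoteHeaderCache {
    private final Function<String, String> remoteFetch; // remote path -> contents
    private final boolean cachingAllowed;               // false where licensing forbids local copies
    private final Map<String, String> cache = new HashMap<>();

    RemoteHeaderCache(Function<String, String> remoteFetch, boolean cachingAllowed) {
        this.remoteFetch = remoteFetch;
        this.cachingAllowed = cachingAllowed;
    }

    /** Returns the header contents, fetching remotely only when necessary. */
    String get(String remotePath) {
        if (!cachingAllowed) {
            // Always go to the remote system; nothing is kept locally.
            return remoteFetch.apply(remotePath);
        }
        // Fetch once, then serve later requests from the local copy.
        return cache.computeIfAbsent(remotePath, remoteFetch);
    }
}
```

With caching enabled, repeated requests for the same header reach the remote system only once; with it disabled, every request does, which is the performance cost of honoring a licensing restriction.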

Compiler (and makefile) defined macros play an important part in determining which header files will be included as well as which parts of the code will be enabled or disabled. CDT attempts to determine these macros using a process known as scanner discovery. This involves running commands on the target system to determine the macros that are defined. This process is inherently complex because every system has different compilers with different options for determining this information. RDT has already provided some remote scanner discovery functionality, so we plan to reuse this for synchronized projects.

Support for other Remote Tools besides Build

TBD

Milestones

  • Check feasibility of remote include path support in CDT (see #Preprocessing & Indexing)
  • Define a new Synchronization service type (which adds synchronization/replication to the running EFS). It would have guaranteeSynchronized as a public method. The default server (for a purely local project or for remotetools/RSE) would do nothing.
  • Add a call to guaranteeSynchronized to all remote operations (compile, remote index, etc.)
  • Implement a Git based synchronization service (doing the Git push in the guaranteeSynchronized call)
  • Add the GUI to configure the synchronization service (including new project wizard)

Later, an EFS could do the asynchronous Git push after a file modification (e.g. save), and guaranteeSynchronized would then just wait for the push to finish.
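The asynchronous variant described above could look roughly like the following. This is only a sketch of the idea: AsyncSyncService is a hypothetical class, the push is an arbitrary Runnable rather than a real Git operation, and the guaranteeSynchronized method here stands in for the proposed call rather than any existing API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical service that pushes in the background after each change. */
class AsyncSyncService {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final Runnable push; // stand-in for the actual push operation
    private CompletableFuture<Void> pending = CompletableFuture.completedFuture(null);

    AsyncSyncService(Runnable push) {
        this.push = push;
    }

    /** Called after a file modification; queues a push and returns at once. */
    synchronized void fileModified() {
        // Chaining keeps pushes in order on the single worker thread.
        pending = pending.thenRunAsync(push, worker);
    }

    /** Blocks until every push queued so far has finished. */
    void guaranteeSynchronized() {
        CompletableFuture<Void> last;
        synchronized (this) {
            last = pending;
        }
        last.join(); // throws CompletionException if a push failed
    }

    /** Stops the worker thread once the service is no longer needed. */
    void shutdown() {
        worker.shutdown();
    }
}
```

A remote operation such as a build would call guaranteeSynchronized first and, in the common case where the background push has already completed, return without any noticeable delay.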

Timeline: TBD

Additional Features

Remote file view

To run the binary, it must be selected on the remote machine. For that, it would be nice to be able to browse the remote machine using RemoteTools/RSE. Preferably, the path shown is the path to the remote copy of the local project. It would also be nice to have a "Project Explorer" view for the files on the remote machines. This would allow viewing binaries, object files, and other remote files that are not part of the synchronization. This view should also use RemoteTools/RSE.

Build local and remote from same project

If the remote build is implemented as a builder that can be added to a standard CDT/Photran project, then it is possible to have both a local and a remote builder for the same project. CDT already supports several builder configurations, so these can be used to switch between the local builder and (potentially several) remote builders. It must be checked that the indexer (including the remote include files) is updated correctly when the builder configuration switches.
