Difference between revisions of "PTP/designs/remote/sync"

Revision as of 21:14, 22 November 2010

Introduction

This document describes the current state of the ideas and designs for synchronization-based file access for remote projects. The original version of this document was created by Roland Schulz based on discussions on bug 316709.

Rationale

Both RemoteTools and RSE use SFTP to access files when editing them. This has several disadvantages:

  • Files cannot be worked on when not connected to the internet
  • The responsiveness of the UI suffers
  • Not all PTP functions are supported using RemoteTools/RDT

The synchronization approach has different disadvantages (see below). Thus it is not meant to replace the current RemoteTools/RDT approach but to offer an alternative. This allows users to choose the approach whose advantages matter more in their working environment. This approach will also reuse parts of RDT (e.g. scanner/indexer).

Responsiveness

The Eclipse core and CDT do most file operations in the main thread (assuming that all file operations are low latency). This causes responsiveness problems with a remote file system.

Because the file operations run in the main thread, they block the GUI until the IO operation finishes and thus prevent the user from continuing to work while the IO operation is running. This also often prevents IO operations that could run in parallel from doing so. See bugs 160353, 177994, 195997, 218387, 219169 and wiki.eclipse.org/TM_and_RSE_FAQ from the RSE team regarding the same problem for RSE. There seems to be no workaround for this problem. While it seems possible in theory to improve it somewhat by using Display.readAndDispatch, this is not advised and has been removed from RSE (160353). Many consider a responsive UI extremely important, so this is an important point.

It is very unlikely, at least in the medium term (meaning the next Eclipse release in 2011), that both Eclipse Core and CDT will move all file operations into threads and hide latency by doing IO operations in parallel. Therefore a different approach is needed to provide a performant remote IO method.

Disadvantages of Synchronizing approach

  • The entire project must be copied to the local machine. This only happens once, but could take a very long time for large projects or slow connections.
  • Local indexing is problematic as the local environment will be different from the remote environment, so macros and includes will be incorrect. Running scanner discovery remotely seems to be the obvious way to solve the macro problem, but scanner discovery is hopelessly broken and not even the CDT people seem to know how it works. In addition, the indexer would need to be modified to copy system and library includes from the remote machine as part of the indexing.
  • Some activities, such as building, will always need to be done remotely, so the performance problems will always be evident to some degree.

Similar/Prior efforts

Within Eclipse

RSE

Phortran

Outside from Eclipse

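Synchronization using GIT

  • website synchronization (http://www.turnkeylinux.org/blog/website-synchronization)
  • php web deployment using git (http://www.codebork.com/coding/2010/06/03/php-web-deployment-using-git.html)
  • 5 tips for deploying sites (http://insideria.com/2009/12/5-tips-for-deploying-sites.html)
  • update website with a single command git push (http://stackoverflow.com/questions/883878/update-website-with-a-single-command-git-push-instead-of-ftp-drag-and-dropping)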

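Remote file systems

  • http://www.xtreemfs.org/
  • http://offlinefs.sourceforge.net/wiki/
  • http://www.microsoft.com/windowsserversystem/dfs/default.mspx
  • http://www.eetimes.com/electronics-news/4144653/Enabling-File-Sharing-over-the-WAN
  • http://sector.sourceforge.net/
  • http://www.coda.cs.cmu.edu/
  • http://www.cuteftp.com/wafs/
  • http://www.riverbed.com/products/compare/wafs.php
  • http://portal.acm.org/citation.cfm?id=844128.844131
  • http://userweb.cs.utexas.edu/users/dahlin/papers/FINAL-PRACTI-NSDI.pdf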

Options for when the synchronization is done

  • Synchronizing before any remote operation (build, remote index, ...)
  • Synchronizing after each save

The second option shouldn't wait for the sync but should run it asynchronously. Otherwise the responsiveness problem (see above) wouldn't be addressed. Each remote operation would need to call a function that guarantees that all outstanding synchronization calls have finished. The same function would initiate the synchronization for option 1. Which option is better depends on the synchronization back-end and the user preferences, so it should simply be configurable.

Pro synchronizing after each save:

  • Required for auto-build and indexing on the server
  • Reduces time to build (because the project is already synced)

Contra:

  • Causes a larger repository (~2k per commit) and more traffic
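
The two options and the shared "guarantee" function can be sketched as follows. This is only a minimal illustration under assumptions: SyncCoordinator, Mode, fileSaved and ensureSynchronized are hypothetical names that exist neither in PTP nor in Eclipse, and the back-end is passed in as a plain Runnable standing for whatever actually transfers the files.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical coordinator; none of these names come from PTP or Eclipse. */
final class SyncCoordinator {
    enum Mode { BEFORE_REMOTE_OPERATION, AFTER_EACH_SAVE }

    private final Mode mode;
    private final Runnable backend; // the actual transfer, supplied by the caller
    // A single worker thread runs syncs in submission order, off the UI thread.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private CompletableFuture<Void> pending = CompletableFuture.completedFuture(null);

    SyncCoordinator(Mode mode, Runnable backend) {
        this.mode = mode;
        this.backend = backend;
    }

    /** Called from the save handler; returns immediately and never waits for the sync. */
    synchronized void fileSaved() {
        if (mode == Mode.AFTER_EACH_SAVE) {
            pending = CompletableFuture.runAsync(backend, worker);
        }
    }

    /** Called by every remote operation (build, remote index, ...) before it starts. */
    void ensureSynchronized() {
        CompletableFuture<Void> last;
        synchronized (this) {
            if (mode == Mode.BEFORE_REMOTE_OPERATION) {
                // Option 1: the same function initiates the synchronization.
                pending = CompletableFuture.runAsync(backend, worker);
            }
            last = pending;
        }
        // The worker is single-threaded, so once the last submitted sync is done,
        // all earlier ones are done as well.
        last.join();
    }

    void shutdown() {
        worker.shutdown();
    }
}
```

In this sketch the remote operation itself is expected to run in a background job, so that the blocking join() in ensureSynchronized never happens on the UI thread.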

Possible Back-ends

Advantages of Rsync

TBD

Disadvantages of Rsync

  • No Java implementation is available
  • The synchronization is only one-way

The latter is important if remote files get changed, either automatically or by the user. The one-way synchronization of rsync would usually not synchronize such changes back to the client and would not detect conflicts caused by changes on both sides very well.
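
To illustrate why two-way synchronization needs more than a comparison of the two current states, here is a minimal sketch that classifies each file using a third state: the content at the last synchronization. Everything in it is an assumption made for illustration (the class TwoWayCompare, the Action names, and the use of content hashes as strings); it does not describe how rsync or GIT work internally.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical helper; hashes are opaque strings, null means "file absent". */
final class TwoWayCompare {
    enum Action { IN_SYNC, PUSH_LOCAL, PULL_REMOTE, CONFLICT }

    /** Classify one file from its hash at the last sync (base) and on each side now. */
    static Action classify(String base, String local, String remote) {
        if (Objects.equals(local, remote)) return Action.IN_SYNC;     // both sides agree
        if (Objects.equals(base, local)) return Action.PULL_REMOTE;   // only the remote side changed
        if (Objects.equals(base, remote)) return Action.PUSH_LOCAL;   // only the local side changed
        return Action.CONFLICT;                                       // both changed, differently
    }

    /** Classify every path that appears in any of the three snapshots. */
    static Map<String, Action> plan(Map<String, String> base,
                                    Map<String, String> local,
                                    Map<String, String> remote) {
        Set<String> paths = new TreeSet<>(base.keySet());
        paths.addAll(local.keySet());
        paths.addAll(remote.keySet());
        Map<String, Action> result = new TreeMap<>();
        for (String path : paths) {
            result.put(path, classify(base.get(path), local.get(path), remote.get(path)));
        }
        return result;
    }
}
```

Without the base snapshot, a file that differs on the two sides cannot be told apart from a conflict, which is the limitation described above for a one-way tool.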

Advantages of GIT

  • It has a Java implementation (shipping with Helios)
  • It is known to be extremely fast (including the Java implementation)
  • It supports two-way synchronization

Of course, GIT is not meant as a synchronization tool (it is a DVCS), but it works extremely well as one. Using GIT for synchronization works both for users who also use it for version control and for users who use some other tool for version control. As an example, synchronizing a remote folder containing ~4000 files (~100 MB) with one changed file, where GIT is not told which file changed and detects file changes on both sides, takes less than one second over a remote (cable) connection. The performance is mainly limited by the file system for the tree traversal.

Implementation Issues with GIT

Pushing to a non-bare repository is discouraged

There are different options:

  • Fetch (not a good option because it requires SSHD on the client)
  • Push into the working branch with a post-update hook (http://utsl.gen.nz/git/post-update). Disadvantages: requires a stat of each file on the server (slow over NFS) and doesn't allow a merge on the client side
  • Push to a separate bare repository. Disadvantage: requires 2 repositories
  • Push to a remote branch (http://thread.gmane.org/gmane.comp.version-control.git/42506/focus=42685). Seems to be the best option
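
As a rough sketch of the last option, the helper below builds the two git command lines involved. It assumes that "push to a remote branch" means pushing into a ref under refs/remotes/ on the server (rather than into the checked-out branch) and then merging that ref on the server. The class RemoteBranchPush and the names "server", "client" and "master" are hypothetical and only serve the illustration.

```java
import java.util.List;

/** Hypothetical helper that only builds command lines; it does not run git. */
final class RemoteBranchPush {
    /**
     * Client side: push the local branch into refs/remotes/<clientName>/<branch>
     * on the server, which leaves the server's checked-out branch untouched.
     */
    static List<String> pushCommand(String remote, String branch, String clientName) {
        String refspec = "refs/heads/" + branch + ":refs/remotes/" + clientName + "/" + branch;
        return List.of("git", "push", remote, refspec);
    }

    /** Server side: merge the pushed ref into the currently checked-out branch. */
    static List<String> mergeCommand(String branch, String clientName) {
        return List.of("git", "merge", clientName + "/" + branch);
    }
}
```

For example, pushCommand("server", "master", "client") yields the arguments for `git push server refs/heads/master:refs/remotes/client/master`; either list could be handed to a ProcessBuilder or translated to the equivalent calls of a Java GIT library.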

Indexing

Local indexing should be supported. Should we also support remote indexing? Local indexing requires remote include files.

Support for other Remote Tools besides Build

TBD