Latest revision as of 10:55, 28 May 2009

EFS Notes and Designs

List of Authors:

   Chris Recoskie (recoskie@ca.ibm.com)
   Greg Watson (grw@us.ibm.com)
   James Blackburn (jamesblackburn@gmail.com)

Overview

Eclipse Filesystem (EFS) support was added to the Eclipse Platform in version 3.2, and the conversion of the platform was mostly completed in 3.3. However, CDT presents a much more complex and challenging environment for such a conversion, and to date there has been no concerted effort to carry it out. With interest in remote development environments increasing, there is now enough momentum to attempt to transition CDT to be fully EFS compliant. The aim of this document is to provide a central resource for this conversion process.

File store layers

   Eclipse IResource API
       |
       |
      EFS
       |
       |
   {underlying file system/store}

The IResource API provides the rudimentary locking and access to resources from Eclipse API consumers. The EFS API is used to populate the IResources. The {underlying file system/store} may be a local filesystem, remote filesystem, database etc.

Each of these layers presents a filesystem tree to the layer above. It's worth noting that the trees need not be equal. EFS might map a location URI to an arbitrary file-store path. IResource may map any IResource full_path to an arbitrary EFS locationURI -- aka Linked Resources.

Issues

EFS introduces some significant changes to the traditional notion of the workspace. The most important of these is that EFS abstracts away the notion that a particular physical filesystem is used to store data in the workspace. Pre-EFS, it was possible to operate directly on resources using Java classes such as java.io.File. With EFS, this relationship can no longer be assumed to hold: the resource may be contained in a zip file, or may even be physically located on a different computer.

Eclipse plugins that operate entirely within the confines of the Eclipse platform can overcome this issue relatively easily: simply use the IResource interfaces, or where direct file/directory operations are required, fetch the IFileStore/IFileInfo interfaces using the resource URI. For CDT, however, a large portion of the functionality requires interaction with tools that are completely external to the Eclipse environment. Unfortunately, many of these tools are used to interact with both workspace and non-workspace resources. This is further complicated because the workspace resources may not even exist on the local machine, but instead may be accessed from a remote system. In this case, some or all of the tools themselves may need to be executed remotely in order to gain access to the resources. Running tools remotely like this is beyond the scope of this document, but still needs to be taken into consideration during the EFS conversion process.

EFS Problems

  1. IPath <==> URI conversions are not always properly handled

Consider the following example:

  IPath path = new Path("c:", "/a/b/c");
  IFileStore f = EFS.getLocalFileSystem().getStore(path);

If you then print out f.toURI(), you get something like:

  file:/path/to/workspace/c:/a/b/c

  2. Lack of Windows path support
  • There is no device field in a URI. I.e., it's not legal to have c: in a URI.
  • There is no notion of root on Windows.
  How do you distinguish between the following?
    • the full path C:\a, which becomes file://c/a
    • the relative path c/a, which then also becomes file://c/a
  It might be possible to handle this by forcing the latter to be file://./c/a, but neither URI nor EFS enforces anything like this.
  3. URIs lose OS-specific information.

Since a URI is just a string, it doesn't really store information about where it came from. For example, say you have the URI file://a/b. Is that a UNIX-style path corresponding to /a/b, or was it a Windows path corresponding to a:\b? You just don't know.
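
The ambiguity can be seen with nothing more than the standard java.net.URI class. The following is a minimal sketch, not part of EFS or CDT; the class and method names are hypothetical. It parses the two URIs discussed above and shows that the parsed components carry no device field and no hint of the originating OS: the leading segment simply ends up as the authority.

```java
import java.net.URI;

// Hypothetical helper, for illustration only.
class UriAmbiguitySketch {

    // Summarise the components that java.net.URI extracts from a URI string.
    static String describe(String uriText) {
        URI uri = URI.create(uriText);
        return "scheme=" + uri.getScheme()
                + " authority=" + uri.getAuthority()
                + " path=" + uri.getPath();
    }
}
```

Calling describe("file://c/a") yields the same components whether the URI was derived from C:\a or from the relative path c/a, so the original form cannot be recovered from the URI alone.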


Implementation Assumptions

CDT depends heavily on external tool-chains. This involves:

  • Resolving IResources => File store/system locations. This is straightforward on local filesystems, and tractable for more general EFS location URIs.
  • Resolving the console output of external tools back to Eclipse IResources (for marker generation etc.). This is more problematic.

To make the above simpler, we currently have to assume that the filesystem tree presented by EFS is backed by a filesystem with a similar structure. This is obviously the case for a local filesystem, and is assumed to be true for a remote filesystem provider. Note that this isn't true for a more complex EFS provider (e.g. one backed by 'the cloud', a zip file, etc.) -- but we can't run external tools there anyway.

Building projects: IResources -> Command Line Arguments

In the first case, the managed builder (for example) generates paths relative to the build directory. It does this in three steps:

  • Find location URI of the build directory
  • Find location URI of the resource to be built
  • Compute the relative path from the build directory => the resource. Use this path in the Makefile.
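
The three steps above can be sketched with plain java.net.URI values. This is an illustration under stated assumptions, not the managed builder's actual code: the class and method names are hypothetical, both URIs are assumed to share the same scheme and host, and the scheme "remote" stands in for the placeholder remote_uri used elsewhere in this document (java.net.URI does not accept an underscore in a scheme name).

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
class RelativePathSketch {

    // Split the path component of a URI into its non-empty segments.
    static List<String> segments(URI uri) {
        List<String> out = new ArrayList<>();
        for (String s : uri.getPath().split("/")) {
            if (!s.isEmpty()) {
                out.add(s);
            }
        }
        return out;
    }

    // Compute the path from the build directory to the resource,
    // assuming both location URIs live on the same filesystem.
    static String relativePath(URI buildDir, URI resource) {
        List<String> from = segments(buildDir);
        List<String> to = segments(resource);

        // Count the leading segments the two locations have in common.
        int common = 0;
        while (common < from.size() && common < to.size()
                && from.get(common).equals(to.get(common))) {
            common++;
        }

        // One ".." per remaining build-directory segment, then descend.
        List<String> parts = new ArrayList<>();
        for (int i = common; i < from.size(); i++) {
            parts.add("..");
        }
        parts.addAll(to.subList(common, to.size()));
        return String.join("/", parts);
    }
}
```

Working on the URI path segments rather than on java.io.File keeps the computation independent of the local OS path conventions, which matters when the build directory is remote.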

Generating Markers: Tool output -> IResources

It's proposed that the Error Parser system, driven by ErrorParserManager, will move from IPaths to URIs to support the remote builder use case. We'll perform the Makefile generation steps in reverse:

  • Take build directory URI
  • Take relative file path extracted from the console output
  • Append relative path to base directory and resolve locationURI to IResource.

NB: we can't use IProject.findMember(IPath) or similar to resolve relative paths because of linked resources.

Imagine the resources:

    IResource                =>  location_uri    
  /project                   => remote_uri:/some/path/to/project
  /project/build_dir         => remote_uri:/some/path/to/project/build_dir
  /project/folder_foo/bar.c  => remote_uri:/some/other/path/bar.c

The Makefile will contain, and the build output will show, something like:

  gcc ../../../../other/path/bar.c -o folder_foo/bar.o

There is no IResource path "other/path/bar.c" in 'project'. Hence we need to resolve the computed locationURI using findFilesForLocation (or use ResourceLookup in CDT). NB: where "other/path/bar.c" is an unambiguous partial location, we can use ResourceLookup with the relative path.
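
The reverse resolution for this example can be sketched as follows. This is a simplified model, not the Error Parser implementation: the class and method names are hypothetical, a plain Map from location URI to workspace path stands in for the workspace's location lookup, the scheme "remote" replaces the placeholder remote_uri (which java.net.URI would reject because of the underscore), and the relative path is assumed to contain only URI-safe characters.

```java
import java.net.URI;
import java.util.Map;

// Hypothetical helper, for illustration only.
class MarkerResolutionSketch {

    // Append the relative path from the tool output to the build
    // directory URI and normalise away the ".." segments.
    static URI toLocation(URI buildDir, String relativePath) {
        String base = buildDir.toString();
        if (!base.endsWith("/")) {
            base += "/"; // resolve() treats the last segment as a file otherwise
        }
        return URI.create(base).resolve(relativePath);
    }

    // Stand-in for a lookup by location: returns the workspace path
    // registered for the location, or null if there is none.
    static String findResource(Map<URI, String> byLocation, URI location) {
        return byLocation.get(location);
    }
}
```

The lookup is keyed on the location rather than on the workspace path, which is why the linked resource /project/folder_foo/bar.c is found even though no workspace path ends in other/path/bar.c.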

Resolving absolute paths

What if the Makefile contains absolute paths?

Using the same resources:

    IResource                =>  location_uri    
  /project                   => remote_uri:/some/path/to/project
  /project/build_dir         => remote_uri:/some/path/to/project/build_dir
  /project/folder_foo/bar.c  => remote_uri:/some/other/path/bar.c

The Makefile might have something like:

  gcc /some/other/path/bar.c -o folder_foo/bar.o


Resolving /some/other/path/bar.c will fail in the Workspace. It's also not relative to any existing URI. However, we can resolve 'some/other/path/bar.c' as a relative path, which would unambiguously return the IResource '/project/folder_foo/bar.c'.

It's proposed that, in the case of no match, and a non-local filesystem EFS provider (for the project), we strip leading directories from the path to be resolved, until an IResource is found. i.e. we would try:

  some/other/path/bar.c ; other/path/bar.c ; path/bar.c ; bar.c

If nothing is found, or more than one entry is found, the marker is set on the project.
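
The proposed fallback can be sketched as below. This is one possible reading of the proposal, not CDT code: the class and method names are hypothetical, a Map from workspace path to location URI stands in for the workspace, matching is done on the tail of each location path, and the scheme "remote" replaces the placeholder remote_uri.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only.
class SuffixLookupSketch {

    // Workspace paths whose location path ends with the given relative path.
    static List<String> findBySuffix(Map<String, URI> locations, String relative) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, URI> e : locations.entrySet()) {
            if (e.getValue().getPath().endsWith("/" + relative)) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    // Strip leading directories until some resource matches. A unique
    // match wins; an ambiguous match or no match falls back to the project.
    static String resolve(Map<String, URI> locations, String toolPath, String projectPath) {
        String candidate = toolPath.startsWith("/") ? toolPath.substring(1) : toolPath;
        while (!candidate.isEmpty()) {
            List<String> hits = findBySuffix(locations, candidate);
            if (hits.size() == 1) {
                return hits.get(0);
            }
            if (hits.size() > 1) {
                return projectPath; // ambiguous: marker goes on the project
            }
            int slash = candidate.indexOf('/');
            if (slash < 0) {
                break; // bare file name did not match either
            }
            candidate = candidate.substring(slash + 1);
        }
        return projectPath;
    }
}
```

With the example resources above, resolve(locations, "/some/other/path/bar.c", "/project") matches on the first candidate and returns /project/folder_foo/bar.c; a path that matches nothing returns /project.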