E4/Resources/Work Areas

Revision as of 13:33, 14 November 2008 by Martin.oberhuber.windriver.com (Talk | contribs)

The scope of our work is naturally the functionality provided by Eclipse Resources today, but it also reaches into other areas. This page covers the implementation and design aspects of that work. See the E4/Resources/Requirements, Use-Cases and Goals for user-facing descriptions of what we want to achieve.

Support Resources as of Today

What are Resources Today?
See the Resources Wiki.
In short: FCADBS = Files, Containers, Attributes, Deltas, Builders, Synchronization.
Resources are an IDE concept. Clients expect them to model reality (the file system), which is important for external tools. It's probably not a good idea to try to generalize resources too much.
EFS is stateless. Resources add state.

  • Files (==Streams), Folders (==Containers), Groups of them (==Projects) and their attributes (timestamp, encoding, nature, markers, ...)
    • Project is a logical group of resources which shares some common attributes and (currently) provides model-specific (nature) interpretation. In Eclipse, projects are always disjoint with each other (except for linked resources == aliases), which enables some concurrency. Projects can reference each other (dependencies).
  • Caching of attributes (statcache) with explicit refresh, notifications and synchronization, implemented with high performance and low memory footprint.
  • Resource Delta Notifications (Important: MoveDeleteHook for Team) also for Markers
  • Support for locking and atomic operations
  • Hooks for Natures, Model Mappings and Builders. Do these really belong in Resources? Perhaps, in the interest of RCP, these should be split off? There are several copies of the resource delta mechanism (EMF and Databinding Observables also have similar mechanisms), so maybe the concept should be generalized.
  • Hooks for Team support and Synchronization
  • Special resource kinds (Derived, Hidden, Phantom, Team-private)

Additional Layers related to Resources

Separation of Concerns is important - keep independent layers with clearly specified responsibilities.

  • org.eclipse.core.filesystem - Filesystem (EFS)
  • org.eclipse.core.contenttype - Content Types
    • Encodings and other resource attributes
  • org.eclipse.ide (the UI side of dealing with resources)
    • Working Sets
    • Builders and their attributes (UI)
  • org.eclipse.team (team support / synchronization)

Big Rocks for Improvement

The "Big Rocks" are themes where multiple individual items should be looked at in the full context, and where the overall story of E4 Resources must be consistent. For the three Big Rocks defined, it might be possible to set up separate working groups.

Alias Management

Linked Resources and Symbolic Links are a means of introducing overlapping resources (Aliasing). In case of an Alias, all contributors of the aliased item must be kept in sync. Currently, Eclipse provides some support for this in terms of Workspace aliases, but not for file system based aliases (symbolic links).

  • Alias Handling (Make Symbolic Links first-class citizens): bug 198291. If the file system provides aliasing (e.g. by symbolic links), it can have problematic effects on Eclipse. For example, two editors editing two file paths that really refer to the same file may lead to data corruption; findFilesForLocation() sometimes fails; getCanonicalPath() is not implemented in EFS; and others. Some of these problems are hard to solve, because it is not always clear whether two aliases of the same file should be resolved in the same context or in different contexts.
    The goal is to use the Eclipse UI to see symbolic links, create and modify them, and provide the necessary support to handle them properly (i.e. be aware of aliasing), so that Eclipse accurately represents the file system structure (bug 185509 comment 2). Many large real-world projects do have symbolic links. Depending on the actual file system, symbolic links may not be the only kind of aliasing. Perhaps we need an extensible alias manager? Visualization in Galileo, the rest later?
  • Clarify the Semantics of Aliases. One context, or multiple contexts for an aliased file? One or multiple sets of markers?
    • Suggest multiple contexts by default, unless overridden by user
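
The aliasing problem above can be made concrete with a small sketch. It assumes plain java.nio rather than EFS (which, as noted above, has no getCanonicalPath()), and the class AliasSketch and its methods are hypothetical names for illustration, not Eclipse API. It groups paths by the location they really resolve to, which is the kind of bookkeeping an alias manager would need.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not Eclipse API): groups paths that resolve to the same file. */
final class AliasSketch {

    /** Resolves symbolic links and redundant segments; the file must exist. */
    static Path canonical(Path path) {
        try {
            return path.toRealPath();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Maps each canonical location to all of the given paths that refer to it. */
    static Map<Path, List<Path>> groupAliases(List<Path> paths) {
        Map<Path, List<Path>> groups = new LinkedHashMap<>();
        for (Path p : paths) {
            groups.computeIfAbsent(canonical(p), k -> new ArrayList<>()).add(p);
        }
        return groups;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("alias-sketch");
        Path file = Files.createFile(dir.resolve("a.txt"));
        // The same file reached through a redundant "." segment.
        // A symbolic link to a.txt would end up in the same group.
        Path alias = dir.resolve(".").resolve("a.txt");
        System.out.println(groupAliases(List.of(file, alias)));
    }
}
```

Any group with more than one entry is a set of aliases. Whether such a group should share one context and one set of markers, or keep them separate, is exactly the semantic question raised above; the sketch only detects the overlap.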

Workspace and Project Structure

To date, Eclipse Projects are quite tightly tied to the underlying file system structure. This has the advantage of Separation of Concerns: the question of what is added to or removed from a project is handled by the underlying file system or Team Provider. Also, ensuring disjoint, non-overlapping resources makes it easier to support IWorkspace.find*ForLocation() queries, which are important when associating attributes and settings with resources, as well as concurrency and notifications/listeners.

On the other hand, not all real-world projects can be pushed into this straitjacket.

Some of the items below may be possible to solve in the Galileo Stream, remaining backward compatible.

  • Support physically nested projects: This has been identified as one of the Top Ten Architectural Problems in all of Eclipse long ago (item 5): Real-world projects, especially C/C++, often don't fit the "flat, disjoint" project structure on the filesystem which Eclipse would like to see today. Make setup easier and more flexible. Loosen restrictions of where a Project can be created. Create sub-projects for team sharing, re-use and multi-language. Get legacy code into Eclipse more easily. bug 245412, and bug 210907 (Wizard). Maybe Galileo?
  • Flexible Project Structure: File-list based projects. Allow adding files from "anywhere" to a project by drag and drop, like Visual Studio or Tornado, or file list from automated scripts. Support linking by relative paths and environment variables. Question: Just improve Linked Resources, or come up with something totally new? In terms of separation of concerns and team support, the current Eclipse way (project structure implicitly defined by the file system) is better, but real world lives with file-list-based projects. Existing Proposal by Serge Beauchamp (Freescale) on bug 229633, Discussion on platform-core-dev mailing list, see also the CDT:Flexible_Project_Structure. Maybe Galileo?
    • Variable-based Linked Resources: See bug 229633. At the minimum, ${PROJECT_LOC} and Environment Variables. Ideally more based on context. Variables that can dynamically change are problematic (need to be tracked by Alias Manager). Dependencies may be problematic, if variable providers are pluggable and depend on resources themselves.
    • Non-Workspace Resources: Support searching resources outside the workspace from within Eclipse (bug 192767). Support loading files from anywhere into Eclipse by double-clicking them (bug 60289 - RSE Local Subsystem provides this now). Resource Deltas for External Folders. Other editors can do this. Make Eclipse more pervasive and sticky. Galileo?
  • Multi-Workspace: Allow looking at multiple workspaces at the same time in one instance (bug 245399). Support multiple different versions of a project at the same time. Solves part of the "Namespace Resolution" issue. Also allows user-level Preferences to span multiple workspaces. Impact unclear, but probably related to the "Session" concept which E4 might get anyway.
  • Resource Tree Filters. bug 252996. Allow adding / excluding files and folders from the resource tree by pattern. Allow clients to define their view (perspective) on resources. May improve Performance at the cost of being less intuitive. Be sure to give visual feedback for what is excluded. Note that explicit exclusion seems to be supported in Eclipse 3.4 with the new IResource#HIDDEN flag.
    • Make Working Sets First-Class-Citizens. The Resource System should support declaration of resource groups (akin to working sets) by means of patterns (and/or enumeration). Client-defined perspectives on the Resource System: This would allow interested parties to register for notifications on their subset of items only. Could that be a generic notification filter on top of Core Resources, or does it need to be in the Core?
  • Solutions: Logical Nesting and Project Grouping. bug 35973. Handle groups of projects together for open, close, search etc. Optionally inherit settings. See bug 229633 comment 7 for instance. Is this necessary on Core/Resource level? Probably, for namespace resolution. Project References exist already, though with limitations bug 128397. Related to Multi-Workspace? See also Eclipse 4.0/Wishlist
  • Namespace Resolution: This is a hard one: Allow multiple projects with the same name in a workspace (bug 35973 comment 89). Fewer problems when re-using a project with a given name. Allow looking at multiple versions of the same project in one workspace (often requested!)
  • Overlapping Resources. Has been requested, but I think these are a bad idea. Or is this just another name for Aliasing? Don't make this the default, but the exception. -- Adding lots of complexity at very little gain. Probably better modeled with physically nested smaller subprojects which can be referenced. Or light-weight projects which are like a linked resource folder (container), plus resource filters, but do not allow overlap and do not introduce any project metadata (so they could live as a nested subproject only). Probably linked Resources and Symbolic Links as 1st class citizens. Why is this needed? Sub-item of Flexible Resources.
  • Logical and Virtual Resources. Is this about project structure or meta-data? Currently part of Resources but is it a different layer? See org.eclipse.core.resources.mapping package. Does this need improvement? Logical resources seem to exist in order to enjoy Workspace-provided markers, change notification and synchronization services. EMF seems interested in mapping Models to Resources.
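
The "Variable-based Linked Resources" item above can be illustrated with a minimal sketch of single-pass variable expansion. Only ${PROJECT_LOC} and environment variables come from the text; the class LinkVariableSketch, its methods, and the choice to leave unknown variables unexpanded are assumptions made for illustration, not the proposal on bug 229633.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch (not Eclipse API): expands ${NAME} variables in a link location. */
final class LinkVariableSketch {

    private static final Pattern VARIABLE =
            Pattern.compile("\\$\\{([A-Za-z_][A-Za-z0-9_]*)\\}");

    /** Replaces each ${NAME} using the lookup; unknown names are left as-is. */
    static String expand(String location, Function<String, String> lookup) {
        Matcher m = VARIABLE.matcher(location);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String value = lookup.apply(m.group(1));
            // Keep the unresolved variable visible instead of silently dropping it.
            String replacement = (value != null) ? value : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** PROJECT_LOC comes from the project; everything else from the given environment. */
    static Function<String, String> lookupFor(String projectLocation, Map<String, String> env) {
        return name -> "PROJECT_LOC".equals(name) ? projectLocation : env.get(name);
    }

    public static void main(String[] args) {
        Function<String, String> lookup = lookupFor("/work/app", System.getenv());
        System.out.println(expand("${PROJECT_LOC}/../shared/include", lookup));
    }
}
```

Because the lookup is a plain function evaluated at expansion time, a variable whose value changes later yields a different location on the next expansion. That is the reason the text says dynamically changing variables would need to be tracked by the Alias Manager.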

Improved Metadata and Persistence

  • Attach arbitrary metadata to Resources and track its lifecycle. Provide a (pluggable) means for persisting that metadata, or driving the metadata directly from the file system. Might help builders a lot... bug 128100 seems related.
  • Pluggable Project Persistence. Why do we always save into .project? - What if we could directly edit (Devstudio, Netbeans, Maven...) projects just by means of a pluggable project persistence mechanism? Allow .project file to not be at project root. bug 78438 UI: how to list / import projects if the persisted project data has arbitrary file name/pattern? Content Types might help.
    • Sharing, Linking or Inheritance of Project Settings. bug 255371 Modeled Preferences, bug 194414 at the project level, and bug 70683 at the workspace level. Reduce duplication of settings in multiple projects, allow users to locally override any team settings. E.g. user-defined warning levels to try something out. Just a special case of a pluggable project persistence provider? Allow Preferences to span multiple workspaces. Simplify administration of many similar projects with globally administered settings. Makes projects more manageable at the cost of being less understandable. Should projects always be self-contained? How to share settings? - Project settings are really a level above plain Resources. Different clients (natures) can implement this differently, e.g. Maven settings. It would still be good to have some commonality in concept. Very interesting for large organizations, but is it aligned with our most important work items? - Probably interesting in relation to Maven for build, which is hierarchical.
    • Multiple projects in 1 directory. Another use-case of pluggable project persistence? Nokia request to be like Visual Studio: multiple project files for users to easily find them. Not sure if this isn't only asking for trouble... perhaps better to have project references in one folder, which reference the actual projects in subfolders à la bug 78438. This request generates overlap and needs filters. Do we really need this? MSA: I think this type of composition should not be done at the Resource level. Something like working sets could be used for that.
  • Workspace Description Files: Allow opening a workspace or project by double-clicking a file (like in MS Visual Studio), bug 245405. Facilitates onboarding a team, including import of all Preferences. Also for users with multiple workspaces (simpler switching). Maybe Galileo?
  • Add/Remove project type/nature. Meta-info about the project is a layer above plain Resources. Currently adding/removing a nature requires editing the project description file :-( Supporting addition of natures "officially" may make projects more flexible. But sometimes, new (additional) natures are likely better represented with a separate physically nested subproject for the new nature.

Non-Local Resources

The draft E4/Project Proposal talks about the E4 mission being to build a next generation platform for pervasive, component-based applications and tools. Distributed computing is an industry trend and must be supported. The question is at what layers this support should reside. EFS already provides transparent addition of remote resources. But the concept of "Deep Refresh" is problematic with remote resources, and new concepts may be needed. In many cases, clients will need to be aware of resources being "remote" at a high level.

  • Non-Local Resources. Allow parts of the workspace to be virtual or non-local, represented by "The Network". Make non-local / non-physical elements first class citizens. Improvements of EFS, lazy Refresh, virtual resources, asynchronous calls etc. - see also IUniversalPath idea. Lots of work and risk. I'd strongly propose to keep "non-local stuff" and "local stuff" separate (e.g. separate projects, don't mix them). EFS has shown that transparently adding remote stuff is problematic: In terms of backward compatibility, old-style clients of the old API will never treat network failures and latency properly for non-local resources. It must be explicit.
  • Remote Workspaces. In a client/server Eclipse, the Workspace may be non-local. Not sure how this is related to non-local resources. Remote Projects might make more sense than fully remote workspaces? Or, in a multi-workspace scenario allow one (some) workspaces to be remote? John Arthorne has proposed that in a Client/Server based Eclipse, the whole workspace together with some code for caching and management could be remote (i.e. the code directly accessing the resources would be on the same machine as the workspace).
  • Weak Refresh and Precomputed Stat. For large dynamic clearcase views, or distributed workspaces on slow remote systems, a global workspace refresh is unacceptably slow. Support parts of the workspace to run with weaker refresh policy or import precomputed stat info.
  • Caching and Synchronization of workspace resources with other partners. Currently exposed by ISynchronizer in core.resources but isn't that another layer? In terms of separation of concerns, think about layers for remote support.
  • EFS Notification API. With improved asynchronous support, EFS will probably need a notification API - bug 112980. Or, EFS gets replaced by something else such as Apache Commons VFS.

Improve Concurrency and Programming Model

Multi-Core is a clear industry trend and needs to be accounted for by a Core architecture that supports improved concurrency. This is important in order to scale, especially with remote resources (which are slow to access). Easier programming means less time and fewer bugs for everyone.

This likely needs to be addressed in the context of all of E4 (to get consistent, pervasive patterns and idioms for concurrency and programming models).

  • Listener Order and Race Conditions: Right now, when a ResourceModificationListener performs some work as part of listening, when are other listeners notified? The order matters (e.g. JDT vs. CDT), requiring awkward workarounds. Especially during Project Open, events may not be received by parties not yet instantiated. Should there be a history of such events? Some clear APIs for influencing the order of listeners are needed, as well as the ability to notify well-known "owners" of resources before all others.
  • Asynchronous APIs. Several resource operations are documented as potentially long-running. Most of these are synchronous. Asynchronous APIs might help improving workspace concurrency, since these may allow clients to give up unnecessary locks while the operation is running.
  • Improve Workspace Concurrency. SchedulingRules and Jobs often need to lock the entire Workspace unnecessarily (bug 240888). The current model makes it hard to have multiple background jobs run on disjoint parts of the workspace at the same time. Would this be fixed by more asynchronous access? How much do we really need this?
  • Avoid too much work in ResourcesPlugin#start(). See bug 181998
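
One possible shape for the "clear APIs for influencing the order of listeners" mentioned above is sketched here. Everything in it is an assumption for illustration: the class OrderedNotifierSketch, the integer priority, and the rule that lower values are notified first are not existing Eclipse API and not a decided design.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch (not Eclipse API): notifies listeners in a declared order. */
final class OrderedNotifierSketch<E> {

    private static final class Registration<E> {
        final int priority;
        final Consumer<E> listener;

        Registration(int priority, Consumer<E> listener) {
            this.priority = priority;
            this.listener = listener;
        }
    }

    private final List<Registration<E>> registrations = new ArrayList<>();

    /** Lower priority values are notified first; ties keep registration order. */
    void addListener(int priority, Consumer<E> listener) {
        registrations.add(new Registration<>(priority, listener));
        // List.sort is stable, so equal priorities stay in registration order.
        registrations.sort(Comparator.comparingInt(r -> r.priority));
    }

    /** Delivers the event to a snapshot, so listeners may register others safely. */
    void fire(E event) {
        for (Registration<E> r : new ArrayList<>(registrations)) {
            r.listener.accept(event);
        }
    }

    public static void main(String[] args) {
        OrderedNotifierSketch<String> notifier = new OrderedNotifierSketch<>();
        notifier.addListener(10, e -> System.out.println("ordinary listener: " + e));
        notifier.addListener(0, e -> System.out.println("resource owner: " + e));
        notifier.fire("project opened");
    }
}
```

In the example, the listener registered with priority 0 stands in for a well-known "owner" and is called before the one registered earlier with priority 10. The sketch does not address the other concern raised above, namely events that are fired before a listener has been instantiated; that would need some form of event history.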

Other Ideas and Work Items

Depending on the priority that people working on E4/Resources think the following items have, some of them may be promoted to "big rocks". For now, these are listed separately because they seem smaller, or not directly related to the Core Resources layer (so they could probably be addressed on other layers, or by 3rd party plugins on top of core resources).

  • Shared Reference Workspace: Related to Multi-Workspace: Allow multiple users read-only access to a single shared workspace at the same time. Ideal for browsing shared 3rd party libs. Team-sharing more setup for shared resources.
  • Modeling the Resource System. Do we need this? Resources are performance critical. EMF might be interesting for arbitrary attached attribute, listeners and undo/rollback, but is it needed? Perhaps on a separate modeling/attributes layer on top of core resources? Decide together with "Modeling the Workbench", Eclipse Application Model, and a common Listener / Concurrency model.
  • Getting rid of Project for RCP. Is the notion of "Project" a plumbing or User Artifact? Projects are overloaded. Where should builders etc be hung off? What is the relevant Core of the Resource System that may be interesting for RCP and may be worth stripping?

Layers Other than Core Resources

  • Improve Working Sets. Improvements are also due for Working Sets on the UI side. Working sets by pattern, with automatic addition / removal. Better team sharing for working sets.
  • Improvements to Content-Types. UI for Project-specific content types. Associate files with a correct icon/editor even if the file extension is not unique. Adapt to legacy structures more easily (case sensitive, patterns). - Content Types are already a separate plugin. Likely possible as a non-breaking incremental improvement. Enumeration might be a workaround.
  • Builders. ICommand and builders extension point. Do these belong in Resources? Incremental builders are at the core of the Platform, but couldn't they also be clients of resource delta notifications? There is a need to make ICommand extensible in order for the new CDT build to be better integrated and store its settings more easily. Related to Resource Attributes.
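
The "working sets by pattern, with automatic addition / removal" idea above can be sketched as a set whose membership is computed from patterns instead of being stored as a list. The sketch assumes java.nio glob matching on workspace-relative paths; the class PatternWorkingSetSketch and its include/exclude methods are hypothetical, not the Eclipse Working Set API.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical sketch (not Eclipse API): a working set defined by glob patterns. */
final class PatternWorkingSetSketch {

    private final List<PathMatcher> includes = new ArrayList<>();
    private final List<PathMatcher> excludes = new ArrayList<>();

    PatternWorkingSetSketch include(String glob) {
        includes.add(matcher(glob));
        return this;
    }

    PatternWorkingSetSketch exclude(String glob) {
        excludes.add(matcher(glob));
        return this;
    }

    private static PathMatcher matcher(String glob) {
        return FileSystems.getDefault().getPathMatcher("glob:" + glob);
    }

    /** A path is a member if any include pattern matches and no exclude pattern does. */
    boolean contains(Path workspaceRelativePath) {
        boolean included = includes.stream().anyMatch(m -> m.matches(workspaceRelativePath));
        boolean excluded = excludes.stream().anyMatch(m -> m.matches(workspaceRelativePath));
        return included && !excluded;
    }

    /** Membership is recomputed on demand, so new files join without being added. */
    List<Path> members(List<Path> resources) {
        return resources.stream().filter(this::contains).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        PatternWorkingSetSketch set = new PatternWorkingSetSketch()
                .include("src/**")
                .exclude("**/*.properties");
        List<Path> resources = List.of(
                Path.of("src/main/App.java"),
                Path.of("src/main/app.properties"),
                Path.of("docs/readme.txt"));
        System.out.println(set.members(resources));
    }
}
```

Since nothing is enumerated, a pattern-based definition is also easy to share with a team as a few lines of text. The same include/exclude evaluation could serve the Resource Tree Filters item, although whether that belongs in the Core or in a layer on top is left open in this page.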