Flux/Prototype

 
Latest revision as of 12:06, 2 July 2014

This is a brief description of the technical details of the Flux prototype implementation.

async messaging - the foundation

The backbone of the Flux prototype implementation is an asynchronous messaging channel. All communication in Flux happens via asynchronous messages: there is no RESTful API and no dedicated server that can be called. Instead, every component in Flux is connected to this messaging channel and can send and receive messages over it. Everything else is implemented on top of this.

The prototype implementation of the asynchronous messaging channel is based on websockets. Since this is the only way to connect to Flux, every participant can open a websocket connection to the messaging system and send and receive messages from there on. The websocket messaging server is implemented as a node.js application written in JavaScript:

https://github.com/eclipse/flux/blob/7a66a02e08b88611af88d95436f1215a2a44e922/node.server/startup-all-in-one.js#L31

Messages in the Flux prototype are in JSON notation. They can either be broadcast to everybody in the system or sent directly to a certain recipient. This is configured for each socket that is opened:

https://github.com/eclipse/flux/blob/7a66a02e08b88611af88d95436f1215a2a44e922/node.server/messages-core.js

Users are separated into different spaces using the room feature of the socket.io library. After connecting to the websocket server, every participant therefore has to send a "connectToChannel" message to let that websocket connection participate in the channel for that particular user. As a result, broadcast messages are not sent "to the world", but only within the channel of that user. This is good for security as well as for keeping the traffic per socket down to the messages that are relevant to that user.

https://github.com/eclipse/flux/blob/7a66a02e08b88611af88d95436f1215a2a44e922/node.server/messages-core.js#L62
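The channel scoping described above can be illustrated without a network. The following is a minimal in-memory sketch, not the prototype's socket.io code: the `Hub` and `Connection` classes, the `senderId` field, and the channel names are assumptions made up for this illustration.

```javascript
// Minimal in-memory stand-in for the websocket server (no network, no socket.io).
class Hub {
  constructor() {
    this.channels = new Map(); // channel name -> Set of connections
  }

  // Corresponds to the "connectToChannel" message: join the user's channel.
  connectToChannel(conn, channel) {
    if (!this.channels.has(channel)) this.channels.set(channel, new Set());
    this.channels.get(channel).add(conn);
    conn.channel = channel;
  }

  // Broadcast: deliver to every other connection in the sender's channel only.
  broadcast(sender, type, data) {
    const json = JSON.stringify({ type, data, senderId: sender.id });
    for (const conn of this.channels.get(sender.channel) || []) {
      if (conn !== sender) conn.receive(json);
    }
  }

  // Direct: deliver to one recipient, and only within the sender's channel.
  sendTo(sender, recipientId, type, data) {
    const json = JSON.stringify({ type, data, senderId: sender.id });
    for (const conn of this.channels.get(sender.channel) || []) {
      if (conn.id === recipientId) conn.receive(json);
    }
  }
}

class Connection {
  constructor(id) {
    this.id = id;
    this.channel = null;
    this.inbox = []; // parsed messages received so far
  }
  receive(json) {
    this.inbox.push(JSON.parse(json));
  }
}

const hub = new Hub();
const eclipse = new Connection('eclipse');
const editor = new Connection('editor');
const stranger = new Connection('stranger');
hub.connectToChannel(eclipse, 'alice');
hub.connectToChannel(editor, 'alice');
hub.connectToChannel(stranger, 'bob');

hub.broadcast(eclipse, 'projectConnected', { project: 'demo' });
hub.sendTo(editor, 'eclipse', 'getProjectRequest', { project: 'demo' });

console.log(editor.inbox.length, stranger.inbox.length); // 1 0
```

The connection in the channel "bob" never sees the traffic of the channel "alice", which is the property the room feature provides in the prototype.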

As an additional step, the connection to the websocket itself should be secured, so that only authenticated users can successfully connect. In addition, the reaction to the "connectToChannel" message can be refined by checking whether the user is allowed to enter a certain channel. By default the two would be the same (the authenticated user connects to their own channel), but the check could also include something like an invitation mechanism, so that users can invite and allow other users to take a look at their resources in Flux.

file sync

The file syncing in the Flux prototype is implemented on top of the asynchronous messaging. The underlying design decision is that there is no single master copy of the files/folders/projects. Instead, each participant in the Flux system (that is, every component connected to the messaging channel) can serve as a "repository" for individual files, complete projects, everything from a user, or everything in the Flux universe. To sync the content of files and projects, each participant is responsible for listening to the file sync messages, replying to file sync request messages, and broadcasting messages about changes. That creates a network of participants that sync those files and changes among each other (instead of relying on a master server that stores and keeps track of everything).

The syncing mechanism uses the following messages at the moment:

  • projectConnected: A new project is connected to Flux. This makes other participants aware of a new project being connected to Flux.
  • projectDisconnected: A project is disconnected from Flux. This means that a single participant has disconnected the project from Flux; it doesn't mean that all participants have disconnected this project.
  • resourceCreated: A new resource got created (as part of a project). This can be a file or a folder.
  • resourceChanged: A resource got changed (like a save event). This can be a file, a folder, or a project.
  • resourceDeleted: A resource got deleted (can be a folder or a file).
  • resourceStored: A resource got stored. This means that a resource change has been stored persistently by a participant in the system. This is usually a reaction to a resourceChanged event in the case that the resource change got safely stored somewhere.
  • getProjectsRequest: A request to send the requester back information about the projects that are connected to Flux.
  • getProjectsResponse: The response, containing the list of projects that are connected to Flux.
  • getProjectRequest: A request to send back information about the content of an individual project (like the list of contained resources).
  • getProjectResponse: The response, containing metadata and a list of contained resources.
  • getResourceRequest: A request for the content of an individual resource (e.g. the content of a file).
  • getResourceResponse: The response, containing the content of the resource (e.g. the content of a file).
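
To make the interplay of these messages more concrete, here is a small sketch of a participant that acts as a repository. It is not the prototype's code: a Node.js `EventEmitter` stands in for the messaging channel, the field names (`project`, `resource`, `timestamp`, `hash`, `content`, `callbackId`) are assumptions, and for brevity the change message carries the content directly, which is a simplification of this sketch.

```javascript
const { EventEmitter } = require('events');

// A participant that keeps resources in memory and takes part in file sync.
class InMemoryRepository {
  constructor(channel) {
    this.resources = new Map(); // "project/resource" -> { content, timestamp, hash }
    this.channel = channel;
    channel.on('resourceChanged', (msg) => this.onResourceChanged(msg));
    channel.on('getResourceRequest', (msg) => this.onGetResourceRequest(msg));
  }

  // Store a change if it is newer than what we have, then announce resourceStored.
  onResourceChanged({ project, resource, timestamp, hash, content }) {
    const key = `${project}/${resource}`;
    const current = this.resources.get(key);
    if (current && current.timestamp >= timestamp) return; // already up to date
    this.resources.set(key, { content, timestamp, hash });
    this.channel.emit('resourceStored', { project, resource, timestamp, hash });
  }

  // Answer a content request only if we actually hold the resource.
  onGetResourceRequest({ project, resource, callbackId }) {
    const stored = this.resources.get(`${project}/${resource}`);
    if (!stored) return; // stay silent; another participant may answer
    this.channel.emit('getResourceResponse', { project, resource, callbackId, ...stored });
  }
}

const channel = new EventEmitter();
const repo = new InMemoryRepository(channel);
const events = [];
channel.on('resourceStored', (msg) => events.push(`stored ${msg.resource}@${msg.timestamp}`));
channel.on('getResourceResponse', (msg) => events.push(`content ${msg.content}`));

channel.emit('resourceChanged', { project: 'demo', resource: 'a.txt', timestamp: 2, hash: 'h2', content: 'new' });
// A stale change (older timestamp) is ignored and produces no resourceStored.
channel.emit('resourceChanged', { project: 'demo', resource: 'a.txt', timestamp: 1, hash: 'h1', content: 'old' });
channel.emit('getResourceRequest', { project: 'demo', resource: 'a.txt', callbackId: 7 });

console.log(events); // [ 'stored a.txt@2', 'content new' ]
```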


Example: The Flux plugin for Eclipse connects to the messaging channel of the user. Then it does two things to sync up the content of the Eclipse workspace with other participants of the Flux system. For each project in the workspace that is connected to Flux, it sends out a message that this project has been connected to Flux again. In addition, it sends out a request message to all other participants to ask for the content of this project (to check whether there are other, more recent versions of the files anywhere). If there is a more recent version of a resource, it asks for the updated content of this resource and stores it locally in the Eclipse workspace. That way the Eclipse workspace is brought up to date if files and projects have been changed since the last time the workspace was used.
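
The comparison step in this flow could look roughly like the sketch below. It assumes that a project listing maps resource paths to a `timestamp` and a `hash` (an assumed shape, not taken from the prototype), and it additionally assumes that resources missing locally should be fetched as well.

```javascript
// Decide which resources to request after receiving a project listing.
// `local` and `remote` map resource paths to { timestamp, hash } (assumed shape).
function resourcesToFetch(local, remote) {
  const wanted = [];
  for (const [path, theirs] of Object.entries(remote)) {
    const mine = local[path];
    const missing = mine === undefined;
    // Only fetch when the content differs and the remote copy is more recent.
    const newer = !missing && theirs.hash !== mine.hash && theirs.timestamp > mine.timestamp;
    if (missing || newer) wanted.push(path);
  }
  return wanted;
}

const local = {
  'src/A.java': { timestamp: 100, hash: 'aaa' },
  'src/B.java': { timestamp: 200, hash: 'bbb' },
};
const remote = {
  'src/A.java': { timestamp: 150, hash: 'a2' },  // changed elsewhere, more recent
  'src/B.java': { timestamp: 200, hash: 'bbb' }, // identical
  'src/C.java': { timestamp: 120, hash: 'ccc' }, // not present locally
};

console.log(resourcesToFetch(local, remote)); // [ 'src/A.java', 'src/C.java' ]
```

Each path in the result would then be requested with a getResourceRequest message.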

A similar flow happens inside other participants in the Flux universe. A backup repository that is running somewhere in the cloud, for example, receives a "projectConnected" message from the Eclipse plugin. It reacts to this by asking for the content of that project (to find out whether the connected Eclipse project contains more up-to-date files). In case it finds more recent versions of some files of that project, it asks for the content of those files and updates its own version of each file.

A browser editor, to mention a totally different example, listens to the same messages, but is interested in only those messages that contain information about the file that is open in the editor.

That way you could start up additional backup repositories in the cloud and they would automatically participate in the syncing mechanism. In the same way, the Eclipse plugin serves as a repository as well. That means that as long as your Eclipse instance is running and connected to Flux, it might deliver file content to the browser editor (just to mention one possible example). Another side effect of this architecture is that if someone sends out a request for the content of a file, for example, various other participants may send back an answer (maybe different cloud repositories, the running Eclipse instance, another open browser editor, etc.). Even though you don't need multiple answers to this request message, you might receive several, and you have to choose which one to use. Usually it will be the first one that arrives (the fastest repository component wins automatically).
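
A "first answer wins" request can be sketched as follows. This is an illustration under assumptions, not the prototype's code: a Node.js `EventEmitter` stands in for the messaging channel, and the `callbackId` correlation field and the `requestResource` helper are hypothetical names.

```javascript
const { EventEmitter } = require('events');

let nextCallbackId = 0;

// Send a getResourceRequest and hand the first matching response to `onFirst`.
// Later responses to the same request are ignored.
function requestResource(channel, request, onFirst) {
  const callbackId = ++nextCallbackId; // correlates responses with this request
  const onResponse = (msg) => {
    if (msg.callbackId !== callbackId) return; // answer to a different request
    channel.off('getResourceResponse', onResponse); // first one wins
    onFirst(msg);
  };
  channel.on('getResourceResponse', onResponse);
  channel.emit('getResourceRequest', { ...request, callbackId });
}

// Two participants that both hold the resource and both answer the request.
const channel = new EventEmitter();
for (const name of ['cloudRepo', 'eclipse']) {
  channel.on('getResourceRequest', (req) => {
    channel.emit('getResourceResponse', {
      callbackId: req.callbackId,
      content: 'class A {}',
      from: name,
    });
  });
}

const received = [];
requestResource(channel, { project: 'demo', resource: 'src/A.java' }, (msg) => {
  received.push(msg.from);
});

console.log(received); // [ 'cloudRepo' ]
```

In this synchronous simulation the "fastest" participant is simply the one whose listener was registered first; over a real network it would be whichever response arrives first.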

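A simple example for this is the in-memory implementation of the cloud backup repository that is implemented on top of node.js in JavaScript:

  • https://github.com/eclipse/flux/blob/7a66a02e08b88611af88d95436f1215a2a44e922/node.server/repository-message-api.js: This is the connector that listens to messages from the message channel and reacts to those messages by calling the in-memory implementation of a repository.
  • https://github.com/eclipse/flux/blob/7a66a02e08b88611af88d95436f1215a2a44e922/node.server/repository-inmemory.js: This is the in-memory implementation of the cloud backup repository that stores everything in memory and sends out broadcast messages to the channel in case a resource is created, deleted, or changed in this repository. Everything else (reacting to incoming messages from other participants) is handled by the message-api module above.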

At the moment the file sync protocol doesn't deal with conflicts and/or parallel file changes. Versions of files and folders are managed using global timestamps (which are unreliable) and content hashes. A more comprehensive and robust versioning scheme has to be added while moving the project forward. The idea is to use vector clocks (as known from distributed systems) to check for collisions. ETags might be another way of doing this, but that gets more complicated in this scenario, since there is no single master server that serves as a central point for storing the content of files and keeping track of tags.
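
To illustrate how vector clocks would detect such collisions, here is a small sketch. It is not part of the prototype; the representation of a clock as an object mapping participant ids to counters, and the participant names, are assumptions of this example.

```javascript
// Compare two vector clocks, each an object mapping participant id -> counter.
// Returns 'equal', 'before' (a happened before b), 'after', or 'concurrent'.
function compareClocks(a, b) {
  let aAhead = false;
  let bAhead = false;
  for (const id of new Set([...Object.keys(a), ...Object.keys(b)])) {
    const x = a[id] || 0; // a missing entry counts as 0
    const y = b[id] || 0;
    if (x > y) aAhead = true;
    if (y > x) bAhead = true;
  }
  if (aAhead && bAhead) return 'concurrent'; // neither version includes the other
  if (aAhead) return 'after';
  if (bAhead) return 'before';
  return 'equal';
}

// The editor saved on top of the version Eclipse already had: no conflict.
console.log(compareClocks({ eclipse: 2, editor: 1 }, { eclipse: 2, editor: 2 })); // before

// Both sides changed the file independently: a collision that needs resolving.
console.log(compareClocks({ eclipse: 3, editor: 1 }, { eclipse: 2, editor: 2 })); // concurrent
```

Unlike a global timestamp, this comparison does not depend on the participants' wall clocks being in sync.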

real-time sync

The real-time sync is based on the asynchronous messaging channel of Flux as well. As with the file sync, messages are broadcast among the participants. Those messages are:

  • getLiveResourcesRequest: a request message sent to all participants asking for the resources that are being edited at the moment
  • getLiveResourcesResponse: a response listing all the resources that are being edited at the moment (multiple participants can answer with a different set of resources)
  • liveResourceStarted: a participant announces to others that a resource is going to be edited
  • liveResourceStartedResponse: participants might answer the "liveResourceStarted" announcement with information about the resource already being edited (including the latest version of the content)
  • liveResourceChanged: delta information about a live change to a resource
  • liveMetadataChanged: live metadata information about a resource (for example errors and warnings coming back from a reconciling service running in the cloud, reacting to a liveResourceChanged message)

The real-time sync works across multiple participants, but doesn't do any conflict resolution yet. This has to be added to the mechanism in the future. Options are OT (Operational Transformation, the technology behind Google Wave and Google Docs) or Differential Synchronization (a different approach using diff-patch-merge algorithms). Both options should be explored.
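
The following sketch shows why conflict resolution is needed. It applies two concurrent deltas in both possible orders and ends up with different texts. The delta shape (`offset`, `removedCharCount`, `addedCharacters`) is an assumption made for this illustration and is not taken from the prototype's message format.

```javascript
// Apply one delta to a text: remove `removedCharCount` characters at `offset`
// and insert `addedCharacters` in their place.
function applyDelta(text, { offset, removedCharCount, addedCharacters }) {
  return text.slice(0, offset) + addedCharacters + text.slice(offset + removedCharCount);
}

const base = 'int x;';
// Two participants edit the same base text at the same time.
const fromEclipse = { offset: 4, removedCharCount: 1, addedCharacters: 'count' }; // x -> count
const fromBrowser = { offset: 0, removedCharCount: 3, addedCharacters: 'long' };  // int -> long

// Without transforming offsets, the result depends on the order of arrival.
const eclipseFirst = applyDelta(applyDelta(base, fromEclipse), fromBrowser);
const browserFirst = applyDelta(applyDelta(base, fromBrowser), fromEclipse);

console.log(eclipseFirst); // long count;
console.log(browserFirst); // longcountx;
```

An OT-based approach would adjust the offset of the second delta to account for the first one, so that both participants converge on the same text.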

The web editor implementation of the Flux prototype deals with those messages, for example: https://github.com/eclipse/flux/blob/7a66a02e08b88611af88d95436f1215a2a44e922/node.server/web-editor/js/editor/embeddededitor.js

cloud micro services

Services in the Flux architecture are relatively small components that can run anywhere, preferably in the cloud, but also in a running Eclipse instance, as a process on your local machine, or in a specific data-center. Services can connect to the Flux messaging channels and listen and react to those messages that are mentioned above (as well as others).

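One example is the JDT service that can run inside an Eclipse IDE instance or as a standalone headless service somewhere. This service participates in the file sync and live sync mechanisms in order to keep a local copy of projects and files in sync with other participants in Flux. In addition, it listens for specific request messages (like a message asking for content assist information or navigation information).

  • https://github.com/eclipse/flux/blob/7a66a02e08b88611af88d95436f1215a2a44e922/eclipse.services.java/org.eclipse.flight.jdt.service/src/org/eclipse/flux/jdt/services/Activator.java: This is the entry point of the JDT service that starts up a number of JDT-related services (for content-assist, navigation, and rename-in-file) and re-uses the basic sync mechanism of https://github.com/eclipse/flux/tree/7a66a02e08b88611af88d95436f1215a2a44e922/eclipse.core/org.eclipse.flight.core to keep the projects in sync.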
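
The general shape of such a service can be sketched in a few lines. This is a toy illustration, not the JDT service: a Node.js `EventEmitter` stands in for the messaging channel, and the message names `contentAssistRequest` and `contentAssistResponse`, the `callbackId` and `prefix` fields, and the `registerService` helper are all hypothetical.

```javascript
const { EventEmitter } = require('events');

// Register a service on a channel: answer every request of `requestType`
// with a `responseType` message that echoes the request's callbackId.
function registerService(channel, requestType, responseType, handler) {
  channel.on(requestType, (request) => {
    const result = handler(request);
    channel.emit(responseType, { callbackId: request.callbackId, ...result });
  });
}

const channel = new EventEmitter();

// Toy handler: proposes known identifiers that start with the typed prefix.
const known = ['toString', 'toArray', 'hashCode'];
registerService(channel, 'contentAssistRequest', 'contentAssistResponse', (req) => ({
  proposals: known.filter((name) => name.startsWith(req.prefix)),
}));

// A client (for example an editor) collects the responses.
const responses = [];
channel.on('contentAssistResponse', (msg) => responses.push(msg));
channel.emit('contentAssistRequest', { callbackId: 1, prefix: 'to' });

console.log(responses[0].proposals); // [ 'toString', 'toArray' ]
```

Because the service only depends on the channel, it does not matter to the client whether it runs in the cloud, inside an IDE, or as a local process.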
