EMF Compare/Specifications/LogicalMergeCommandLine

Revision as of 10:34, 25 July 2014 by Arthur.daussy.obeo.fr (Talk | contribs) (git logicalmerge)

Part 1: analysis of CGit and JGit

Within this part, we will describe the existing state of CGit and JGit concerning command line merging and their customization possibilities, then try and outline evolution axes to provide logical model support for these merge operations.

This part focuses on two different command line git providers: CGit, the most widely used front end to git, written in C, and JGit, the Java alternative developed within the Eclipse Foundation. We will not consider other potential command line providers. This document uses the term "logical model" for a set of inter-related physical files that must always be considered together to read a unit of information.

Problem statement

We need to be able to launch merge operations from the command line while having support for logical model merging. When launching operations with inner merges from either of the two front ends, the standard behavior will be to rely on textual, file-by-file merging. This may cause issues with logical models, where text merges might fail with conflicts while there is no logical conflict, or where text merges might end successfully even though there were logical conflicts (in which case the resulting model will end up corrupted and unloadable).

Git operations that will involve file merging are the following:

  • merge
  • cherry-pick
  • pull
  • rebase
  • revert
  • stash apply
  • submodule update

The merge operation is in charge of modifying the files and adding the necessary information in the index when conflicts are detected so that the merge tool can then be called on these files to solve said conflicts. Currently, git merge operations allow for the customization of the merge drivers.

A merge driver's responsibility is to handle the merge of a single file when it is not considered to be a trivial operation by the merge strategy. As such, drivers will be called on a file basis and only during non-trivial merges. With regard to logical model merging, this is far from sufficient since:

  1. it works on a file-by-file basis (a merge driver can only modify the single file it has been called on, and thus cannot account for larger logical models), and
  2. this mechanism is only called for non-trivial merges... in the textual sense. For example, deleting a file is considered a trivial merge by the default merging strategies available from git, whereas deleting part of a logical model mandates changes in other parts of said model, not to mention a lot of potential logical conflicts that are not reflected as textual conflicts.

The merge tool will also be called file-by-file, and only on the files that have been marked as conflicting by the merge operations. It is thus too late to handle the logical models as a whole, since trivial merges may already have corrupted the logical models, and the conflicts detected were textual conflicts that did not take models into account at all.

However, even if we were able to merge multiple conflicting files at once through an individual merge tool, the “git mergetool” command would still fail since it could then iterate on files that have already been merged through the logical model of another. We'll come back to this issue in the specification section. We thus need to plug ourselves in before the merge tool can kick in, and at a higher level than what the merge drivers allow. The only potential candidates for such a pluggable behavior are the merge strategies themselves, since they are in charge of deciding what a trivial merge is, and what is not.

Customizing merge strategies

Current Possibilities

CGit

CGit offers five distinct merge strategies available by default:

  • recursive
  • octopus
  • ours
  • resolve
  • subtree

Though all five strategies have their own specific uses, all are textual and none of them handles logical models. The recursive strategy is the default when merging two distinct commits, octopus being the default when merging more than two. The other three are only available, by name, through specific options of the commands. For example:

 git merge -s ours <commit1> <commit2>

or

 git cherry-pick --strategy ours <commit>

Trying to use any other strategy than the default five will end in a failure from the command line:

 $ git merge -s unknown <commit1> <commit2>
 Could not find merge strategy 'unknown'.
 Available strategies are: octopus ours recursive resolve subtree.

However, even though that is undocumented, CGit allows users to add new, customized merge strategies to the list of available ones under the following conditions:

  • the merge strategy is implemented in shell,
  • the shell script implementing the strategy is named with the convention of using the git-merge- prefix followed by the name to use for this strategy, and
  • that shell script is available either in the same folder as the git command itself, or within the current user's PATH.
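
As a rough illustration of the naming convention above, a custom strategy called logical could live in an executable named git-merge-logical. The sketch below only covers argument handling. It assumes that git invokes strategy scripts with the merge bases, then --, then the current head and the commit(s) to merge, and that exit codes 0, 1 and 2 mean clean merge, conflicts, and "merge not handled" respectively. Since custom strategies are undocumented, both points are assumptions to verify against your git version; the strategy name and the function names are hypothetical.

```shell
#!/bin/sh
# Sketch of a custom merge strategy, e.g. saved as "git-merge-logical" on the PATH.
# Assumed calling convention (to be verified): <base>... -- <head> <remote>...

# Split the arguments into merge bases, current head and remote heads.
parse_strategy_args() {
    bases=""
    head=""
    remotes=""
    seen_separator=0
    for arg in "$@"; do
        if [ "$seen_separator" -eq 0 ]; then
            if [ "$arg" = "--" ]; then
                seen_separator=1
            else
                bases="$bases $arg"
            fi
        elif [ -z "$head" ]; then
            head=$arg
        else
            remotes="$remotes $arg"
        fi
    done
    bases=${bases# }
    remotes=${remotes# }
}

# Refuse merges with more than one remote head (octopus) before doing any work.
# Assumed exit codes: 0 = clean merge, 1 = conflicts, 2 = merge not handled.
strategy_precheck() {
    parse_strategy_args "$@"
    # Word-split the remote list to count its entries.
    set -- $remotes
    if [ "$#" -ne 1 ]; then
        echo "logical strategy: octopus merges are not supported" >&2
        return 2
    fi
    return 0
}
```

With such a script in place, the strategy would be selected by name, for instance with git merge -s logical <commit>.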
JGit

JGit provides four merge strategies available by default:

  • recursive
  • resolve
  • ours
  • theirs

Once again, none of these four strategies can handle logical models since they all operate on a textual level. recursive is always the default regardless of the merge operation that is to be performed. The other three are only available through specific options of the merge-involving commands. For example:

 jgit.sh merge -s ours <commit1> <commit2>

JGit does not allow for customized merge strategies to be used by the commands. Furthermore, it does not provide most of the commands involving merge operations. Of the seven such operations we previously listed, only “merge” is provided by JGit.

Specifications

First and foremost, take note that none of this can be contributed back to either CGit or JGit, since it involves changes that are too deeply rooted and too Eclipse-specific.

CGit
Implement a custom merge strategy

Since providing customized merge strategies is already possible, what we need to do here boils down to implementing a custom merge strategy in shell.

In order to determine whether files are part of a larger logical model while remaining compatible with previous developments and existing tools, we need these files to be part of an Eclipse workspace. As such, this strategy must be able to check whether there are Eclipse projects in the repository on which it has been called, then launch a headless Eclipse with a temporary workspace containing these projects.

Any file for which there is an existing logical merger will then be handled from within this Eclipse container, while the merging strategy should be able to fall back to the default (recursive) strategy for all other files.

This cannot and will not handle octopus merges.
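
The check for Eclipse projects could be sketched as follows. This is a minimal illustration, assuming that a directory counts as an Eclipse project when it contains a .project descriptor file; the function name find_eclipse_projects is hypothetical.

```shell
#!/bin/sh
# List the directories under a repository that look like Eclipse projects,
# i.e. that contain a ".project" descriptor file. The .git directory is skipped.
#   $1: path to the repository working tree
find_eclipse_projects() {
    find "$1" -name .git -prune -o -type f -name .project -print |
        sed 's|/\.project$||' |
        sort
}
```

A strategy script could fall back to the default strategy when this list is empty, and only start the headless Eclipse otherwise.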

Implement a custom merge tool

The merge tool will be called on each file whose merge failed with conflicts. Since we've used our own custom merge strategy, these conflicts will have been properly detected as either logical conflicts (merge handled by the logical model merger) or textual conflicts (for files which didn't have a model merger). The standard git mergetool command will execute the individual merge tools (specified through git attributes) sequentially on each file in conflict in the repository. When we detect conflicts on a logical model, we mark all files constituting that logical model as being in conflict; the merge tool would thus be launched once per file on the same logical model (three times in a row for a model made of three files).

We thus cannot rely on the standard git mergetool command. What we propose here is to implement a new git command that will provide the same functionality as the standard merge tool while being capable of handling logical models. This new command could be called logicalmergetool and it would thus be callable through the command git logicalmergetool. This must be implemented in shell.

Once again, the custom merge tool will need to check if the underlying repository contains Eclipse projects, but this time it will need to launch a full-fledged Eclipse (not a simple headless application) with a temporary workspace containing these files. The user will have to manually launch EGit's merge tool (Right-click > Team > Merge Tool) on the files in conflict to solve the issues and Add (Right-click > Team > Add) the resolved file to the index from there.

Any file that is in conflict but that is not contained in an Eclipse project will be handled by the custom merge tool command through a fall back to the standard individual merge tool defined for this file by the gitattributes.
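
The split between files handled in Eclipse and files handed to the standard merge tool could look like the sketch below. It assumes that a file belongs to an Eclipse project when one of its parent directories, up to the repository root, contains a .project file. The function names and the logical/textual labels are hypothetical.

```shell
#!/bin/sh
# Return 0 if the file lies inside an Eclipse project (a directory holding a
# ".project" file) located at or below the repository root, 1 otherwise.
#   $1: repository root, $2: path of the file to check
is_in_eclipse_project() {
    root=$1
    dir=$(dirname "$2")
    while :; do
        if [ -f "$dir/.project" ]; then
            return 0
        fi
        if [ "$dir" = "$root" ] || [ "$dir" = "/" ] || [ "$dir" = "." ]; then
            return 1
        fi
        dir=$(dirname "$dir")
    done
}

# Read conflicting file paths on stdin (one per line) and label each one with
# the kind of tool that would handle it. In a real script the list could come
# from "git diff --name-only --diff-filter=U".
#   $1: repository root
classify_conflicts() {
    while IFS= read -r file; do
        if is_in_eclipse_project "$1" "$file"; then
            echo "logical $file"
        else
            echo "textual $file"
        fi
    done
}
```

Files labeled logical would be resolved once per model inside Eclipse, while files labeled textual would go through the fallback described above.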

JGit
Implement the missing merge-involving commands

There are already implementations of most of these commands in JGit, though the wrappers that allow these commands to be called from the command line JGit front end are missing. We need to implement wrappers for:

  • cherry-pick
  • pull
  • rebase
  • revert
  • stash apply
  • submodule update
Implement a means for command line users to register custom merge strategies

JGit only looks up its own registry for the merge strategies, without allowing a user to register new ones in there.

We need to either implement a new look up for JGit to search for strategies within the user's PATH or for the user to register new strategies, either through the repository's configuration or from the command line itself.

Implement a custom merge strategy

Once we have a means to register it, we'll need to implement a new custom strategy for JGit. The requirements for this will be very similar to what we previously outlined for the CGit variant.

This strategy needs to check whether the target repository contains Eclipse projects, then launch a headless Eclipse with a temporary workspace containing these projects. From there, it will look up each file's specific model merger and use it, or fall back to standard git merging for any file that is not part of a logical model, or whose model does not provide a custom logical merger.

This cannot and will not handle octopus merges, all the more so since JGit does not support them natively.

Implement a custom merge tool

JGit does not provide a mergetool command yet. We still propose to use a name distinct from mergetool, since this will not be contributed back to the project and JGit will most likely implement one in the future to reflect what exists in other git front ends.

This task will be very similar to the same one that could be undertaken with CGit, with the same constraints to uphold apart from the coding language, since this one can be implemented directly in Java.

Part 2: general workflow and initial prototype

When users want to compare or merge EMF models from the command line, they need to do so in an Eclipse environment similar to the one used to create these models. As such, the environment requires some plugins to be installed, but it may also require some preferences to be set, some perspective to be activated, etc. Among these plugins are the mandatory ones that will be used to perform the compare/merge operation: EMF Compare and EGit. Several options are possible to provision such an environment.

The first one is the manual way. It is necessary to download the Eclipse environment and install all the required plugins. Then, the git repository(-ies) that contain the models have to be cloned and bound to the Eclipse environment. All these tasks have to be done manually, on each computer that needs to execute a comparison or a merge. Finally, it is necessary to write a program that launches and manages the comparison/merge from the command line interface.

The second one is the programmatic way. All the tasks done manually in the first method have to be done programmatically in this one. That means we need to find a way to let users specify what they want to provision in an Eclipse environment. It can be a very long and tedious development that involves many different APIs. The advantage of this method is that only the final program has to be executed on each computer that needs to run a comparison or a merge; no further manual tasks are required.

Eclipse Oomph is a technology that can provision a set of plugins in an Eclipse IDE, clone Git repositories, bind Git repositories to this IDE, check out projects, set workspace preferences, and so on. The configuration is model driven, with files called Oomph setup model files. As such, Oomph seems to be a good framework on top of which we could implement the compare and merge command line. We only have to call the Oomph APIs instead of calling many different APIs from many technologies. We think the Eclipse Oomph technology is the most appropriate for this need in terms of cost, time, maintainability, reliability and performance.

New shell commands

We will initially develop new shell scripts that will add new commands to git:

  • git logicalmerge
  • git logicaldiff
  • git logicalmergetool

These scripts must be added to each computer that needs to perform logical git operations from the command line interface.

On Linux systems, to create a new git command named logicalmerge, the script must be named git-logicalmerge, without a file extension (git resolves git <name> to an executable called git-<name>). The script must then be reachable from your PATH and have execute permission.
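
Installing such a script could be sketched as below. This is only an illustration: it assumes that git resolves git <name> to an executable called git-<name> found on the PATH, and the function name install_git_command is hypothetical.

```shell
#!/bin/sh
# Install a script as a new git subcommand.
#   $1: command name (e.g. logicalmerge)
#   $2: path to the script implementing it
#   $3: a directory that is on the user's PATH
install_git_command() {
    target="$3/git-$1"
    # Copy the script under the git-<name> convention and make it executable.
    cp "$2" "$target" && chmod +x "$target"
}
```

For example, install_git_command logicalmerge ./logicalmerge-script "$HOME/bin" would make the command available as git logicalmerge, provided $HOME/bin is on the PATH.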

Basically, each command will mimic its non-logical counterpart. Each takes an additional mandatory parameter: an Eclipse Oomph setup model file describing the environment in which the compare/merge operation should be handled. Initially, we will handle only a subset of the standard parameters of the counterpart commands.

git logicalmerge

The logicalmerge command is the logical version of the git merge command. To see a full description of the git merge command, please visit http://git-scm.com/docs/git-merge.

The command is specified as below:

git logicalmerge <setup> <commit>

Assume the following history exists and the current branch is master:

         A---B---C topic
        /
   D---E---F---G master

Then git logicalmerge mySetupModel.setup topic will replay the changes made on the topic branch since it diverged from master (i.e., E) until its current commit (C) on top of master, and record the result in a new commit along with the names of the two parent commits and a log message from the user describing the changes.

         A---B---C topic
        /         \
   D---E---F---G---H master

You can also replace the topic branch name with its commit id: git logicalmerge mySetupModel.setup 87ad5ff

Incorrect parameters messages

Logical merge command line signature is 'git logicalmerge <setup> <commitID> (-m "Merge message")?'

  • REQ_010: If the logical merge is called with 0, 1 or more than 2 parameters then the software should display:
    "fatal: logicalmerge needs two parameters. git logicalmerge <setup_file_path> <commit>. Use git logicalmerge --help for further information."
  • IMPL: This verification is done before any other work.
  • REQ_020: If the logicalmerge is called with an incorrect path to the setup file the software should display:
    "fatal: No setup file found at {$SETUP_PATH_FILE}."
  • IMPL: This verification is done before loading the setup resource.
  • REQ_030: If the logicalmerge is called with a corrupted setup file (unable to load the setup resource) the software should display:
    "fatal: Corrupted set up file."
  • IMPL: This verification is done when trying to load the resource.
  • REQ_040: If the "-m" argument is given to the logicalmerge command without a value then the software should display:
    "error: switch `m' requires a value."
  • REQ_050: If a merge message has been provided then it should be used for the merge commit.
  • REQ_060: If no message has been provided on the command line then an automatic message should be generated. In git, the user is asked to enter the commit message in an editor opened in the console, pre-filled with a generated message. We do not need this kind of behavior for the moment.
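
The checks that do not require loading the setup resource (REQ_010, REQ_020 and REQ_040) could be sketched in shell as below. This is not the prototype's implementation: the function name is hypothetical, the optional -m switch is assumed not to count toward the two parameters of REQ_010, and the status 128 for every failure is an assumption borrowed from the return code list of REQ_130.

```shell
#!/bin/sh
# Validate "git logicalmerge <setup> <commit> [-m <message>]" arguments.
# On success, sets the variables setup, commit and message and returns 0.
validate_logicalmerge_args() {
    setup=""
    commit=""
    message=""
    positional=0
    missing_message=0
    while [ "$#" -gt 0 ]; do
        if [ "$1" = "-m" ]; then
            if [ "$#" -lt 2 ]; then
                missing_message=1
                shift
            else
                message=$2
                shift 2
            fi
        else
            positional=$((positional + 1))
            if [ "$positional" -eq 1 ]; then setup=$1; fi
            if [ "$positional" -eq 2 ]; then commit=$1; fi
            shift
        fi
    done
    # REQ_010: checked before anything else.
    if [ "$positional" -ne 2 ]; then
        echo "fatal: logicalmerge needs two parameters. git logicalmerge <setup_file_path> <commit>. Use git logicalmerge --help for further information." >&2
        return 128
    fi
    # REQ_040: -m was given without a value.
    if [ "$missing_message" -eq 1 ]; then
        echo "error: switch \`m' requires a value." >&2
        return 128
    fi
    # REQ_020: the setup file must exist before it is loaded.
    if [ ! -f "$setup" ]; then
        echo "fatal: No setup file found at $setup." >&2
        return 128
    fi
    return 0
}
```

REQ_030 is left out on purpose, since detecting a corrupted setup file requires loading it as a resource inside Eclipse.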
Error during logicalmerge messages
  • REQ_070: If the logicalmerge is called with an incorrect commit id the software should display:
    "fatal: {$WRONG_ID} - not something we can merge."


  • IMPL: This verification is done during the call to EGit
  • REQ_080: For any other error during the merge the software should display "fatal:" plus the error.
Result messages
  • REQ_100: If no merge has been done because it's already up to date then software should display:
    "Already up-to-date."
  • REQ_110: If the merge has no conflicts then software should display:
    "Merge made by the '{$UsedStrategy}' strategy.
    {$ListOfMergedFile}"
  • REQ_120: If the merge has conflicts the software should display:

FOR ALL {$FileName} in {$ConflictingFiles} DO
IF {$FileName} has been automatically resolved DO
Auto-merging {$FileName}
CONFLICT ({$ConflictTypeLeft}/{$ConflictTypeRight}): Merge conflict in {$FileName}
Automatic merge success.
ELSE
Auto-merging {$FileName}
CONFLICT ({$ConflictTypeLeft}/{$ConflictTypeRight}): Merge conflict in {$FileName}
Automatic merge failed; fix conflicts using "git logicalmergetool" and then commit the result.
END IF
END FOR

Return code
  • REQ_130: Here is the list of return values that the software should return:

IF "Merge succeeded and is complete":
return 0
IF "Merge did not succeed and an action is required from the user (for example, a manual merge); this is not an error":
return 1
IF "Error":
return 128
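
A calling script could react to these return values as in the sketch below. The function name and the printed texts are hypothetical; only the three status values come from REQ_130.

```shell
#!/bin/sh
# Translate the exit status of a "git logicalmerge" run into the next action.
#   $1: exit status returned by the command
describe_merge_status() {
    case "$1" in
        0) echo "merge complete" ;;
        1) echo "conflicts remain: run git logicalmergetool, then commit" ;;
        128) echo "merge failed with an error" ;;
        *) echo "unexpected status $1" ;;
    esac
}
```

In practice this would be called right after the merge, for example with describe_merge_status "$?".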

git logicaldiff

The logicaldiff command is the logical version of the git diff command. To see a full description of the git diff command, please visit http://git-scm.com/docs/git-diff.

The command is specified as below:

git logicaldiff <setup> <commit> [<commit>] [-- <path>]

To see the changes between a revision and the HEAD revision, omit the second commit:

git logicaldiff <setup> <commit> [--] [<path>...]

In all cases, the [-- <path>] option restricts the diff to the files that match <path>.

In all cases, <commit> can refer to a branch name or a commit id.
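
The signature above could be parsed as in the following sketch. It assumes that the second revision defaults to HEAD when omitted, as described above; the function name and the error messages are hypothetical, and paths are simply joined with spaces, which is not safe for paths that contain spaces.

```shell
#!/bin/sh
# Parse "git logicaldiff <setup> <commit> [<commit>] [-- <path>...]" arguments.
# On success, sets the variables setup, from, to and paths and returns 0.
parse_logicaldiff_args() {
    setup=""
    from=""
    to="HEAD"
    paths=""
    count=0
    while [ "$#" -gt 0 ]; do
        if [ "$1" = "--" ]; then
            # Everything after "--" is a path filter.
            shift
            paths="$*"
            break
        fi
        count=$((count + 1))
        case "$count" in
            1) setup=$1 ;;
            2) from=$1 ;;
            3) to=$1 ;;
            *)
                echo "fatal: too many revisions given to logicaldiff" >&2
                return 128
                ;;
        esac
        shift
    done
    if [ "$count" -lt 2 ]; then
        echo "fatal: logicaldiff needs a setup file and at least one commit" >&2
        return 128
    fi
    return 0
}
```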

git logicalmergetool

The logicalmergetool command is the logical version of the git mergetool command. To see a full description of the git mergetool command, please visit http://git-scm.com/docs/git-mergetool. The command is specified as below:

git logicalmergetool <setup>

Run logical merge conflict resolution tools to resolve logical merge conflicts. In our case, this means running Eclipse and calling the EGit merge tool on the file(s) in conflict.

Workflow

Each shell script will be a wrapper around an Eclipse standalone application (provided by the EMF Compare project). This standalone application will itself call some Oomph APIs.

First, Oomph will provision an Eclipse with all appropriate plugins to launch the logical git operation. These plugins are EGit, EMF Compare and their dependencies. If the Oomph setup model provided as parameter contains other plugins (represented by the name of the repository and the name of the plugin/feature), they will be provisioned too.

For a given Oomph setup model file, the provisioning operation is executed only once. If you launch a git logical operation again with the same Oomph setup model file, the already provisioned Eclipse IDE corresponding to the setup model will be reused. This avoids executing this potentially costly task each time.

In order to retrieve the Eclipse associated with a given Oomph setup model file, we will store all provisioned Eclipse installations in the temporary folder of the system. We will use a hash function on the Oomph setup model file to generate/retrieve a unique id. This unique id will be the name of the folder containing the provisioned Eclipse.
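
The lookup described above could be sketched as follows. This is an illustration only: it assumes that the hash is computed on the content of the setup file, uses the POSIX cksum utility as a stand-in for whichever hash function is finally chosen (a stronger hash may be preferable to limit collisions), and the logical-eclipse- folder prefix and function names are hypothetical.

```shell
#!/bin/sh
# Print the folder that would hold the Eclipse provisioned for a setup file.
# The id is derived from the file content with the POSIX "cksum" utility.
#   $1: path to the Oomph setup model file
provisioned_eclipse_dir() {
    id=$(cksum < "$1" | awk '{ print $1 "-" $2 }')
    echo "${TMPDIR:-/tmp}/logical-eclipse-$id"
}

# Succeed if the Eclipse for this setup file still has to be provisioned.
needs_provisioning() {
    [ ! -d "$(provisioned_eclipse_dir "$1")" ]
}
```

Two setup files with identical content map to the same folder, so the second run can skip provisioning.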

Then, in this provisioned Eclipse, the list of tasks contained in the Oomph setup model will be executed.

This Oomph setup model will contain, at least:

  • The path where the workspace will be created.
  • The git repository(-ies) to clone/bind with the Eclipse IDE.
  • The project(s) (represented by their paths on the computer) to import in the workspace associated with the Eclipse IDE.

Once all Oomph tasks have been executed, EMF Compare will call the logical git operation with the other parameters provided on the command line. Once the git logical operation has been executed, the user can see the results in the command line tool.

If the result shows conflict(s) on the involved model(s), the user will call the git logicalmergetool command. This command will launch a full-fledged Eclipse IDE (not a simple headless application) with a workspace containing these files. This full-fledged Eclipse IDE is the same as the one provisioned previously by Oomph. The user will have to manually launch EGit's merge tool on the files in conflict to solve the issues, and then manually close Eclipse to properly finish the process.

As an axis of evolution, in case of conflict(s), once the full-fledged Eclipse IDE has been launched, EGit's merge tool could be automatically launched on the file(s) in conflict.

Here is a schema representing the workflow of the process for the logical merge command (the workflow is nearly the same for the logical diff):

(Gray steps done by the user, blue steps done automatically)

EMFCompare GitLogicalMerge Workflow.png

An initial prototype of such a workflow is available on the EMF Compare's gerrit: https://git.eclipse.org/r/#/c/29889/

Detailed steps

Step 1: git logicalmerge

see git logicalmerge for more details

Step 2: provision an Eclipse IDE

After step 1, the arguments passed with the command have been validated. Step 2 then has to provision an Eclipse environment. The variable elements in this step are the installation path, the workspace path, and the additional plugins to install in the environment. All these variable elements can be found in the setup model file.

EMFCompare GitLogicalMerge Workflow Step02.png
  • If the installation path is not provided in the user setup model, then a temporary folder (located in the temp folder of the system) will be used. If the temporary folder already contains an Eclipse environment, this environment will be deleted.
  • If the workspace path is not provided in the user setup model, then a temporary folder will be used. If the temporary folder already contains a workspace, this workspace will be deleted.
  • If the installation path found in the user setup model already contains an Eclipse environment, then no further plugin installation will be done, except for Oomph, because the tasks executed in step 4 require Oomph.
  • If the installation path found in the user setup model doesn't contain an Eclipse environment, then an environment will be provisioned with:
    • the Eclipse Luna release
    • EMF 2.10 release
    • EMF Compare 3.0 release
    • EGit x.y release (to be determined)
    • Oomph x.y release (to be determined)

As an axis of evolution, profiles could be added as command line arguments. A profile would be a specific release of Eclipse (Luna, Kepler, Mars, ...) with the appropriate plugin versions.

Step 3: launch Eclipse as a headless application

After step 2, we have a valid Eclipse environment and a clean workspace. Step 3 then has to launch this Eclipse environment as a headless application.

The headless application launched will be one of the three existing ones: logicalmerge, logicaldiff or logicalmergetool, according to the command line typed by the user.
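
The launch could be sketched as below. The -nosplash, -application and -data options are standard Eclipse launcher options, but the application ids (org.example.emfcompare.*) are placeholders invented for this illustration, as is the function name; the real ids would be those declared by the EMF Compare standalone applications. The function only prints the command line so that it can be inspected before being run.

```shell
#!/bin/sh
# Print the command line that would start the headless application.
#   $1: Eclipse installation folder
#   $2: workspace folder
#   $3: operation (logicalmerge, logicaldiff or logicalmergetool)
#   remaining arguments: passed through to the application
headless_eclipse_command() {
    eclipse_home=$1
    workspace=$2
    operation=$3
    case "$operation" in
        logicalmerge|logicaldiff|logicalmergetool) ;;
        *)
            echo "unknown operation: $operation" >&2
            return 128
            ;;
    esac
    shift 3
    # The application id below is a hypothetical placeholder.
    printf '%s -nosplash -application %s -data %s' \
        "$eclipse_home/eclipse" "org.example.emfcompare.$operation" "$workspace"
    for extra in "$@"; do
        printf ' %s' "$extra"
    done
    printf '\n'
}
```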

Step 4: execute tasks from setup model

After step 3, Eclipse is running as a headless application. Step 4 then has to execute some Oomph tasks. The variable element in this step is the list of projects to import in the workspace, which can be found in the setup model file.

All the projects (represented by their path) found in the user setup model file will be imported in the workspace.

If there is no project in the user setup model file, then all projects found in the git repository will be imported in the workspace.

Step 5: call the logical merge operation

After step 4, the projects have been imported in the workspace. The EGit merge operation can then be called with the arguments passed from the command line. These arguments are the <commit> (id or branch) and, optionally, a message (with the -m option).

Step 6: git logicalmergetool

see git logicalmergetool for more details

Step 7: launch Eclipse (with GUI)

After step 6, the merge command has returned conflict(s). Step 7 then has to launch the same Eclipse environment as in step 3, but this time with the GUI.

Step 8: call the EGit merge tool on file(s) in conflict

After step 7, Eclipse is running with a GUI. EGit is installed in this environment, so the user can call the Merge Tool on the file(s) in conflict.

Step 9: resolve conflict(s) manually

In case of a conflict on a model, the Merge Tool will launch EMF Compare. In other cases, the standard Merge Tool will be launched.

Step 10: close Eclipse manually

Once all conflicts have been resolved, the user has to close Eclipse to end the process.

Step 11: end of process
