EMF Compare/UML Compare
The primary goal of this section is to provide technical specification of the UML Comparison engine. This engine will provide support of UML specific concepts and operations.
There are two ways of implementing UML Compare engine:
- Use the provided extension mechanism and override the visit() method to implement UML specific difference semantic
- Implement our own Diff/Match Engine (or override Generic Diff/Mathc engine if suited) to implement specific diff detection.
The first one should be simpler as it uses a builtin mechanism of EMFcompare but with less ability to get out of the border of what it had been designed for while the second one would let us define whatever we want but with an increased implementation complexity.
We now detail the two options, their pros and cons, and finally conclude and propose a solution.
Builtin extension mechanism
The extension mechanism is not working in essence. The mechanism work as follow:
- In your extending Ecore, you define a subclass of the EClasses AbstractDiffExtension and DiffElement (or any of its subclasses).
- You generate the code of this Ecore, and implement the visit(DiffModel), getText(), getImage(), provideMerger() methods.
- When the DiffService.doDiff() is called, diff is realized by the best suited diff engine within the registered ones.
- Once the diff is over, for each registered AbstractDiffExtension, the visit() method is called.
There are two problems:
- AbstractDiffExtension are instantiated only once. How can we assume we will only get one difference of that kind in the model? That's dumb and not working.
- Even if AbstractDiffExtension is to instantied the correct number of time, the visit() method is taking the whole diff model as input. It has to be browsed for each extension. It will not scale.
The good parts of this mechanism come from AbstractDiffExtension#hideElements reference and its opposite DiffElement#isHiddenBy. It provides a neat solution to hide differences created by generic engines as it taken into account by the GUI.
The AbstractDiffExtension#provideMerger() are also a good idea as it let us provide our own merger to the framework. But it is best to provide the mergers through the IMergerProvider extension point.
The AbstractDiffExtension#getText(), AbstractDiffExtension#getImage() are bad idea. Why does EMF compare not use the edit framework and its providers to display labels and images? Even for extension, it can work!
Implements specific Diff/Match engine
The GenericXXXEngine (where XXX is Match or Diff) implementations provide useful base implementation of EMF model comparison. Then we will use it as our direct superclass.
The core algorithm is quite well splitted into several method and some of them are protected to allow subclassing. We can change the reference/attribute checker to avoid the comparison of certain features. The main point is that we are able to make some post processing by browsing the DiffModel only once! It will scale much more than the extension mechanism.
We will use the best of the two worlds: extension mechanism for out of the box UI support of diff extension and subclass of diff and match engine to provide working and scalable post processing of the DiffModel.
AbstractDiffExtension specific behavior should be remove from the core except the GUI hidding one. Moreover, the AbstractDiffExtension#visit(), AbstractDiffExtension#getImage(), AbstractDiffExtension#getText() and AbstractDiffExtension#provideMerger() methods should be make deprecated in EMF Compare 1.2 and removed from 2.0.
UML Compare match engine
There is only one need that force us to subclass the match engine: to ignore EObject storing stereotypesproperties. This is detailled in section #Profile / Stereotype support.
UML Compare diff engine
The subclass Generic Diff Engine will do two things:
- Ignore references that are of no need for accurate comparison
- Add post processing of the DiffModel to add UML extensions and hide other generic DiffElements.
GenericDiffEngine of EMF Compare ignore by default the following EReferences kinds (via the ReferencesCheck class):
Actually, it ignores containment references because it already checks the contents of EObject via reflective eContents() method. Then, checking containment references would lead to duplicate differences. In UML2, there is the notion of subset references. Some of them are not derived (they are not automatically deduced from the superset) and subsets a containment references. With EMF, an EObject can be referenced by only one containment reference. Then the subset references of containment one are not declared as is. Thus, GenericDiffEngine checks them against differences and it leads to duplication. To avoid this situation, we have to ignore those references.
As the "subset" information is only available in the uml.ecore file (in EAnnotation) and that this information is not persisted in the generated code, we have to store this information somewhere. We propose a property file with a set a key/value pair (fully qualified name of the reference: <containingEClass.name>.<reference.name>=true). We reproduce here the file with the selected references:
TemplateParameterSubstitution.actual=true Action.input=true Association.memberEnd=true ActivityGroup.subgroup=true Classifier.attribute=true ActivityGroup.containedNode=true Element.ownedElement=true StructuredClassifier.role=true Namespace.member=true TemplateParameter.parameteredElement=true Action.output=true ActivityGroup.containedEdge=true NamedElement.clientDependency=true Classifier.feature=true TemplateParameter.default=true Namespace.ownedMember=true TemplateSignature.parameter=true
We will implement a subclass of ReferencesCheck to ignore references described in this file.
Infrastucture to support UML extensions
EMF Compare extension mechanism does not scale well. We propose here a new set of APIs to extend EMF Compare.
This infrascture is called just after the end of the normal processing of the doDiffTwoWay/doDiffThreeWay. Those methods will be overriden, then call their super() and finally branch our post processing infrastructure.
First, the diff extension factory registry is initialized and returns a Set of IDiffExtensionFactory. At each step of a whole DiffModel browsing each factory are asked if they handle the given DiffElement. If true, it is asked to create an AbstractDiffExtension and also asked for the parent DiffElement of the just created AbstractDiffElement. The newly created element is then added to its parent. The create() method has to handle the hidding of concerned DiffElement.
For each AbstractDiffExtension, a Factory is implemented and the pair of method handles/create gives the semantic of creation of this extension. The handles method should return quickly to minimize the overhead of the extension in case it is not relevant for the current case.
With this infrastructure, we are able to provide a scalable system to extend the GenericDiffEngine for UML difference computation.
Profile / Stereotype support
There are two issues to properly support UML profiles in UML compare engine:
- Ignore the EObjects that store the stereotypes properties from the match engine in the object matching phase but instead include their properties in the "base" EObject during their matching.
- Compute the differences of stereotypes application properties and attached them to their base objects.
The second one is simply addressed by the compare extension mechanism. We will define specific extension to handle properties diff on stereotype application and to attached them to their base EObject.
The first one forced us to subclass the GenericMatchEngine. We have two things to do:
- First we have to provide our own IMatchScopeProvider implementation (or GenericMatchScopeProvider subclass) to also provide our own IMatchScope implementation to ignore EObject in the direct contents of the resources corresponding to stereotype applications. This test will be implemented in the isInScope() method of the IMatchScope.
- Second, we have to extend the internal contents of the base EObject with the stereotype application EObject in order to match base EObject with all their properties (including the ones from the applied stereotypes).
|ProfileApplication||cf. profile support||cf. profile support|
Use case diagram