
Difference between revisions of "Index based model compare match engine"

 
(8 intermediate revisions by the same user not shown)
Line 3: Line 3:
 
Student: Stefan Leopold  
 
Student: Stefan Leopold  
  
This project is part of the [[Google Summer of Code 2010]]  
+
This project is part of the [[Google Summer of Code 2010]].
 +
 
 +
Also refer to the corresponding [http://code.google.com/a/eclipselabs.org/p/model-compare-match-engine-playground/ EclipseLabs project] for further details and the codebase.
  
 
== Abstract  ==
 
== Abstract  ==
Line 9: Line 11:
 
Matching model elements is the most critical phase of model comparison with regard to performance and memory consumption. Further improving the current EMF Compare GenericMatchEngine, and adopting and integrating new ideas and concepts into this part of the EMF Compare framework, can substantially improve scalability.
 
Matching model elements is the most critical phase of model comparison with regard to performance and memory consumption. Further improving the current EMF Compare GenericMatchEngine, and adopting and integrating new ideas and concepts into this part of the EMF Compare framework, can substantially improve scalability.
  
== Detailed Information ==
+
== Details ==
  
 
Providing out-of-the-box solutions is essential in today's business, especially if they can be adapted and, if required, customized easily. EMF Compare already provides such an experience for use cases of different sizes and complexity. Live model editing with comparison support further raises the expectations, and with them the requirements for performance and scalability.
 
Providing out-of-the-box solutions is essential in today's business, especially if they can be adapted and, if required, customized easily. EMF Compare already provides such an experience for use cases of different sizes and complexity. Live model editing with comparison support further raises the expectations, and with them the requirements for performance and scalability.
  
Looking at the current implementation with these changed requirements in mind reveals some room for improvement, e.g.
+
An in-depth analysis of the current implementation, combined with the adoption of new ideas and/or other existing concepts, may therefore be a first step towards better integration of EMF Compare with live model editing, even for huge model graphs.
* when computing contentSimilarity, the same "key" is computed several times for each object, although it stays valid as long as the model object (live editing!) and the filter of the GenericMatchEngine (adapted during a MatchEngine run?) remain unchanged; computing and maintaining a model object index could therefore largely improve performance,
+
* the current implementation of the nameSimilarityMetric in the GenericMatchEngine is not symmetric (it is in many cases, but not always: e.g. bbb->abb is a 100% match while abb->bbb is a 50% match, because the algorithm works pair by pair); with a guarantee of symmetry, some optimizations could easily be applied to the MatchEngine,
+
* ...
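
The asymmetry noted in the list above can be reproduced with a small sketch. It assumes that the pair-by-pair comparison simply counts how many adjacent character pairs of the first string occur anywhere in the second; this is a guess that happens to match the bbb/abb figures quoted above, not the actual GenericMatchEngine code. The names <code>PairSimilaritySketch</code>, <code>directed</code> and <code>symmetric</code> are hypothetical, and the symmetric variant swaps in a Dice coefficient over pair multisets as one possible replacement.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of EMF Compare.
final class PairSimilaritySketch {

    /** Splits a string into its adjacent character pairs, e.g. "abb" -> [ab, bb]. */
    static List<String> pairs(String s) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i + 1 < s.length(); i++) {
            result.add(s.substring(i, i + 2));
        }
        return result;
    }

    /** Assumed one-directional metric: share of pairs of 'from' found anywhere in 'to'. */
    static double directed(String from, String to) {
        List<String> fromPairs = pairs(from);
        List<String> toPairs = pairs(to);
        if (fromPairs.isEmpty()) {
            // Strings shorter than two characters have no pairs; fall back to equality.
            return from.equals(to) ? 1.0 : 0.0;
        }
        int hits = 0;
        for (String pair : fromPairs) {
            if (toPairs.contains(pair)) {
                hits++; // a pair of 'to' may be hit repeatedly, which causes the asymmetry
            }
        }
        return (double) hits / fromPairs.size();
    }

    /** Symmetric alternative: Dice coefficient, each pair of 'b' may be matched only once. */
    static double symmetric(String a, String b) {
        List<String> aPairs = pairs(a);
        List<String> bPairs = pairs(b);
        if (aPairs.isEmpty() && bPairs.isEmpty()) {
            return a.equals(b) ? 1.0 : 0.0;
        }
        // Count how often each pair is still available in 'b'.
        Map<String, Integer> remaining = new HashMap<>();
        for (String pair : bPairs) {
            remaining.merge(pair, 1, Integer::sum);
        }
        int shared = 0;
        for (String pair : aPairs) {
            Integer left = remaining.get(pair);
            if (left != null && left > 0) {
                remaining.put(pair, left - 1); // consume the matched pair
                shared++;
            }
        }
        return 2.0 * shared / (aPairs.size() + bPairs.size());
    }
}
```

Under these assumptions <code>directed("bbb", "abb")</code> yields 1.0 and <code>directed("abb", "bbb")</code> yields 0.5, while <code>symmetric</code> yields 0.5 in both directions, so a matcher would only need to evaluate each unordered pair of names once.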
+
 
+
An in-depth analysis of the current implementation, combined with the adoption of new ideas and/or other existing concepts (there seem to be synergies, e.g. with EMF Index), may therefore be a first step towards better integration of EMF Compare with live model editing, even for huge model graphs.
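
The idea of a model object index for content-similarity keys can be sketched as a small cache. This is a minimal illustration under stated assumptions, not EMF Compare or EMF Index code: <code>ContentKeyIndex</code> and its methods are hypothetical names, the key is assumed to be a plain string produced by a caller-supplied function, and the caller is assumed to signal changes explicitly (in a real integration this might be driven by model change notifications).

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration only; not part of EMF Compare or EMF Index.
final class ContentKeyIndex<T> {

    // Identity-based map: entries follow the object instance, not its equals().
    private final Map<T, String> keys = new IdentityHashMap<>();
    private final Function<T, String> keyFunction;
    private int computations = 0;

    ContentKeyIndex(Function<T, String> keyFunction) {
        this.keyFunction = keyFunction;
    }

    /** Returns the cached key, computing it only on first access or after invalidation. */
    String keyOf(T object) {
        String key = keys.get(object);
        if (key == null) {
            key = keyFunction.apply(object);
            computations++;
            keys.put(object, key);
        }
        return key;
    }

    /** Call when a single model object changed (e.g. during live editing). */
    void invalidate(T object) {
        keys.remove(object);
    }

    /** Call when the filter changed, since every key may then be stale. */
    void clear() {
        keys.clear();
    }

    /** Number of times the key function actually ran; useful for benchmarks. */
    int computations() {
        return computations;
    }
}
```

The design choice here is to make invalidation explicit and cheap: one edited object drops one entry, while a filter change drops the whole index, which mirrors the two conditions under which a key stops being valid.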
+
  
 
== Deliverables ==
 
== Deliverables ==
Line 28: Line 25:
 
{| style="text-align: center" class="wikitable FCK__ShowTableBorders"
 
{| style="text-align: center" class="wikitable FCK__ShowTableBorders"
 
|- style="background: #efefef"
 
|- style="background: #efefef"
! Milestone
+
! Week
! Weeks
+
 
! Step
 
! Step
 
! Details
 
! Details
 
|- style="background: lightgrey"
 
|- style="background: lightgrey"
! M1
+
! 1-4  
| 1-4  
+
 
| research  
 
| research  
 
| align="left"| identifying performance problems and analysing bottlenecks, preparing test data, suggesting improvements
 
| align="left"| identifying performance problems and analysing bottlenecks, preparing test data, suggesting improvements
 
|- style="background: lightgrey"
 
|- style="background: lightgrey"
! M2
+
! 5-8  
| 5-8  
+
 
| implementation  
 
| implementation  
 
| align="left"| coding of prototypes and patches with accepted improvements
 
| align="left"| coding of prototypes and patches with accepted improvements
 
|- style="background: lightgrey"
 
|- style="background: lightgrey"
! M3
+
! 9-12  
| 9-12  
+
 
| integration
 
| integration
 
| align="left"| committing work, providing benchmark report with performance impacts
 
| align="left"| committing work, providing benchmark report with performance impacts
 
|}
 
|}
  
== References ==
+
== Links ==
  
[1] [http://wiki.eclipse.org/index.php/EMF_Compare EMF Compare]<br>
+
[1] [http://code.google.com/a/eclipselabs.org/p/model-compare-match-engine-playground/ EclipseLabs project]<br>
[2] [http://www.eclipse.org/modeling/ EMF]<br>
+
[2] [http://wiki.eclipse.org/index.php/EMF_Compare EMF Compare]<br>
[3] [http://www.eclipse.org/forums/index.php?t=thread&frm_id=108 EMF Forum]<br>
+
[3] [http://www.eclipse.org/modeling/ EMF]<br>
 +
[4] [http://www.eclipse.org/forums/index.php?t=thread&frm_id=108 EMF Forum]<br>
  
 
[[Category:SOC]]
 
[[Category:SOC]]

Latest revision as of 04:02, 4 July 2010

Mentor: Cédric Brun

Student: Stefan Leopold

This project is part of the Google Summer of Code 2010.

Also refer to the corresponding EclipseLabs project for further details and the codebase.

Abstract

Matching model elements is the most critical phase of model comparison with regard to performance and memory consumption. Further improving the current EMF Compare GenericMatchEngine, and adopting and integrating new ideas and concepts into this part of the EMF Compare framework, can substantially improve scalability.

Details

Providing out-of-the-box solutions is essential in today's business, especially if they can be adapted and, if required, customized easily. EMF Compare already provides such an experience for use cases of different sizes and complexity. Live model editing with comparison support further raises the expectations, and with them the requirements for performance and scalability.

An in-depth analysis of the current implementation, combined with the adoption of new ideas and/or other existing concepts, may therefore be a first step towards better integration of EMF Compare with live model editing, even for huge model graphs.

Deliverables

The main objective of this project is to measurably improve the performance of the EMF Compare model matching algorithm. This improvement will be documented by benchmarks run on the provided "real-world" data.

Timeline

Week | Step | Details
1-4 | research | identifying performance problems and analysing bottlenecks, preparing test data, suggesting improvements
5-8 | implementation | coding of prototypes and patches with accepted improvements
9-12 | integration | committing work, providing benchmark report with performance impacts

Links

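[1] EclipseLabs project (http://code.google.com/a/eclipselabs.org/p/model-compare-match-engine-playground/)
[2] EMF Compare (http://wiki.eclipse.org/index.php/EMF_Compare)
[3] EMF (http://www.eclipse.org/modeling/)
[4] EMF Forum (http://www.eclipse.org/forums/index.php?t=thread&frm_id=108)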