Revision as of 07:35, 11 August 2015

Frequently Asked Questions

Performance optimization guidelines

This content is migrated from

On this page, we summarize our experiences with performance benchmarking of model transformation and model query tools. In particular, we give advice on how to benchmark accurately and avoid typical pitfalls. We also answer frequently asked questions about our technologies and their performance, scalability, usability and functionality. Finally, we provide a detailed list of references to all academic papers, reports and supplementary material related to performance/scalability experiments.

Our most important goals with this page are transparency and reproducibility, that is, to provide precise descriptions, code and model examples, and evaluation guidelines that anyone can use to reproduce and check VIATRA and EMF-IncQuery for performance and scalability.


The most important configuration step is to ensure that the Java Virtual Machine (JVM) running the Eclipse environment (and VIATRA/EMF-IncQuery inside) has access to as much memory (RAM) as possible. By default, the JVM is not configured (by the settings in eclipse.ini) to use all the available RAM in your computer. If the Eclipse application uses up all memory within the rather low default limit, thrashing and other kinds of performance degradation might occur, potentially corrupting performance measurement results.

For information on how to specify JVM boot parameters in Eclipse, we refer the reader to:

For Eclipse applications, a performance benchmark setup typically requires the appropriate setting of two boot JVM parameters:

  • maximum heap size: -XmxHEAPSIZEm (larger is better)
    • e.g. -Xmx2048m (for a 2GB heap limit)
    • if you wish to use EMF-IncQuery or VIATRA with large instance models (>100MB in serialized size), specify a limit that is as close to the physical RAM in your computer as possible
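  • maximum permgen space: -XX:MaxPermSize=PERMSIZEm (the permanent generation was removed in Java 8, so this option only has an effect on Java 7 and earlier)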
    • e.g. -XX:MaxPermSize=256m (for a 256M permgen space limit)

There are a number of other JVM boot parameters that might have a beneficial effect on overall performance. On 64-bit systems, we recommend using the following:

  • -XX:+UseCompressedOops
  • -XX:-UseParallelGC
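
To confirm that the settings from eclipse.ini actually reached the running JVM, the JVM can be asked to report its own limits. The following is a minimal sketch using only standard JDK calls; the class name JvmSettingsCheck is a hypothetical helper and not part of VIATRA or EMF-IncQuery. The reported maximum may differ slightly from the -Xmx value, depending on the garbage collector in use.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper (not part of VIATRA/EMF-IncQuery):
// reports the limits the JVM is actually running with.
final class JvmSettingsCheck {

    // Maximum heap the JVM will attempt to use, in megabytes (governed by -Xmx).
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    // Boot parameters passed to the JVM, e.g. -Xmx2048m or -XX:+UseCompressedOops.
    static List<String> jvmArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
        System.out.println("JVM arguments: " + jvmArguments());
    }
}
```

Running this inside the same Eclipse application that hosts the benchmark (rather than as a separate process) is what makes the check meaningful, since a separate JVM would report its own defaults.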

Best practices

In the following, we summarize our recommendations for conducting performance benchmarks. These tips apply not just to VIATRA or EMF-IncQuery, but to any other (modeling) tool as well.

For query/pattern matching performance, focus your measurements strictly on the query/pattern matching execution phase. In other words, avoid including other activities (such as model initialization or printing debug/output information to standard output) in the recorded execution times. For instance, emitting textual output may have considerable overhead (as is the case in VIATRA, due to the rather complex formatting/output buffering infrastructure that supports advanced code generation use cases) that has nothing to do with the (pure) performance of query evaluation/pattern matching.

Measure wall times, preferably with System.nanoTime() or something similar, for maximum accuracy. Whenever possible (especially with open source tools), use source code instrumentation (or simply add a few lines of code to the source) to precisely isolate the execution phases of interest. As observed e.g. in our Train Benchmarks, the specific lifecycle of incremental pattern matching (that is, the overhead on model initialization and modification operations) means that various use cases (such as the "morning boot", i.e. loading the model for the first time, or "reboot", i.e. the re-execution of queries or transformations after they have been executed previously) may run at characteristically different speeds, so it is practical to measure them separately from each other.

A simple example illustrating this technique with EMF-IncQuery is as follows:

long start = System.nanoTime();
// initialization phase, the Rete network is constructed (involves model traversal)
MatchedClassMatcher matcher = MatchedClassMatcher.FACTORY.getMatcher(emfRoot);
long matcherInit = System.nanoTime();
// pattern matching phase, results are retrieved from the Rete network
Collection matches = matcher.getAllMatchesAsSignature();
long collectedMatches = System.nanoTime();
// nanoseconds are converted to milliseconds for both phases
System.out.println("Init took: " + (matcherInit-start)/1000000 + " ms," +
 " Collecting took: " + (collectedMatches-matcherInit)/1000000 + " ms");

Take the average of at least 10 runs, excluding the worst and best results. Auxiliary distortion effects such as OS-level caching or JVM-level class loading are frequent, so we usually perform at least 10 measurement runs, leave out the best and worst results, and take the average of the remaining data. For the Train Benchmarks, we have even taken special care (relying on specific features of the Linux kernel) to disable OS-level caching effects, since the speed of the model loading/initialization phases (especially for very large models) may also depend significantly on such low-level features.
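
The averaging rule above can be sketched as a small helper. This is only an illustration: the class name TrimmedMean and the sample values in main are made up, and the helper drops exactly one best and one worst value, which is our reading of the rule.

```java
import java.util.Arrays;

// Hypothetical helper illustrating the "drop best and worst, average the rest" rule.
final class TrimmedMean {

    // Average of the measurements after dropping the single smallest and single largest value.
    static double trimmedMean(long[] measurements) {
        if (measurements.length < 3) {
            throw new IllegalArgumentException("need at least 3 measurements");
        }
        long[] sorted = measurements.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        long sum = 0;
        for (int i = 1; i < sorted.length - 1; i++) {
            sum += sorted[i];
        }
        return (double) sum / (sorted.length - 2);
    }

    public static void main(String[] args) {
        // Made-up wall times in milliseconds from 10 runs; not real measurement data.
        long[] runsMs = {120, 95, 101, 99, 100, 102, 98, 97, 103, 250};
        System.out.println("Trimmed mean: " + trimmedMean(runsMs) + " ms");
    }
}
```

Dropping the extremes keeps a single outlier (such as a cold first run or a run disturbed by a background task) from dominating the reported average.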

Take special care when measuring memory overhead. Measuring the memory usage of Java programs is widely known to be a difficult task. For the Rete-based incremental pattern matchers in EMF-IncQuery and VIATRA, it is relatively straightforward to define the memory overhead as the "retained" (steady-state) memory usage that is registered after a query has been evaluated on an instance model (since Rete maintains an in-memory cache that is kept in sync with the model until the cache is explicitly disposed or the model itself is disposed).

To measure this, in simple measurements, we commonly use the following code snippet to report the current memory usage of the JVM:

System.gc(); // request a collection first; this is only a hint to the JVM
try {
  Thread.sleep(1000); // wait for the GC to settle
} catch (InterruptedException e) {
  Thread.currentThread().interrupt(); // restore the interrupt flag
}
long usedMemory = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
System.out.println("Used memory: " + usedMemory + " bytes");
System.out.println("Used memory: " + (usedMemory/1024)/1024 + " megabytes");

To obtain the overhead due to EMF-IncQuery, we simply measure the consumption for the case when only the EMF instance model is loaded, and subtract that value from the original measurement.
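
The subtraction described above can be sketched as follows. The class MemoryOverhead and the byte array standing in for the matcher's caches are assumptions made purely for illustration; in a real measurement the baseline would be taken after loading only the EMF instance model, and the second reading after the query has been evaluated.

```java
// Hypothetical helper: overhead = usage with model and matcher minus usage with the model only.
final class MemoryOverhead {

    // Heap currently in use by the JVM, in bytes.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Difference between the two steady-state readings, in bytes.
    static long overheadBytes(long modelOnlyBytes, long modelAndMatcherBytes) {
        return modelAndMatcherBytes - modelOnlyBytes;
    }

    public static void main(String[] args) {
        // Real run: take this reading after loading only the EMF instance model.
        long baseline = usedHeapBytes();
        // Stand-in for the matcher's caches; a real run would initialize the matcher here.
        byte[] standIn = new byte[16 * 1024 * 1024];
        long withCache = usedHeapBytes();
        System.out.println("Overhead: " + overheadBytes(baseline, withCache) / (1024 * 1024) + " megabytes");
        System.out.println("Stand-in size: " + standIn.length + " bytes"); // keeps the array reachable
    }
}
```

Both readings should be taken the same way (same waiting time, same point in the lifecycle), otherwise the difference mixes the overhead with unrelated garbage that has not been collected yet.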

For more precise measurements, we use JConsole or a profiler (such as YourKit). This is also the preferred approach if you are evaluating the transient memory impact of tools (i.e. temporary heap allocations that are released by the garbage collector after the execution has reached a steady state). Note, however, that such heap transients may be very hard to reproduce deterministically due to i) aliasing effects of the profiler (the transients may be so short-lived that they do not show up on the chart) and ii) inherent non-determinism in the way the JVM works (garbage collection anomalies, or operating system kernel-specific issues).

Optimizing queries and transformations

To optimize VIATRA and/or EMF-IncQuery patterns (queries) for performance, we recommend following these simple best practices:

  • Write reusable patterns: factor out commonly used sub-patterns into separate patterns and use find() calls for re-use. This helps clean up your code, and also helps the Rete engine store the matches for commonly used sub-patterns only once (thereby reducing memory consumption). Constraints already expressed in called patterns need not be repeated in the calling pattern.
  • Avoid "Cartesian product" patterns if possible: pattern variables in a pattern should be "connected" to each other via positive constraints (node and edge constraints, positive pattern calls), otherwise all combinations of their potential values must be enumerated and individually checked by the pattern matcher. Note that other constraints (e.g. negative calls, check() expressions) are not suitable for "connecting" the pattern.
  • Simplify check() expressions. Check() expressions may contain additional constraints (typical examples include string operations such as .contains or .startsWith, arithmetic/logical comparisons or equivalence tests, etc.) that may include (very) costly operations. In the case of performance issues, it is a good idea to start looking for bottlenecks inside check() expressions and, if possible, eliminate them.
  • Linking by edges is good, linking by check() is bad. When expressing the correspondence of two model elements, it is best if you can link them via graph edges, as opposed to comparing their attributes. Alternatively, you can check that two objects have the same attribute value by using the same pattern variable to represent the value, and connecting it to both objects via attribute edge constraints. Comparing attributes in check() expressions will always be more expensive than these solutions, since the check has to be evaluated individually for each potential pair of elements (see the Cartesian product problem above).
  • As a last measure, you may also optimize the Rete layout by manual pattern factorization. To improve the performance of patterns with a large number of constraints, try to identify groups of constraints that "belong together" and factor them out as subpatterns. For instance, if an expensive operation such as a check() can be evaluated with a subset of a pattern's variables and constraints, that subset is a good candidate to be factored out together.
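
The cost difference behind the Cartesian product and check() advice above can be illustrated with a loose analogy in plain Java. This sketch does not use the EMF-IncQuery API and is not a description of how the Rete network is implemented; the class JoinAnalogy and its methods are hypothetical. The first method tests every pair of values, as a check() comparing two unconnected variables would; the second indexes one side by the shared value, which is roughly what connecting both objects to the same pattern variable allows.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical analogy in plain Java; not the EMF-IncQuery API or its internals.
final class JoinAnalogy {

    // Counts pairs with equal values by testing every combination: |left| * |right| comparisons.
    static long pairsByCartesianCheck(List<String> left, List<String> right) {
        long matches = 0;
        for (String l : left) {
            for (String r : right) {
                if (l.equals(r)) {
                    matches++;
                }
            }
        }
        return matches;
    }

    // Counts the same pairs by indexing one side on the shared value: about |left| + |right| steps.
    static long pairsBySharedValue(List<String> left, List<String> right) {
        Map<String, Integer> countByValue = new HashMap<>();
        for (String r : right) {
            countByValue.merge(r, 1, Integer::sum);
        }
        long matches = 0;
        for (String l : left) {
            matches += countByValue.getOrDefault(l, 0);
        }
        return matches;
    }
}
```

Both methods return the same count; only the amount of work differs, and the gap grows with the product of the two list sizes.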

Typical performance benchmarking aspects

For model transformation tools, a number of performance benchmark experiments have been reported so far (see Pointers below). We briefly summarize some general remarks below.

  • For model simulation scenarios (e.g. Petri net firing, antworld) that measure the time it takes to execute a single simulation step or a sequence of steps, account for randomization/non-deterministic effects, e.g. by averaging the results.
  • Scenarios involving code generation are sometimes problematic, as output formatting, buffering, file operations, etc. may interfere with your results.

For benchmarking model query tools, we have defined a scenario in our AutoSAR and Train Benchmarks that aims to simulate the most performance-critical aspects of typical modeling tool use cases. It consists of four phases:

  • Model initialization, measuring the time it takes to load (de-serialize) the model, and, in the case of EMF-IncQuery or the OCL Impact Analyzer, the overhead of additional cache initialization.
  • First query evaluation, measuring the running time for retrieving the query results for the first time. This case corresponds to batch model validation.
  • Applying model manipulation operations, measuring any overhead on model management (as is the case of EMF-IncQuery or the OCL Impact Analyzer).
  • Second query (re-)evaluation, measuring the running time for subsequent queries (that are typically much faster for incremental technologies).
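
The four phases above can be timed separately with a small harness such as the following sketch. The class FourPhaseBenchmark, its phase names and the Runnable parameters are hypothetical placeholders; the actual model loading, query and modification code of the measured tool would be supplied by the caller.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical harness: times each benchmark phase separately, in execution order.
final class FourPhaseBenchmark {

    // Wall time of a single phase in nanoseconds.
    static long time(Runnable phase) {
        long start = System.nanoTime();
        phase.run();
        return System.nanoTime() - start;
    }

    // Runs the four phases in order and returns their wall times, keyed by phase name.
    static Map<String, Long> run(Runnable loadModel, Runnable firstQuery,
                                 Runnable modifyModel, Runnable secondQuery) {
        Map<String, Long> timings = new LinkedHashMap<>(); // preserves phase order
        timings.put("init", time(loadModel));
        timings.put("firstQuery", time(firstQuery));
        timings.put("modify", time(modifyModel));
        timings.put("secondQuery", time(secondQuery));
        return timings;
    }

    public static void main(String[] args) {
        // No-op placeholders; a real benchmark passes the tool-specific code for each phase.
        Map<String, Long> timings = run(() -> {}, () -> {}, () -> {}, () -> {});
        timings.forEach((phase, nanos) -> System.out.println(phase + ": " + nanos / 1000000 + " ms"));
    }
}
```

Keeping the phases as separate entries makes it visible where an incremental tool pays its cost (initialization and modification) and where it gains (re-evaluation).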

Some general recommendations that are good to check for performance benchmarks:

  • Check for correctness and fairness: are compared tools returning the same results and doing functionally equivalent computations?
  • Minimize auxiliary distortion effects such as OS-level caching, file operations, background task interference, and memory management issues (see the HOWTO section for details).
  • Design measurement parameterization carefully; it is frequently non-trivial to scale up instance model sizes in a way that corresponds to actual modeling practice (threats to validity).
  • If possible, audit your configuration and code with experts of the measured tools.
  • Interpret results carefully. Consider to what extent certain factors (such as model management differences between interpretative and compiled transformation tools) dominate the results, and how certain characteristics of your model/queries/transformations favor certain tools/technologies.
