

Revision as of 09:49, 2 March 2014 (Running EMFTVM modules)

ATL EMF Transformation Virtual Machine (research VM)

Since 2011, the ATL tools include a research VM (EMFTVM), which allows for experimentation with advanced language features. Currently, these features include:


The EMF Transformation Virtual Machine (EMFTVM) is derived from the current ATL VMs and bytecode format. However, instead of using a proprietary XML format, it stores its bytecode as EMF models, such that they may be manipulated by model transformations. A special EMF resource implementation allows EMFTVM models to be stored in binary format, which is faster to load and save, and results in smaller files.

Apart from the standard ATL bytecode primitives, such as modules, fields, and operations, EMFTVM bytecode includes rules and code blocks. Fig. 1 shows the structure of rules and code blocks. Code blocks are executable lists of instructions, and have a number of local variables and a local stack space. Operation bodies and field initialisers are represented as code blocks in EMFTVM. Code blocks may also have nested code blocks, which can be manipulated and invoked from their containing block. These nested code blocks therefore effectively represent closures, which are nameless functions that can be passed as parameters to other functions. Closures are helpful for the implementation of OCL's higher-order operations, such as select and collect, which are parametrised by nested OCL expressions.

Fig. 1: Structure of EMFTVM rules and code blocks

Rules consist of input and output rule elements, a matcher code block, applier code block, and post-apply code block. The matcher code block takes potential input element candidates as parameters, and returns a boolean value, representing a match. The applier code block takes the input and (newly created) output elements as parameters, and assigns the bindings of the output elements. The post-apply code block also takes the input and output elements as parameters, and performs any (imperative) post-processing specified in the rule. Execution of rules is therefore done in three phases: (1) matching; only input elements are guaranteed to be present, (2) applying; all output elements and traces are guaranteed to exist, but no bindings may have been applied, (3) post-apply; all input and output elements, traces, and bindings are guaranteed to be present.
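The three phases described above can be sketched in plain Java. This is only an illustration of the ordering guarantees, not EMFTVM code: the class ThreePhaseSketch, its nested Rule type, and the use of lambdas in place of code blocks are all assumptions made for this example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Function;
import java.util.function.Predicate;

/** Illustrative only: models the match/apply/post-apply phases; not EMFTVM API. */
final class ThreePhaseSketch {

    /** A hypothetical rule whose "code blocks" are modelled as Java lambdas. */
    static final class Rule<S, T> {
        final Predicate<S> matcher;        // returns true for a matching input element
        final Function<S, T> creator;      // creates the (still unbound) output element
        final BiConsumer<S, T> applier;    // assigns the bindings of the output element
        final BiConsumer<S, T> postApply;  // imperative post-processing

        Rule(Predicate<S> matcher, Function<S, T> creator,
             BiConsumer<S, T> applier, BiConsumer<S, T> postApply) {
            this.matcher = matcher;
            this.creator = creator;
            this.applier = applier;
            this.postApply = postApply;
        }
    }

    /** Runs one rule over all inputs and returns the source-to-target traces. */
    static <S, T> Map<S, T> run(Rule<S, T> rule, List<S> inputs) {
        // Phase 1: matching. Only input elements exist; each match gets an output element and a trace.
        Map<S, T> traces = new LinkedHashMap<>();
        for (S input : inputs) {
            if (rule.matcher.test(input)) {
                traces.put(input, rule.creator.apply(input));
            }
        }
        // Phase 2: applying. All output elements and traces exist, bindings are assigned now.
        traces.forEach(rule.applier);
        // Phase 3: post-apply. All elements, traces, and bindings are present.
        traces.forEach(rule.postApply);
        return traces;
    }
}
```

Because each phase runs to completion over all matches before the next one starts, an applier can rely on every output element already existing, even one created for a match that comes later in the input list.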

Rules can be invoked manually, automatically, and recursively automatically. Manual rules correspond to ATL lazy rules (and called rules). Automatic rules correspond to ATL matched rules. Recursively automatic rules do not apply to ATL, but can be used when compiling other transformation languages to EMFTVM. Rules can also be marked as default, which causes that rule to create default traces. Default traces can be looked up using ATL's implicit tracing mechanism, and only one default trace may exist for any given source pattern. Non-default traces are just stored in the trace model, and are not used by the EMFTVM transformation engine.

Rules can have a number of super-rules, which are stored by name. This decision allows EMFTVM to resolve and link the super-rules of each rule at load-time, whereas storing a super-rule reference would have hardcoded the super-rule in the bytecode. This is comparable to how the Java VM does super-class lookup. Finally, rules can be marked as abstract, which means that they are only applied as part of a non-abstract sub-rule, but never by themselves.

To summarise: by explicitly representing rules in the bytecode, rule inheritance can be resolved at load-time. As a consequence, rules stored in imported modules can be taken into account, and super-rules can be redefined by module superimposition before the reference to the super-rule is resolved in the sub-rules. This solves the historic mismatch between ATL's rule inheritance and module superimposition.
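The effect of storing super-rules by name can be sketched as follows. This is a simplified model, not the EMFTVM implementation: the class SuperRuleLinkingSketch and its load/link methods are hypothetical, and redefinition is modelled simply as "the rule loaded last under a given name wins".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: late binding of super-rules by name; not the EMFTVM implementation. */
final class SuperRuleLinkingSketch {

    static final class Rule {
        final String module;
        final String name;
        final List<String> superRuleNames;               // stored in the "bytecode"
        final List<Rule> superRules = new ArrayList<>(); // resolved at link time

        Rule(String module, String name, String... superRuleNames) {
            this.module = module;
            this.name = name;
            this.superRuleNames = List.of(superRuleNames);
        }
    }

    private final Map<String, Rule> rulesByName = new LinkedHashMap<>();

    /** Loading a module registers its rules; a rule loaded later redefines a same-named rule. */
    void load(Rule... rules) {
        for (Rule rule : rules) {
            rulesByName.put(rule.name, rule);
        }
    }

    /** After all modules are loaded, binds each super-rule name to the rule that now carries it. */
    void link() {
        for (Rule rule : rulesByName.values()) {
            rule.superRules.clear();
            for (String superName : rule.superRuleNames) {
                Rule superRule = rulesByName.get(superName);
                if (superRule == null) {
                    throw new IllegalStateException("Unresolved super-rule: " + superName);
                }
                rule.superRules.add(superRule);
            }
        }
    }

    Rule get(String name) {
        return rulesByName.get(name);
    }
}
```

Since linking happens only after every module has been loaded, a sub-rule ends up pointing at the redefined super-rule rather than at the one that happened to be present when the sub-rule was compiled.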

How to use EMFTVM?


EMFTVM is included as a separate feature in the latest ATL release (3.3). Snapshot releases of EMFTVM can be downloaded as an ATL add-on feature from [1].

Compiling to EMFTVM

To compile your ATL transformations to EMFTVM, add "-- @atlcompiler emftvm" as the first line of your ATL module. An example ATL file can be found at [2].

Running EMFTVM modules

EMFTVM includes a separate launch configuration dialog that looks very much like ATL's launch configuration dialog (see Fig. 2). It can be found via "Run -> Run Configurations...".

Fig. 2: EMFTVM Launch Configuration Dialog

The specific launch parameters for EMFTVM are (see Fig. 3):

  • "Display Timing Data" (default: on) - displays a summary of the execution time after running a transformation.
  • "Disable JIT compiler" (default: off) - disables the use of dynamic Java bytecode generation for a transformation, which can be useful when running into a JIT bug, or when running in an environment that does not allow dynamic Java bytecode loading (e.g. MIDP, Android).
  • "Display Profiling Data" (default: off) - displays the execution time/count statistics for each executed codeblock in the transformation, which is useful for finding performance bottlenecks in your transformation.

Fig. 3: EMFTVM Launch Configuration Parameters Tab


EMFTVM also includes its own Ant tasks:

  • emftvm.loadModel
  • emftvm.loadMetamodel
  • emftvm.saveModel

See [3] for example uses of these Ant tasks. To run an Ant script with EMFTVM tasks, do the following:

  1. Right-click the build.xml file, and select "Run As -> 3 Ant Build...".
  2. Go to the "JRE" tab, and select "Run in the same JRE as the workspace" (see Fig. 4).
  3. Click "Run".

Fig. 4: EMFTVM Ant Task Launch Configuration


The API pattern for using EMFTVM from Java looks like this:

ExecEnv env = EmftvmFactory.eINSTANCE.createExecEnv();
ResourceSet rs = new ResourceSetImpl();

// Load metamodels
Metamodel metaModel = EmftvmFactory.eINSTANCE.createMetamodel();
metaModel.setResource(rs.getResource(URI.createURI(""), true));
env.registerMetaModel("METAMODEL", metaModel);

// Load models
Model inModel = EmftvmFactory.eINSTANCE.createModel();
inModel.setResource(rs.getResource(URI.createURI("input.xmi", true), true));
env.registerInputModel("IN", inModel);

Model inOutModel = EmftvmFactory.eINSTANCE.createModel();
inOutModel.setResource(rs.getResource(URI.createURI("inout.xmi", true), true));
env.registerInOutModel("INOUT", inOutModel);

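Model outModel = EmftvmFactory.eINSTANCE.createModel();
// An output model needs a resource to be saved to; "out.xmi" is a placeholder file name
outModel.setResource(rs.createResource(URI.createFileURI("out.xmi")));
env.registerOutputModel("OUT", outModel);

// Load and run module
ModuleResolver mr = new DefaultModuleResolver("platform:/plugin/", new ResourceSetImpl());
TimingData td = new TimingData();
env.loadModule(mr, "Module");
td.finishLoading();
env.run(td);
td.finish();

// Save models (Collections is java.util.Collections)
inOutModel.getResource().save(Collections.emptyMap());
outModel.getResource().save(Collections.emptyMap());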

Note that it is possible to reuse an ExecEnv instance. While you cannot load other metamodels or transformation modules into an existing ExecEnv after it has already run, you can change the input/inout/output models. The advantage of reusing an ExecEnv is that you save on loading time, and the JIT has already warmed up. Please also note that ExecEnv instances are not thread-safe, and should only be used by a single thread. You should create new ExecEnv instances for use in other threads.

To make ExecEnv reuse easier, the ExecEnvPool utility class is provided. The ExecEnvPool allows you to store and retrieve ExecEnv instances for a fixed set of metamodels and transformation modules. This is how to use it:

 // pool initialisation
 final ExecEnvPool pool = new ExecEnvPool();
 final Metamodel ecoreMetamodel = EmftvmFactory.eINSTANCE.createMetamodel();
 pool.registerMetaModel("ECORE", ecoreMetamodel); // ECORE is built-in, so this is not strictly necessary
 final ModuleResolverFactory mrf = new DefaultModuleResolverFactory("platform:/plugin/");
 pool.setModuleResolverFactory(mrf);
 pool.loadModule("Module"); // same placeholder module name as in the example above
 // use pool
 final ExecEnv env = pool.getExecEnv();
 try {
   final TimingData td = new TimingData();
   final ResourceSet rs = new ResourceSetImpl();
   final Model in = EmftvmFactory.eINSTANCE.createModel();
   in.setResource(rs.getResource(URI.createURI("in.ecore"), true));
   final Model out = EmftvmFactory.eINSTANCE.createModel();
   env.registerInputModel("IN", in);
   env.registerOutputModel("OUT", out);
   td.finishLoading();
   env.run(td);
   td.finish();
 } finally {
   pool.returnExecEnv(env); // hand the instance back for reuse
 }

Standalone use

EMFTVM can also run outside Eclipse. To do this, add the following plug-in jars to your classpath (based on Juno):


Instead of the DefaultModuleResolver, which uses EMF URI prefixes, you can also use the ClassModuleResolver that loads modules using Java class resource streams, like so:

ClassModuleResolver mr = new ClassModuleResolver(MyClass.class);

If you want to use multiple ExecEnv instances in parallel, running the same transformation on different models, you can use an ExecEnvPool instance:

ExecEnvPool pool = new ExecEnvPool();
Metamodel metamodel = EmftvmFactory.eINSTANCE.createMetamodel();
pool.registerMetaModel("ECORE", metamodel);
ModuleResolverFactory mrf = new DefaultModuleResolverFactory(PLUGIN_URI + "/test-data/EcoreCopy/");
pool.setModuleResolverFactory(mrf);
pool.loadModule("EcoreCopy"); // module name assumed from the resolver path above

The created pool can be shared between threads, and is used within one thread or session like this:

ExecEnv env = pool.getExecEnv();

TimingData td = new TimingData();
ResourceSet rs = new ResourceSetImpl();
Model in = EmftvmFactory.eINSTANCE.createModel();
in.setResource(rs.getResource(URI.createPlatformPluginURI(EMFTVM_PLUGIN_ID + "/model/emftvm.ecore", true), true));
Model out = EmftvmFactory.eINSTANCE.createModel();
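env.registerInputModel("IN", in);
env.registerOutputModel("OUT", out);
td.finishLoading();
env.run(td);
td.finish();
pool.returnExecEnv(env); // hand the instance back for reuse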


When using the ClassModuleResolver instead of the DefaultModuleResolver, you can use the SingletonModuleResolverFactory instead of the DefaultModuleResolverFactory (no new ModuleResolver needs to be created for each ExecEnv instance with a ClassModuleResolver):

ClassModuleResolver mr = new ClassModuleResolver(MyClass.class);
ModuleResolverFactory mrf = new SingletonModuleResolverFactory(mr);

In addition to the default EMF models based on generated code, EMFTVM can also work with "POJO models", which are POJOs that are embellished with EMF reflective methods. This is useful for combining JPA and JAXB annotated POJOs with model transformation. To make this work, EMFTVM can work with regular List and Set instances instead of just EList instances for the representation of multi-valued attributes/references. See [4] for more info.


Rule Inheritance

ATL rule inheritance is limited to single inheritance between rules within the same module. This limitation is hardwired in the ATL grammar, and cannot easily be changed. Therefore, the EMFTVM compiler introduces support for a rule inheritance @extends annotation, which supports multiple inheritance between rules situated in any module:

-- @extends SuperRule1, SuperRule2
rule SubRule {

To use this annotation, the regular ATL extends keyword must not be used, because it overrides the @extends annotation.

Detailed information on the rule inheritance semantics can be found in the paper "Towards a general composition semantics for rule-based model transformation."

Module import

Module import is the new module superimposition. Instead of providing a list of superimposed modules in your launch configuration, EMFTVM automatically loads any modules that are mentioned in an ATL uses clause. It does this on the basis of a "module path", which is the EMFTVM equivalent of Java's classpath: modules are looked up within certain base URIs (EMF resources always have a URI). A module path is a comma-separated list of base URIs, to which the module file name can be appended. Module paths are specified in the ATL EMFTVM launch configuration under "Path" (see Fig. 5).

Fig. 5: EMFTVM launch configuration with module path
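The module path lookup described above can be sketched in a few lines of Java. This is a minimal illustration, not the EMFTVM resolver: the class ModulePathSketch is hypothetical, and the ".emftvm" file extension and the handling of missing trailing slashes are assumptions of this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: expands a module path into candidate module URIs. */
final class ModulePathSketch {

    /** Assumed file extension of compiled modules; adjust if your setup differs. */
    static final String EXTENSION = ".emftvm";

    /**
     * Splits a comma-separated module path into base URIs and appends the
     * module file name to each, in the order in which the bases are listed.
     */
    static List<String> candidateUris(String modulePath, String moduleName) {
        List<String> candidates = new ArrayList<>();
        for (String base : modulePath.split(",")) {
            String trimmed = base.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore empty entries such as a trailing comma
            }
            String prefix = trimmed.endsWith("/") ? trimmed : trimmed + "/";
            candidates.add(prefix + moduleName + EXTENSION);
        }
        return candidates;
    }
}
```

A resolver built on this idea would try each candidate in order and load the first one that exists, much like a Java class loader walks the classpath.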

The loading order of modules is specified by the order of the uses clauses of each module; in addition, modules are loaded depth-first. As in module superimposition, rules and helpers in imported modules can be re-defined in the importing module. Detailed information on the module import semantics can be found in the paper "Towards a general composition semantics for rule-based model transformation."
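One way to picture the depth-first loading order is the following Java sketch. It assumes that an imported module is fully loaded before the module that imports it, so that the importer can redefine its rules and helpers; that reading, the class LoadOrderSketch, and the way cycles are skipped are assumptions of this example rather than documented EMFTVM behaviour.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative only: depth-first load order derived from "uses" clauses. */
final class LoadOrderSketch {

    /** Returns the modules reachable from root, imported modules before their importers. */
    static List<String> loadOrder(String root, Map<String, List<String>> uses) {
        Set<String> loaded = new LinkedHashSet<>();
        visit(root, uses, new LinkedHashSet<>(), loaded);
        return new ArrayList<>(loaded);
    }

    private static void visit(String module, Map<String, List<String>> uses,
                              Set<String> inProgress, Set<String> loaded) {
        if (loaded.contains(module) || !inProgress.add(module)) {
            return; // already loaded, or part of an import cycle
        }
        // Follow the uses clauses in the order in which they are written.
        for (String imported : uses.getOrDefault(module, List.of())) {
            visit(imported, uses, inProgress, loaded);
        }
        loaded.add(module);
    }
}
```

For a module A that uses B and C, where B in turn uses D, this sketch yields the order D, B, C, A.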

Multiple dispatch

EMFTVM supports multiple dispatch of ATL helpers. This means you can greatly reduce the number of "oclIsKindOf()" checks in your code, because you can simply write a new helper with the required context/parameter type (not only the context is virtual, but also all parameters). To illustrate how this works, look at the following ATL code:

-- These helpers check for value equality (as opposed to object identity equality).
-- The base case checks meta-type equality, and covers
-- comparison of different element types, including OclUndefined.
helper context OclAny def : sameAs(other : OclAny) : Boolean =
	self.oclType() = other.oclType();

helper context OCL!"OclType" def : sameAs(other : OCL!"OclType") : Boolean =
	super.sameAs(other) and self.name = other.name;

helper context OCL!CollectionType def : sameAs(other : OCL!CollectionType) : Boolean =
	super.sameAs(other) and self.elementType.sameAs(other.elementType);

In regular ATL, you cannot assume the parameter types to be correct: you must make sure you invoke the helpers with the correct parameter types. You certainly cannot expect ATL to invoke a different helper, based on the types of the parameters! In the case of the above code, regular ATL will invoke one of the three "sameAs" helpers based only on the context type (and the number of parameters). That means that it is pointless to change the type of the "other" parameter to "OCL!OclType" in the second "sameAs", because ATL will not use it.

EMFTVM does use the parameter types, however. EMFTVM will only invoke the second "sameAs" if both the context and the "other" parameter are of type "OCL!OclType". Otherwise, it will invoke the first "sameAs". As a comparison, the second "sameAs" would have to look like this in regular ATL:

helper context OCL!"OclType" def : sameAs(other : OclAny) : Boolean =
	if other.oclIsKindOf(OCL!"OclType") then
		super.sameAs(other) and self.name = other.name
	else
		super.sameAs(other)
	endif;
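The difference between dispatching on the context only and dispatching on the context plus all parameters can be sketched in Java. This is a toy dispatcher, not how EMFTVM selects helpers: the class MultiDispatchSketch and its define/invoke methods are hypothetical, helpers take exactly one parameter, and ambiguous definitions are not detected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

/** Illustrative only: picks the most specific helper for the runtime types of self and argument. */
final class MultiDispatchSketch {

    private static final class Helper {
        final Class<?> context;
        final Class<?> param;
        final BiFunction<Object, Object, String> body;

        Helper(Class<?> context, Class<?> param, BiFunction<Object, Object, String> body) {
            this.context = context;
            this.param = param;
            this.body = body;
        }
    }

    private final List<Helper> helpers = new ArrayList<>();

    /** Registers a helper for the given context type and parameter type. */
    void define(Class<?> context, Class<?> param, BiFunction<Object, Object, String> body) {
        helpers.add(new Helper(context, param, body));
    }

    /** Invokes the applicable helper whose context and parameter types are most specific. */
    String invoke(Object self, Object arg) {
        Helper best = null;
        for (Helper h : helpers) {
            boolean applicable = h.context.isInstance(self) && h.param.isInstance(arg);
            boolean moreSpecific = best == null
                    || (best.context.isAssignableFrom(h.context) && best.param.isAssignableFrom(h.param));
            if (applicable && moreSpecific) {
                best = h;
            }
        }
        if (best == null) {
            throw new IllegalArgumentException("No applicable helper");
        }
        return best.body.apply(self, arg);
    }
}
```

With one helper defined for (Object, Object) and one for (Number, Number), a call with two numbers reaches the second helper, while a call with a number and a string falls back to the first, which mirrors the behaviour described for the "sameAs" helpers.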

Lazy collections

EMFTVM provides a lazy implementation of the OCL collection types. That means you can invoke operations on the collections, but those operations will not be executed until you actually evaluate the collection. Also, collection operations will only be evaluated partially, depending on how much of the collection you evaluate. To illustrate how this works, look at the following example code:

query lazytest = (100).run()->collect(x | x.expensive())->last();

helper context Integer def : run() : Sequence(Integer) =
	if self > 0 then
		(self - 1).run()->append(self)
	else
		Sequence{self}
	endif;

helper context Integer def : expensive() : Integer =
	(self * self).debug('expensive');

The above query will first invoke "run" on 100, which creates a Sequence of all numbers from 0 to 100. Then, the query invokes "collect" on that Sequence, and replaces each value in the Sequence by its squared value (the "expensive" operation). Finally, we're only interested in the last value of the changed Sequence.

In regular ATL, everything is evaluated left-to-right, and the whole {0..100} Sequence is converted to a Sequence of squared values before the last value is returned. In EMFTVM, "collect" returns a lazy Sequence, which is just waiting to be evaluated. Only when "last" is invoked on the lazy Sequence will the Sequence invoke the "expensive" operation on the last element of the input Sequence. As a result, "expensive" is only invoked once by EMFTVM.
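The lazy "collect" behaviour can be imitated in Java with a small wrapper that applies the iterator body only when an element is requested. This is a sketch of the idea, not EMFTVM's lazy collection classes: LazyCollectSketch and its methods are hypothetical names.

```java
import java.util.List;
import java.util.function.Function;

/** Illustrative only: a "collect" result that evaluates its body on demand. */
final class LazyCollectSketch<T, R> {

    private final List<T> source;
    private final Function<T, R> body;

    LazyCollectSketch(List<T> source, Function<T, R> body) {
        this.source = source;
        this.body = body;
    }

    /** Applies the body to a single element of the source. */
    R get(int index) {
        return body.apply(source.get(index));
    }

    /** Only the last source element is passed through the body. */
    R last() {
        return get(source.size() - 1);
    }

    /** The size is known without evaluating the body at all. */
    int size() {
        return source.size();
    }
}
```

Wrapping the numbers 0 to 100 with a squaring body and calling last() invokes that body exactly once, for the value 100.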

Of course, the "lazytest" query above is not a very practical example. However, the following OCL pattern probably does ring a bell for you:

mySequence->select(x | x.oclIsKindOf(MM!MyType))->first()

In regular ATL, the "select" filter is evaluated for all elements before "first" is invoked. In EMFTVM, elements are only evaluated up to the first one that passes the filter. In fact, the "select(...)->first()" code was common enough for the OCL committee to come up with the "any(...)" iterator.
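Java streams show the same short-circuiting behaviour for this pattern, which makes for a convenient analogy. The class SelectFirstSketch below is a hypothetical helper written for this example; it is not part of EMFTVM.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

/** Illustrative only: lazy "select(...)->first()" expressed with a Java stream. */
final class SelectFirstSketch {

    /** Tests elements in order and stops at the first one that passes the filter. */
    static <T> Optional<T> selectFirst(List<T> elements, Predicate<T> filter) {
        return elements.stream().filter(filter).findFirst();
    }
}
```

For a list that starts with an integer followed by a string, selecting the first String instance tests only those two elements and leaves the rest of the list untouched.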

On a side note: because EMFTVM performs lazy evaluation of collections, it also performs lazy evaluation of boolean operators (and, or, etc.). Regular ATL does not do this, and requires you to use nested if expressions to optimise your code. It is important to be aware of EMFTVM's lazy evaluation, because it may not invoke all of your code! If you're invoking a lazy rule from inside a lazy collection iterator body, you must evaluate the collection to trigger the lazy rule invocation!

Advanced tracing

In 2009, Andrés Yie did some work on ATL's tracing mechanism, which involved advanced reflection on the traces generated by ATL, as well as storing the traces to an EMF model. EMFTVM takes this work a step further, and provides an extended tracing metamodel (see Fig. 6). This metamodel allows efficient lookup of "default" traces, as well as "unique" traces. The "default" traces are used by ATL's implicit tracing mechanism to translate source values to target values in rule bindings. The "unique" traces are an addition that allows reflective lookup of target values for source values transformed by lazy unique rules or "nodefault" rules (a hidden feature of ATL that switches off implicit tracing for a matched rule). Note that the TraceElement metaclass has one more reference, which is not shown in the diagram: "object : EObject [0..1]". This is the reference to the external model element that is being traced.

Fig. 6: Structure of the EMFTVM trace metamodel

At runtime, the EMFTVM execution environment provides a "traces" field, which contains an instance of the TraceLinkSet metaclass shown above. You can navigate this TraceLinkSet to find the tracing information you need. The following example code is part of the "UML2Copy.atl" transformation module, and reflectively copies all stereotype applications:

lazy rule ApplyStereotypes {
	from s : UML2!"uml::Element" in IN
	using {
		t : UML2!"uml::Element" = s.resolve();
	}
	do {
		for (st in s.getAppliedStereotypes()) {
			for (a in st.getAllAttributes()) {
				if (not a.name.startsWith('base_') and s.hasValue(st, a.name)) {
					t.setValue(st, a.name, s.getValue(st, a.name));
				}
			}
		}
	}
}

endpoint rule ApplyAllStereotypes() {
	do {
		for (element in thisModule.traces.defaultSourceElements
				->select(o|o.oclIsKindOf(UML2!"uml::Element"))) {
			thisModule.ApplyStereotypes(element);
		}
	}
}

The "thisModule.traces" field is used to iterate over all transformed elements (represented by the "defaultSourceElements"). Then, for each source UML2!Element, the "ApplyStereotypes" rule is invoked. The "ApplyStereotypes" rule then just invokes the reflective operations provided by the UML metamodel.

In order to store the trace model to a file for later processing, you can just add an output or in/out model named "trace" to the launch configuration. EMFTVM does not require you to specify the metamodel for a model, because EMF will work this out by itself.

Apart from navigating the trace model yourself, you can use one of these built-in helper operations as well:

helper def : resolveTemp(var : OclAny, target_pattern_name : String) : OclAny
helper def : resolveTemp(var : OclAny, rule_name : String, target_pattern_name : String) : OclAny
helper context OclAny def : resolve() : OclAny
helper context OclAny def : resolve(rule : String) : OclAny
helper context Collection(OclAny) def : resolve() : Sequence(OclAny)
helper context Collection(OclAny) def : resolve(rule : String) : Sequence(OclAny)

The first "resolveTemp" operation is well-known, and resolves a named target element for a given source element, using the default traces. The second "resolveTemp" resolves a "unique" trace, and also requires the name of the rule for which the requested trace is unique.

The first "resolve" operation implements the implicit tracing mechanism: it is invoked on a source element, and returns its default target element, if any. If there is no default target element, it just returns the source element. The second "resolve" operation also takes a rule name as parameter, and uses the unique traces for that rule to resolve a target element. If there is no unique target element for that rule and source element, it returns the source element.

The last two "resolve" operations are defined on Collections, and resolve each element in the collection to a target element. The elements are returned in a Sequence: the order in which the input Collection returns the source elements is maintained in the output.
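The fallback behaviour of the "resolve" operations can be summarised in a small Java model. This is a sketch of the semantics described above, using plain maps instead of the EMFTVM trace metamodel: the class ResolveSketch and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: resolve() semantics modelled with plain maps. */
final class ResolveSketch {

    private final Map<Object, Object> defaultTraces = new HashMap<>();
    private final Map<String, Map<Object, Object>> uniqueTraces = new HashMap<>();

    void addDefaultTrace(Object source, Object target) {
        defaultTraces.put(source, target);
    }

    void addUniqueTrace(String rule, Object source, Object target) {
        uniqueTraces.computeIfAbsent(rule, r -> new HashMap<>()).put(source, target);
    }

    /** Returns the default target of source, or source itself if there is none. */
    Object resolve(Object source) {
        return defaultTraces.getOrDefault(source, source);
    }

    /** Returns the unique target of source for the given rule, or source itself if there is none. */
    Object resolve(Object source, String rule) {
        Map<Object, Object> forRule = uniqueTraces.get(rule);
        return forRule == null ? source : forRule.getOrDefault(source, source);
    }

    /** Resolves every element and keeps the iteration order of the input collection. */
    List<Object> resolveAll(Collection<?> sources) {
        List<Object> result = new ArrayList<>();
        for (Object source : sources) {
            result.add(resolve(source));
        }
        return result;
    }
}
```

Returning the source element when no trace exists is what allows a binding to pass through values that were never transformed, such as primitive values or elements of other models.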

Fun fact:

target_property <- s.source_property

is equivalent to

target_property <:= s.source_property.resolve()

In-place transformation

ATL/EMFTVM supports in-place transformation through refining mode since January 4th, 2013. Refining mode allows one to write an in-place transformation as if it were a copy transformation, where all unmatched elements are copied by default. This is different from the recursive in-place transformation supported by SimpleGT/EMFTVM, as no recursive matching takes place in ATL.

In order to use refining mode, use the refining keyword instead of the from keyword in the create clause:

create OUT : EMFTVM refining IN : EMFTVM;

You can now adapt the "copying" behaviour by adding rules. The following rule adapts the name of all EMFTVM!Rule elements:

rule RuleAppendedName {
  from s : EMFTVM!Rule
  to t : EMFTVM!Rule (
    name <- s.name + 'Appended')
}

In addition to the standard situation, where properties of an existing model element are changed, you can also delete elements. This is done by leaving out the to part. The following rule will match and delete instances of EMFTVM!Delete:

rule Delete {
  from s : EMFTVM!Delete
}

Finally, it is possible to replace an existing element by another element. This feature is new to ATL, and allows you to implement migration transformations. To replace an input element by another element, just specify a different metaclass for the output element. The following rule replaces all instances of EMFTVM!Set with instances of EMFTVM!Add:

rule SetToAdd {
  from s : EMFTVM!"Set"
  to t : EMFTVM!Add (
    fieldname <- s.fieldname)
}

The above rule also takes care of any existing references to "s" in the loaded models. Each reference to "s" is remapped to "t" instead. As a result, everything that used to point to an EMFTVM!Set instance now points to an EMFTVM!Add instance instead. This is in line with the copy transformation metaphor, where "s" resolves to "t" using the implicit tracing mechanism.
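The remapping step can be pictured with a toy object graph in Java. This only illustrates the idea of redirecting references: the class RemapSketch, its Node type, and the flat list of references are assumptions of this example, not the EMFTVM data structures.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: redirects all references from one element to its replacement. */
final class RemapSketch {

    static final class Node {
        final String name;
        final List<Node> references = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    /** Makes every reference to 'from' in the given model point to 'to' instead, and returns 'to'. */
    static Node remap(List<Node> model, Node from, Node to) {
        for (Node node : model) {
            node.references.replaceAll(ref -> ref == from ? to : ref);
        }
        return to;
    }
}
```

After remapping, nothing in the model points to the replaced element any more, so it can be dropped without leaving dangling references.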

Lazy rules

Lazy rules work slightly differently in EMFTVM than in regular ATL. In EMFTVM, lazy rules are not just invoked, but are also matched on the provided input elements. This guarantees that all input elements of a lazy rule are of the correct type, and also comply with any filter expression in the rule's from part.

In addition, lazy rules return "OclUndefined" by default; you have to specify the return value in a do block, as is done with called rules. Lazy rules that do not match against a given input also return "OclUndefined".

Finally, lazy rules generate traces in EMFTVM that can be retrieved via reflection on the trace model. Unique lazy rules generate "unique" traces that can be resolved using the thisModule.resolveTemp(var : OclAny, rule_name : String, target_pattern_name : String) operation.


EMFTVM performance is roughly 80% better than the default ATL EMF-specific VM. EMFTVM has a JIT-compiler that improves performance of complex code blocks. It also allows for reuse of a pre-loaded VM instance (when invoking from Java), which is useful when invoking the same transformation on different models many times over. Finally, it uses an adaptive rule matching algorithm that configures itself against the metamodels and transformation modules used on the first run of the VM. Below is a graph that shows the performance of copying the "emftvm.ecore" model:

Fig. 7: Ecore copy performance

The EcoreUtil.Copier entry is the standard Java implementation for copying Ecore models, and forms the baseline ("it doesn't get faster than this"). On the following lines one can see the evolution in performance of the various ATL VMs.

The source data for this graph can be found at [5].

Additional API

EMFTVM provides some additional API over-and-above the built-in data types and the ATL Standard Library.

  • First of all, EMFTVM is based on OCL 2.2 instead of 2.0 (currently with the exception of the "collectNested" collection operation).
  • Apart from that, all standard Java API is available directly from EMFTVM.
  • Finally, EMFTVM provides extra operations through its built-in library. Below is a list of these operations per context type:

OclAny operations

  • resolve() returns the implicit tracing result of self, or self if no default trace exists.
  • resolve(rule : String) returns the tracing result of self for the given rule, or self if no unique trace exists for that rule. Works on unique rules.
  • remap(to : OclAny) translates all references to self into references to to in all in/out models, and returns to.

OclType operations

  • newInstance() creates and returns a new instance of the given type.
  • refNewInstance(args : Sequence(OclAny)) creates a new instance of the given type with the given arguments, and returns it.
  • refInvokeStaticOperation(opname : String, arguments : Sequence(OclAny)) invokes a static operation on the given type and returns its result.

String operations

  • toDate(format : String) returns the EDate (java.util.Date) representation of self using the given SimpleDateFormat format.
  • toDate(format : String, locale : String) returns the EDate (java.util.Date) representation of self using the given SimpleDateFormat format and locale (of the form "<language>[_<country>[_<variant>]]", e.g. "nl_BE"; see also Locale).
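In terms of the standard Java API, the two "toDate" operations correspond roughly to the following sketch. How EMFTVM actually parses the locale string is not specified here, so the splitting logic and the class ToDateSketch are assumptions of this example.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

/** Illustrative only: parsing a date string with a SimpleDateFormat pattern and a locale string. */
final class ToDateSketch {

    /** Turns "<language>[_<country>[_<variant>]]" into a java.util.Locale. */
    static Locale parseLocale(String locale) {
        String[] parts = locale.split("_", 3);
        String language = parts[0];
        String country = parts.length > 1 ? parts[1] : "";
        String variant = parts.length > 2 ? parts[2] : "";
        return new Locale(language, country, variant);
    }

    /** Parses self with the given pattern and locale; fails if self does not match the pattern. */
    static Date toDate(String self, String format, String locale) {
        try {
            return new SimpleDateFormat(format, parseLocale(locale)).parse(self);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse '" + self + "' as " + format, e);
        }
    }
}
```

The locale mainly matters for patterns that contain textual parts, such as month or day names; purely numeric patterns like "yyyy-MM-dd" parse the same way in most locales.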

Sequence operations

  • includingRange(first : Integer, last : Integer) returns a Sequence including the range first..last. It serves as a replacement for specifying e.g. the OCL expression Sequence{0..5}, which is not possible in ATL's syntax.

Bag operations

  • includingRange(first : Integer, last : Integer) returns a Bag including the range first..last. It serves as a replacement for specifying e.g. the OCL expression Bag{0..5}, which is not possible in ATL's syntax.

Set operations

  • includingRange(first : Integer, last : Integer) returns a Set including the range first..last. It serves as a replacement for specifying e.g. the OCL expression Set{0..5}, which is not possible in ATL's syntax.

OrderedSet operations

  • includingRange(first : Integer, last : Integer) returns an OrderedSet including the range first..last. It serves as a replacement for specifying e.g. the OCL expression OrderedSet{0..5}, which is not possible in ATL's syntax.

Tuple operations

  • toDate() returns the EDate (java.util.Date) representation of self using the tuple values. Supported tuple value names are: timezone, locale, year, month, day_of_month, day_of_week, day_of_week_in_month, day_of_year, era, hour, hour_of_day, minute, second, millisecond, am_pm, week_of_month, week_of_year (see also Calendar).

EDate operations

  • toString(format : String) returns the formatted date string using the given SimpleDateFormat format.
  • toString(format : String, locale : String) returns the formatted date string using the given SimpleDateFormat format and locale (of the form "<language>[_<country>[_<variant>]]", e.g. "nl_BE"; see also Locale).
  • toTuple() returns a Tuple with tuple values representing the date parts of self. Provided tuple value names are: timezone, year, month, day_of_month, day_of_week, day_of_week_in_month, day_of_year, era, hour, hour_of_day, minute, second, millisecond, am_pm, week_of_month, week_of_year (see also Calendar).
  • toTuple(timezone : String) returns a Tuple with tuple values representing the date parts of self using the given timezone ID (see also TimeZone.getTimeZone(String ID)). Provided tuple value names are: timezone, year, month, day_of_month, day_of_week, day_of_week_in_month, day_of_year, era, hour, hour_of_day, minute, second, millisecond, am_pm, week_of_month, week_of_year (see also Calendar).

ATL Module operations

  • resolveTemp(var : OclAny, rule_name : String, target_pattern_name : String) returns the target element named target_pattern_name tracing back to var for the given rule_name. Works on unique rules.

