
Difference between revisions of "JDT Core/Null Analysis/External Annotations"

(Started documenting WIP on using external annotations for null analysis)
 
(Added section "File format")
Revision as of 09:17, 27 January 2015

Note: This page describes unreleased work in progress. See bug 331651 as the umbrella bug for this work area.


Attaching External Annotations to libraries

Annotation-based null analysis is incomplete as long as libraries consumed by a project have no null annotations in their API. To fill this gap, it should be possible to attach "external annotations" to a library after the fact, i.e., without touching and re-packaging the library itself.

Configuration in the IDE

JDT should understand a new classpath attribute called "annotationpath", which - similar to attaching sources and javadoc - attaches a collection of external annotations to a particular classpath entry pointing to a library. Several packagings of such a collection of external annotations have been considered:

directory tree
this packaging is similar to how Java sources are stored in a directory tree, where each directory represents a package and each contained file represents one primary type (class, interface, enum)
Status: implemented
zip
this packaging simply stores a directory tree in one compressed archive file, inside of which the same structure as above is preserved
Status: implemented
big file
if annotations are sparse, instead of many tiny files it might be convenient to combine annotations for many types into one file. This could be one file per library or one file per package
Status: not implemented
combined
this packaging is a combination of "zip" and "directory tree". It is proposed in order to consume existing distributions of external annotations in zip format, while still being able to incrementally add more annotations as required by the consuming project. This packaging is inspired by Eiffel's melting ice technology. For external annotations to melt would mean that annotations for a particular type would be extracted from the zip archive and stored as an editable file in the corresponding directory tree. Changes are then made to this file, which will override the content of the zip archive.
Status: not implemented

Details are tracked in these bugs:

JDT/Core
bug 440477 [null] Infrastructure for feeding external annotations into compilation
JDT/UI
bug 440815 External annotations need UI to be attached to library jars on the classpath

Headless consumption

It should also be possible to consume external annotations when compiling outside the IDE or under the control of a build system like Ant, Maven or Gradle. For this reason the batch compiler must be configurable in situations where no Eclipse project is available as the context for compilation.

one dedicated path
Using the command line option "-annotationpath location" the batch compiler can be configured to read external annotations from one specified location, which requires merging the annotations for all relevant libraries into one package (see the packagings discussed above)
Status: implemented
from classpath
Alternatively, the batch compiler could be configured to scan the classpath used for compilation in order to search for external annotations for any type it is reading from a library. This option may be less efficient than other approaches, but it provides the easiest means for integrating external annotations into existing concepts of dependency management: external annotation packages would be just additional artifacts to put on the classpath for compilation. This option would be enabled by an argument-less switch on the command line, say "-externalAnnotationsFromClasspath"
Status: not implemented

Details are tracked in this bug:

JDT/Core
bug 440687 [compiler][batch][null] improve command line option for external annotations


File format

On the one hand, the file format could remain a private implementation detail of the compiler, if the IDE just provides the necessary operations for manipulating external annotations (e.g., quick assists). This would make the binary .class file format a natural choice for implementation, as it can store all required information and can already be interpreted by the compiler.

On the other hand, annotation files should be amenable to storing, comparing and merging using any version control system. This advocates the use of a textual format.

Additionally, a publicly defined textual file format will allow other tools to operate on annotation files (e.g., conversion from/to other formats, automatic merging etc.), and seasoned users could even directly edit annotation files (although no dedicated editors are planned).

The design is strongly influenced by the need for a low-footprint implementation: files must be reasonably small (speaking against any XML format) and the implementation of file reading must be small and efficient.

The specification of this file format is split into the aspects layout and encoding.

File layout

External annotations for a particular type consist of a type header and entries for individual members.

The basic layout is line-based, i.e., each line represents one element and linebreaks within an element are not allowed. Each line is one of: empty, type header, member name, original signature, or annotated signature.

type header
starts with one of the keywords class, interface or enum followed by the qualified name of the type.
member name
the simple name of a field or method
original signature
directly follows a type header or member name; starts with a single blank followed by the original signature of the preceding element using the encoding discussed below
annotated signature
optionally follows the original signature, and shares the same format, except that it contains the actual information about annotations

It may become relevant to include meta data (version of a library, origin of an annotation etc.). While no format for this has been defined yet, it might be prudent to specify that any of the line formats may contain meta data, which is not to be interpreted by the compiler. To allow for such future extensions, we may define that each line may contain arbitrary trailing content, separated from what has been specified above by any amount of white space (blanks and tabs).

Status: implemented, except for the skipping of meta data.
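The layout rules above can be sketched as a small line classifier. This is only an illustration, not the JDT implementation: the class name LayoutLines, the enum Kind and the method classifyAll are hypothetical, and the sketch assumes that a signature line is told apart from other lines solely by its leading blank, and that "original" versus "annotated" is decided purely by position.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of JDT: classifies the lines of one annotation file.
final class LayoutLines {
    enum Kind { EMPTY, TYPE_HEADER, MEMBER_NAME, ORIGINAL_SIGNATURE, ANNOTATED_SIGNATURE }

    static List<Kind> classifyAll(List<String> lines) {
        List<Kind> kinds = new ArrayList<>();
        Kind previous = Kind.EMPTY;
        for (String line : lines) {
            Kind kind;
            if (line.trim().isEmpty()) {
                kind = Kind.EMPTY;
            } else if (line.charAt(0) == ' ') {
                // Signature lines start with a blank; their role depends on what precedes them.
                if (previous == Kind.TYPE_HEADER || previous == Kind.MEMBER_NAME) {
                    kind = Kind.ORIGINAL_SIGNATURE;
                } else if (previous == Kind.ORIGINAL_SIGNATURE) {
                    kind = Kind.ANNOTATED_SIGNATURE;
                } else {
                    throw new IllegalArgumentException("misplaced signature line: " + line);
                }
            } else if (line.startsWith("class ") || line.startsWith("interface ")
                    || line.startsWith("enum ")) {
                kind = Kind.TYPE_HEADER;
            } else {
                kind = Kind.MEMBER_NAME; // simple name of a field or method
            }
            kinds.add(kind);
            previous = kind;
        }
        return kinds;
    }

    public static void main(String[] args) {
        List<String> lines = java.util.Arrays.asList(
                "interface java/util/Map",
                " <KV>",
                "get",
                " (Ljava/lang/Object;)TV;",
                " (Ljava/lang/Object;)T0V;");
        System.out.println(classifyAll(lines));
    }
}
```

Trailing meta data is not handled here, in line with the implementation status noted above.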

Textual encoding of signatures

The current WIP implementation is based on signatures as defined in JVMS 4.7.9.1. This format is used unaltered for "original signatures"; for "annotated signatures" the following changes are applied to the grammar from JVMS (the additions are each occurrence of Annot and the Annot rule itself):

ClassTypeSignature:
L [Annot] [PackageSpecifier] SimpleClassTypeSignature {ClassTypeSignatureSuffix} ;
TypeVariableSignature:
T [Annot] Identifier ;
ArrayTypeSignature:
[ [Annot] JavaTypeSignature
TypeParameter:
[ @ Annot] Identifier ClassBound {InterfaceBound}
Annot:
0
1

This means that after any of the tokens "L", "T", or "[" an optional "0" or "1" can occur, where "0" represents the nullable annotation and "1" represents the nonnull annotation. The format is therefore independent of the concrete annotations configured for the JDT compiler, and so no conflict arises when external annotations are contributed from different sources.
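Since an annotated signature differs from its original only by these optional digits, the original can be recovered by dropping them. The following is a minimal sketch of that idea, not the compiler's code: the class name AnnotatedSignatures and the method stripNullMarkers are hypothetical, and the sketch deliberately rejects signatures that start with formal type parameters ("<...>"), because the "@" token proposed for that case is not covered here.

```java
// Hypothetical helper, not part of JDT: removes the 0/1 markers from an annotated signature.
final class AnnotatedSignatures {

    static String stripNullMarkers(String annotated) {
        if (annotated.startsWith("<")) {
            throw new IllegalArgumentException("formal type parameters are not handled by this sketch");
        }
        StringBuilder out = new StringBuilder();
        boolean atTypeStart = true; // true at positions where a new type signature may begin
        int i = 0;
        while (i < annotated.length()) {
            char c = annotated.charAt(i++);
            out.append(c);
            if (atTypeStart && (c == 'L' || c == 'T' || c == '[')) {
                // An optional marker may directly follow one of the three tokens.
                if (i < annotated.length()
                        && (annotated.charAt(i) == '0' || annotated.charAt(i) == '1')) {
                    i++; // skip the marker
                }
                atTypeStart = (c == '['); // after '[' the element type starts; after L/T a name follows
            } else if (c == ';' || c == '<' || c == '(' || c == ')') {
                atTypeStart = true;  // a type has ended or a list of types begins
            } else if (c == '>') {
                atTypeStart = false; // back inside the enclosing class type, up to its ';'
            }
            // Base types and wildcard characters at a type start, and all characters
            // inside a name, leave the state unchanged.
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripNullMarkers("(Ljava/lang/Object;)T0V;"));
        System.out.println(stripNullMarkers("[1[0L1java/lang/String;"));
    }
}
```

A reader of annotation files could use such a function to check that an annotated signature line really belongs to the original signature line preceding it.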

Note that in this format, nullness of an array dimension follows the corresponding "[" token, whereas in Java source syntax it precedes that token, e.g., "@NonNull String @NonNull[] @Nullable[]" translates to "[1[0L1java/lang/String;". Curiously, the class file signature better represents the order of annotations from outer to inner.
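The outer-to-inner ordering can be illustrated with a tiny encoder. This is a sketch under assumptions: ArrayNullness, Nullness and encodeArray are hypothetical names, and the leaf type is passed in already encoded.

```java
// Hypothetical helper, not part of JDT: builds an annotated array signature.
final class ArrayNullness {
    enum Nullness {
        NULLABLE("0"), NONNULL("1"), UNSPECIFIED("");

        final String marker; // digit written after the "[" token, or nothing
        Nullness(String marker) { this.marker = marker; }
    }

    /** Dimensions are given from the outermost to the innermost, as in the signature. */
    static String encodeArray(String leafSignature, Nullness... dimensionsOuterToInner) {
        StringBuilder sb = new StringBuilder();
        for (Nullness n : dimensionsOuterToInner) {
            sb.append('[').append(n.marker);
        }
        return sb.append(leafSignature).toString();
    }

    public static void main(String[] args) {
        // @NonNull String @NonNull[] @Nullable[]
        System.out.println(encodeArray("L1java/lang/String;", Nullness.NONNULL, Nullness.NULLABLE));
    }
}
```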

Status: implemented, except for annotations on type parameters (declarations). This part needs an additional token (proposed: @) to disambiguate the grammar, since no prefix token is used like it is in the other cases.


As an alternative, it has been proposed to use an encoding similar to Java source code (but without argument names in method signatures). This format would be somewhat easier to edit manually, but more difficult to match to signatures found in the libraries (jar).

Example

Declaring that "V Map.get(Object key)" may return null (independent of constraints on V) would be written like this:

interface java/util/Map
 <KV>
get
 (Ljava/lang/Object;)TV;
 (Ljava/lang/Object;)T0V;
