JDT Core/Null Analysis/Options

Disclaimer: This page is currently subject to discussion and does not describe the actual implementation in the JDT.


The analysis of possible null pointer exceptions performed by the Eclipse Java Compiler is powerful, but its full power may result in a large number of errors and warnings. Not all projects can afford to address all these issues (at once).

This page discusses the problem from two different directions:

  • On the road towards a fully checked, NPE-free program, what risks need to be addressed before the solution can be considered complete, i.e., fully safe?
  • Should a more lenient analysis be applied to reduce the number of warnings perceived as uninteresting?
    • What are the risks that more relaxed rules would introduce?
    • What are the patterns of coding that (some) users would like to see accepted without a warning? What assumptions need to be maintained to render these patterns safe?

Risks of NPE despite annotation-based null analysis

If annotation-based null analysis applies strict pessimistic rules, only two risks threaten a full guarantee of the absence of NPE:

  • specification of nullness is either incomplete or incompletely checked
  • checked null specifications are bypassed at runtime

Aside from these, an approach that strictly views null annotations as an extension of the type system is fully safe.

Incomplete Specification

If you start with an existing project and merely turn on annotation-based null analysis, you will not notice any difference. This is because the analysis implicitly supports three kinds of reference types:

  • @NonNull types
  • @Nullable types and
  • types with unspecified nullness (legacy types).

So before you actually add annotations to your methods and fields, all types are considered legacy types, and the unsafe semantics of plain Java apply: both assigning null and dereferencing are legal.
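For illustration, here is a minimal sketch using the annotation types from org.eclipse.jdt.annotation (the class and field names are made up):

import org.eclipse.jdt.annotation.NonNull;
import org.eclipse.jdt.annotation.Nullable;

class Kinds {
    @NonNull Object a = new Object(); // must never be null: always safe to dereference
    @Nullable Object b;               // may be null: must be checked before dereferencing
    Object c;                         // legacy type: unspecified nullness, not checked
}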

Adding individual annotations

This path is obvious: add just those annotations where you are certain about the design intent and after each annotation added address the problems reported by the compiler.

Making nonnull the default

Instead of proceeding in tiny steps you may want to make jumps of some size (cf. bug 331647):

  • annotate a method as @NonNullByDefault to affect all its parameters and its return
  • annotate a type as @NonNullByDefault to affect all its methods and fields (sketched below)
  • annotate a package as @NonNullByDefault (using package-info.java) to affect all its types
  • define a global policy that all packages should be @NonNullByDefault
    Unfortunately, a global default cannot be established directly, but only via the indirection of package-level defaults (see bug 366063 for background).
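As a sketch of the package- and type-level variants (the package name com.example.app and the method names are made up; imports from org.eclipse.jdt.annotation are omitted in the second snippet):

// package-info.java: all types in this package get the nonnull default
@org.eclipse.jdt.annotation.NonNullByDefault
package com.example.app;

// X.java: all parameters and return types in X are treated as @NonNull
// unless explicitly marked @Nullable.
@NonNullByDefault
class X {
    Object id(Object o) { return o; }                   // parameter and return: @NonNull
    @Nullable Object find(String key) { return null; }  // explicit exception
}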

A particular problem arises when a type affected by @NonNullByDefault is a subtype of a legacy type with legacy signatures. It is illegal to override a method with legacy parameters by a method with @NonNull parameters. Specifying @Nullable for all parameters in such an overriding method may be too pessimistic, forcing the method body to do more null checks than actually useful.

For this situation a parameter has been added to the default annotation: annotating a type that inherits from a legacy type with @NonNullByDefault(false) cancels the applicable default.
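A sketch of this cancellation (LegacyList is a made-up stand-in for an unannotated supertype; a nonnull default is assumed to be otherwise in effect):

// compiled without any null annotations:
class LegacyList {
    void add(Object o) { }
}

// Overriding add(Object) with a @NonNull parameter would be illegal,
// so the inherited default is cancelled for this type:
@NonNullByDefault(false)
class MyList extends LegacyList {
    @Override
    void add(Object o) {      // parameter keeps its unspecified legacy nullness
        if (o != null) { /* ... */ }
    }
}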

Third-party code

Every project depends on third-party code that it doesn't control, so adding null annotations there is not directly possible.

To address this issue, the Eclipse Java Compiler should support nullity profiles, aka external annotations: separate files that capture the factual null contracts of all API methods and fields contained in a given library.

Support for this feature is planned for version 3.9, see bug 331651.

Avoiding the risk of incompleteness

To ensure that no unchecked legacy types are used in an application, all of the following must hold:

  • @NonNullByDefault is the globally enforced default
  • @NonNullByDefault(false) is never used
  • all libraries come with API-complete nullity profiles

Each of these steps brings a project closer to the safety guarantees of complete analysis.

TODO: Issue a configurable warning when @NonNullByDefault(false) is used.

TODO: Add a note to bug 331651 for checking complete coverage of referenced libraries by available nullity profiles.


Bypassing compile time checks

Any guarantees given by the analysis can still be bypassed at runtime if one of the following is applied:

  • runtime reflection beyond pure introspection, i.e., whenever reflective field access or method invocation is involved.
  • bytecode manipulation, i.e., whenever the bytes being executed by the VM are not identical to the bytes produced at the time when null analysis was performed.

These risks are typically considered outside the scope of linguistic means; developers know about these issues at a general level and don't expect a static analysis to account for them.
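As a minimal sketch of the first risk (all names made up), a reflective write silently bypasses the compile-time guarantee:

import java.lang.reflect.Field;
import org.eclipse.jdt.annotation.NonNull;

class Holder {
    @NonNull String s = "statically checked";
}

class Bypass {
    public static void main(String[] args) throws Exception {
        Holder h = new Holder();
        Field f = Holder.class.getDeclaredField("s");
        f.set(h, null);                   // reflective write: no compile-time check applies
        System.out.println(h.s.length()); // NPE despite the @NonNull declaration
    }
}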

New Projects: Readers interested in using null annotations for new projects may safely stop reading here, because writing code in a style that passes the strict null analysis is much cheaper than working through all the alternatives discussed below.


Lenient analysis

Some warnings/errors flagged by a strict analysis may seem counterintuitive. Some may actually be irrelevant in specific contexts.

Practical investigations showed that many of those "unwanted" warnings/errors relate to field access. When focusing on these, the most obvious approach to distinguishing interesting from uninteresting warnings is to apply some kind of flow analysis for fields (which the pessimistic approach strictly avoids).

Risks of flow analysis for fields

When flow analysis is applied to fields, the following may cause an unexpected NPE:

  • side effects
  • aliasing
  • concurrency

Null analysis for the following kinds of fields is essentially unaffected by these risks:

  • final fields
  • @NonNull fields

For both kinds of fields only object initialization may produce unexpected results (yes, even Java's definite assignment rule can be circumvented, as sketched below). Once initialized, their null status is either known or amenable to flow analysis.
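A well-known sketch of such circumvention (class names made up): a virtual call from the superclass constructor observes a final field before it is assigned.

class Base {
    final Object f;
    Base() {
        printF();              // virtual call: dispatches to Sub.printF()
        this.f = new Object(); // definite assignment is formally satisfied
    }
    void printF() { }
}

class Sub extends Base {
    @Override
    void printF() {
        System.out.println(f.toString()); // NPE: f is still null at this point
    }
}
// new Sub() throws a NullPointerException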

TODO: more precise distinction for final fields: is static relevant? does initialization occur directly as part of the field declaration?

Side effects

The simplest problem with flow analysis for fields results from side effects in methods:

class X {
    @Nullable Y f;  // Y is some class declaring a method bar()
    void test() {
        if (this.f != null) {
            foo();
            this.f.bar();
        }
    }
    void foo() { /* arbitrary code here, e.g., this.f = null; */ }
}

The execution of foo() could potentially assign null to f, thus invalidating the earlier null check.

Aliasing

When applying flow analysis to check the nullness of fields, the following snippet demonstrates how aliasing threatens its validity:

class X {
    @Nullable Y f;
    void test(X other) {
        if (this.f != null) {
            other.f = null;
            this.f.bar(); // potential NPE
        }
    }
    void breakIt() {
        test(this); // definitely triggers the NPE
    }
}

Without further annotations an intra-procedural analysis cannot see that the assignment to other.f affects the value of this.f.

Concurrency

Concurrency, too, can invalidate the null-related guarantees of flow analysis for fields. In a concurrent application, not even this snippet is safe:

   if (this.f != null)
       this.f.bar();

Thus concurrency adds to the problems of side effects and aliasing, because null information from one expression may already be invalid when evaluating the very next expression.
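A minimal sketch of such a race (names made up; toString() stands in for any dereference):

class Race {
    @Nullable Object f = new Object();
    void reader() {
        if (this.f != null) {
            // writer() may run on another thread exactly here
            this.f.toString(); // potential NPE
        }
    }
    void writer() {
        this.f = null;
    }
}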

Drawing a line: what intermediate code to admit?

Given the above risks, we can admit certain styles of code if a certain risk is considered irrelevant in a given context. The discussion will use this basic question:

class X {
    @Nullable Y f;
    void test() {
        if (this.f != null) {
            // what code is allowed here?
            this.f.bar(); // want to be sure there's no NPE here
        }
    }
}

What code can we admit at the designated location? What assumptions must a developer ensure to render that code valid?

Admitting method calls

If the code between the check and the dereference contains any method call, we arrive at the fragment used above to demonstrate side effects:

class X {
    @Nullable Y f;
    void test() {
        if (this.f != null) {
            foo();
            this.f.bar();
        }
    }
    void foo() { /* arbitrary code here */ }
}

It turns out that all three risks mentioned above are effective in this situation.

Admitting such code would require confidently stating the assumption that one's application is free from side effects, aliasing, and concurrency. In typical object-oriented code, however, that assumption is unfounded. Thus admitting arbitrary method calls in this location has neither theoretical nor practical foundation.

Admitting field assignments

class X {
    @Nullable Y f;
    void test(X other) {
        if (this.f != null) {
            other.f = null;
            this.f.bar();
        }
    }
}

When demonstrating the effects of aliasing we discussed that the above fragment is not OK.

Still, similar patterns could be accepted if both of the following hold:

  • no other thread concurrently accesses the field f, and
  • we can statically determine that a field f being assigned is different from the field being dereferenced afterwards.

For the latter assumption we could apply one of these strategies:

  • full alias analysis (not realistic for the JDT)
  • static approximation: ignore that different objects hold different incarnations of the same field, treat any assignment to a field as invalidating the analysis results for all objects of that class.

Using the static approximation, the above fragment would be considered unsafe, because the assignment refers to the same field f, although at runtime different incarnations (of the different receiver instances 'this' and 'other') may be involved. The static approximation will thus flag a few locations that might be OK given more knowledge about the instances 'this' and 'other', but the guarantees given by this approach can only be broken by concurrency.
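For contrast, here is a sketch of a fragment that the static approximation would still accept, because the assignment targets a different field (the second field g is made up; Y as in the aliasing example):

class X {
    @Nullable Y f;
    @Nullable Y g;
    void test(X other) {
        if (this.f != null) {
            other.g = null; // different field: the null info about f stays valid
            this.f.bar();   // accepted; only concurrency could still break this
        }
    }
}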

Drawing a line: what kinds of field references to include?

TODO

...


Summary

TODO: clean-up this section and incorporate findings from team discussions

For non-final @Nullable fields the simplest strategy for avoiding any nullness-related risks is to pessimistically assume potential null at every read. This means that for strict checking no flow analysis should be applied to @Nullable fields.

While this appears to be a very drastic restriction, the remedy is quite simple: before dereferencing a @Nullable field, assign it to a local variable. Flow analysis can then safely be applied to the local variable, with no risk from side effects, aliasing, or concurrency, since local variables are not shared with any code locations outside the scope of the analysis. I.e., the flow analysis can see everything it needs to consider regarding local variables.
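A sketch of this remedy applied to the running example (Y as above):

class X {
    @Nullable Y f;
    void test() {
        final Y local = this.f; // read the field exactly once
        if (local != null) {
            foo();              // cannot affect local
            local.bar();        // safe: local is unaffected by side effects,
                                // aliasing, and other threads
        }
    }
    void foo() { /* arbitrary code here */ }
}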

Any flow analysis performs much better for local variables than for fields. Shifting all serious computation to local variables makes for much safer code. Whether or not a project can afford to fully enforce this strategy depends on many factors.

This lets me conclude that these questions deserve further investigation:

  • How can gradual migration to a fully safe coding style based entirely on local variables be supported? I.e., what intermediate levels are useful, what configuration options are needed, and what warning messages best convey the vagueness/certainty of each issue?
  • How can the above options be communicated to users? Is it OK to offer options like "perform flow analysis for fields", or should options be labeled as "consider aliasing", or "pessimistically consider concurrency for null analysis"?
  • How can initialization be made safer? How relevant are these issues in practice? Which annotation / set of annotations is the best buy? Is support for several styles required?
