The Java Memory Model is a predicate on execution traces1 of
multi-threaded Java programs. It gives meaning to programs using
Java’s concurrency facilities, and (hence) constraints transformations
that may be performed on the program by the program’s execution
environment. The execution environment (a non-standard term I just
made up) typically includes
javac and the JVM the generated
bytecodes execute on, but in theory could involve other tools
depending on the toolchain. For example, if you post-process the
.class files using some bytecode level tool, that tool
needs to understand and respect the JMM if it wants to make semantic
My interest in the Java Memory Model stems from working on JIT compilers for a JVM2. JIT compilers too are a part of the execution environment as defined above, and can optimize programs only in ways allowed by the JMM specification. People much smarter than me have come up with ways to safely (but conservatively) compile standard Java concurrency primitives3, but it is a nice exercise to try to prove legality (or illegality, in case of this post) of certain transforms from scratch. This blog post tries to show how some re-orderings (that may be done in as part of optimizing a Java program for performance) that intuitively seem illegal are, in fact, illegal, by deriving that illegality from directly the JMM spec. I’ve assumed passing familiarity with the Java Memory Model, but I’ve also tried to give references to the actual memory model specification wherever appropriate.
A Correctly Synchronized Example
Assume we’ve declared a class
Now consider the following snippet in Java (we’ll call this the pre-transform snippet):
Can the above be transformed to the following (that we will call the post-transform snippet)?
Giving a precise answer involves examining the complete Java program
– depending on the context in which this snippet executes, the
transformed snippet may or may not be equivalent to the original. For
example, if there are no writes to
volatileF (and so
0 and so is
regNV) in the Java program then the transform
is trivially safe. However, it is more interesting to be able to make
local judgments when JIT compiling a single method. Specifically,
we’d like to know if a certain transformation is legal (or not) in
all contexts because if it is, we can make the transformation
without probing the rest of the environment at all4. It
follows that we can prove a transform illegal (in this restricted
sense of illegal) by giving a context that acts as a counterexample
– the existence of one context in which the pre-transform Java
snippet and the post-transform Java snippet behave in incompatible
ways is enough to make the transform unsafe to do locally, without
If you’re familiar with the Java Memory Model, you’d probably guess that the transform above is not legal. We’ll show this by choosing a global context in which the pre-transform snippet guarantees something that the post-transform snippet doesn’t.
Let’s say that there is some other thread running
writer, making the
writer are being called on the same
that was allocated at program startup.
It is easy to show that all sequentially consistent execution traces
of this program are race free, and hence the program as such is
correctly synchronized (all these terms have very specific meanings,
see 17.4.5 5). Correctly synchronized programs behave as if
they were executing on a sequentially consistent runtime6, so
every execution of this program will look like an interleaving of
actions performed by each of the individual threads (including
non-volatile reads and writes). Given that, we can show that
regNV == 5.
Now let us put the post-transform Java snippet in the same context.
Note that this is no longer correctly synchronized and hence the
above reasoning doesn’t follow directly. To show that the assertion
regV == 1 implies
regNV == 5” no longer holds, we will
demonstrate a legal execution that results in
regV == 1 and
Let be an execution (see 17.4.65) with obvious ,
, , ; =
int regV = tuple.volatileF for some 7;
0; and computed and
(their values are functions of the other parameters of the
execution). Something to note about : “(17.4.45) The
write of the default value (zero, false, or null) to each variable
synchronizes-with the first action in every thread.”
We now need a set of committing actions (17.4.85) culminating in the execution . Fortunately this is simpler than it sounds, let:
is (by definition)
includes everything up to the point the two threads started. I’m hand-waving here – in general you won’t be able to compress the entire sequence of actions leading up to the thread launch state into one commit. But I don’t think that restriction affects this proof.
tuple.nonVolatileF = 5,
tuple.volatileF = 1
int cache = tuple.nonVolatileF,
int regV = tuple.volatileF
is all thread finalization logic (the conditional assignment to the local variable
regNV, calls to
join(), the exit sequence deployed by
mainetc.). too is hand-wavy, and in the same sense as .
From this we create with such that . We now need to show that the reads in can see the write thinks they should. The specification allows “freshly committed reads” (i.e. reads in that aren’t in ) to see writes in the execution they’re introduced in () that are different from what they see in the final execution but we won’t need to exploit that allowance here8.
From JMM section 17.4.85: “All reads in that are not in
must see writes that happen-before them. Each read
in must see writes in in both
and , but may see a different write in from the one it
sees in .” and section 17.4.55: “The default
initialization of any object happens-before any other actions (other
than default-writes) of a program.” give us that it is legal for
cache = tuple.nonVolatileF to see the initial write to
of its default value (
0) that happens-before it (and everything
int regV = tuple.volatileF to see the write
tuple.volatileF = 1 that also happens-before it (since
tuple.volatileF = 1 is before
int regV = tuple.volatileF in the
synchronization order, and hence they have a happens-before
relationship via the synchronizes with relationship). Since none of
these judgments break happens-before consistency, we’ve thus
justified , and hence proved that
regV == 1 and
regNV == 0 is
a legal result.
Find a series of committing actions that end with
5 in this second example.
A Racy Example
The first example had an interesting property – the pre-transform program was correctly synchronized while the post-transform program wasn’t. Let’s try an example where the pre-transform program isn’t correctly synchronized either. Is it legal to transform
to the following?
Like in the previous case, the real, precise answer depends on the global context in which the snippets are being evaluated for equivalence. But we can still try to answer the question in terms of the stronger notion of “equivalent in every context”.
We build a set-up similar to what we used last time, and also use the
same guarantee (or invariant) – that
regV == 1 implies
regNV == 5
to differentiate the pre-transform and the post-transform programs:
Observe that there is a race here9 so the earlier logic that exploited “as-if” sequentially consistency semantics will not directly work. However, happens-before consistency is enough to give us what we need.
“(17.4.55) A set of actions is happens-before consistent if for all reads in , where is the write action seen by , it is not the case that either or that there exists a write in such that and and .”
In plain English, this says the following two things:
- A read cannot see a write that it happened before
- A write overrides other writes to the same memory location that happened before it (in terms of the happens before relation) – a read cannot observe a write that happened before a write that happened before it (i.e. is two or more steps away on the happens before relation).
We can use (2) to prove the required guarantee – given that
1, we know (since volatile loads and stores have a total order) that
tuple.volatileF = 1 happens before
int regV = tuple.volatileF.
This means (by program order and transitivity)
5 happens before
int regNV = tuple.nonVolatileF. The only other
nonVolatileF in the system, the write of the default
0 in this case), happens before
tuple.nonVolatileF = 5 so by (2) above,
int regNV =
tuple.nonVolatileF cannot see it. In the language of the JMM, we
int regNV = tuple.nonVolatileF as , the initial write of
tuple.nonVolatileF as and
tuple.nonVolatileF = 5
as . By 17.4.55, cannot observe .
The post-transform snippet in the same context looks like
We’ll again try to show the legality of an execution that results in
regNV == 0 with
regV == 1. Reasoning about this is very similar
to reasoning about the previous case, and we can use the same commit
sequence (modulo changing in obvious ways,
since we don’t have a local variable involved here) to show legality
of an execution that ends with
regNV == 0 and
regV == 1.
Using similar techniques, show that you cannot optimize (the proof will have the same structure as the second example):
Conclusion and Future Posts
There are several aspects of the Java Memory Model specification that I’m confused about:
What does the JSR-133 Cookbook do to prevent values from appearing of “thin-air”? Does the cookbook by itself prevent problematic causality loops?
There are several interesting and well-known categories of optimizations that are allegedly prevented by the memory model. What are those optimizations, what is the impact of this restriction? Is there a clean subset of compiler optimizations that are allowed? This is more of a reading project – several papers have been published on this topic.
It will be interesting to see how my understanding of the JMM evolves over time. I think I’ve sort of started to see the big picture, but there is still a long way to go.
An execution trace is a tuple consisting of sets and relations on those sets. See section 17.4.6 in the “Java Language Specification” for a precise definition. ↩
Unfortunately, unlike Java the programming language, Java bytecodes don’t have a defined memory model. In practice this isn’t a problem since the bytecodes closely follow constructs present in the Java programming language, and hence have a default “inherited” memory model. ↩
In general it isn’t even possible to make global statements about a program that stay correct throughout the program lifetime because Java bytecodes are allowed to load arbitrary Java classes at run-time. So even if some forms of such global analysis may be feasible from a computational perspective, predicates computed by such global analyses will not stay correct throughout the lifetime of the program. There is a standard solution for such issues though: you register dependencies on bits of compiled code that records the falsifiable assumptions made during the compilation and when any of these assumptions are violated, the associated blobs of code are thrown away. Of course, the more aggressive you are in making falsifiable assumptions the more frequently you’ll have to throw away compiled code, so there is a natural trade-off here. ↩
Interestingly, this isn’t an axiom (even though we state as such). Any Java runtime providing happens before consistency and well-ordered executions automatically provides the “if correctly synchronized then sequentially consistent” invariant. For a proof see “The Java Memory Model”, POPL ‘05. ↩
This notation is just a fancy way of saying
tuple.volatileF = 1is before
int regV = tuple.volatileFin and I don’t care about the remaining part of the relation. Also, there is something worth noting here – volatile reads and writes have a total order to them, and given that
1, there can be only one way
tuple.volatileF = 1relates to
int regV = tuple.volatileFin . ↩
If you pick a sequentially consistent execution where
int regV = tuple.volatileFis before
tuple.volatileF = 1in the synchronization order, then the pair of conflicting access (17.4.55)
tuple.nonVolatileF = 5and
int regNV = tuple.nonVolatileFare not ordered in happens-before. ↩