Notes on some interesting (for me) types of nodes of the Ideal Graph.

In this post I will write down my notes on the behavior, inputs and outputs of some nodes of the Ideal Graph (IG). This is not an introduction to the IG. I’m assuming you have already some knowledge on the subject and have read some of the related documentation. If you haven’t, I strongly recommend to read at least the paper by Cliff A Simple Graph-Based Intermediate Representation.

A Sea of Nodes

There are about 380 types of nodes currently declared in the C2’s code. Nodes takes input, produce output and do… stuff.  The C2’s machine independent optimizations are performed creating, deleting and moving these nodes around. If you want to understand these optimizations, you really need to understand the behavior of the nodes.

Fortunately, many of these nodes perform similar functions, so is possible to group them into families. Nodes  like AddI and AddL performs the exact same function on different types (integer and longs, respectively). Figure 1 represents a class inheritance diagram of some of these nodes (actually there is no Math node, but it helps the mental map). Not all families of nodes are in Figure 1, a full class hierarchy of the 380 or so types or nodes can be found here (is BIG).


Nodes controlling the execution of the graph

The IG is an executable representation of a Java program.  This section describes the nodes controlling the execution of the IG.

Region Nodes: The IG has an special type of node called Region which is used to define control flow basic blocks. If any node A has an edge coming from a Region node B, it means that A belongs to the basic block defined by B. Edges coming from Region nodes are called control edges. A Region node C that has a control edge to another Region D would mean that C should be executed after D.

Math Nodes: Not all nodes must be connected to a region, in that case, the order in which the operations are executed is defined by their value dependencies. For example in the expression H(b) + F(c), the Add node representing the addition must be executed after the return of both functions F and H. Not linking an operation directly to a region allows some cool optimizations like Global Code Motion.

Phi Nodes: The IG is in Static Single Assignment form. Therefore the IG has Phi Nodes merging the values of different Regions.

Projection Nodes: A node can output a tuple. However, nodes cannot take tuples as inputs. Also, edges cannot contain information or weights. The solution is that nodes producing tuples of size N are connected to N Projection nodes. Projection nodes are the only ones that can take a tuple as an input and they separates the outputs in N atomic outputs. For example a node A producing a tuple {1, 2} will be connected to Proj1 and Proj2 which will output {1} and {2} respectively.

The Model of Execution of the IG can be found in Section 3 of the Cliff paper.

Nodes related to Memory and IO operations

Store Nodes: The memory and the IO are represented as an state. Don’t confuse the state of the memory with its address or its value. Just like variables values are represented in SSA form with different variables, the memory is represented in the IG with different states. The state represents the WHOLE memory. Every time the memory is written a new state is produced. The Store node represent the operation of writing something to memory. It takes as input the address of the memory, its current state and a value to store and produces a new memory state (remember the state is a kind of variable name in SSA form). Cliff calls the global state of the memory also store but I changed this to state because is awful confusing for me.

Load Nodes: Load Nodes represent the action of reading from memory. They take as input an state of the memory, the address and returns the value of it.

Pokemon Nodes

In this section I will list those nodes for which I did not knew their behavior the first time I found them.

RageCheckNode: Checks that an arrays access is within range.  Inheritor of IfNode. Pass control signal to an IfTrue node when the array is whithing range otherwise projects to an exceptional behavior.

CmpU: Compares two unsigned integers and produces a conditional code. A condition code can take three values -1 (less than), 0 (equal) or 1 (greater that)

BoolNode: convert a Condition Codes to a Logical result. Boolean can only be false or true, while a condition code can take three values -1 (less than), 0 (equal) or 1 (greater that)

MultiNode:  Base class for all nodes producing a tuple as an output.

DivModNode: Is a Multinode. Performs a division with quotient. One output is the division, the other one the modulo.

CallLeafNode: Is a Multinode. Make a direct subroutine call node into compiled C++ code, without safepoints. Outputs the result of the call and the memory status.

MergeMemNode: This node works as a sort of Phi node but for memory states. Read more here.

Related documentation


Leave a Reply

Your email address will not be published. Required fields are marked *