Skip to content

Latest commit

 

History

History

compiler

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Eclipse OMR Compiler Technology

The Eclipse OMR compiler technology is a collection of stable, high performance optimization and code generation technologies suitable for integration into dynamic and static language runtime environments.

It is not a standalone compiler that can be linked into another environment. Rather, it provides all the essential building blocks for integrating and adapting this advanced compiler technology for different language environments.

It originates from a mature, feature-rich compiler technology developed within IBM called Testarossa. This technology was developed from the outset to be used in highly dynamic environments such as Java, but has proven its adaptability in static compilers, trace-based compilation, and binary re-translators.

This technology is consumed as-is in some of IBM's high-performance runtimes such has the J9 Java virtual machine and used in production environments.

What has been contributed?

  • high-level optimization technology featuring classic compiler optimizations, loop optimizations, control and data flow analyses, and support data structures
  • code generation technology with deep platform exploitation for x86 (i386 and x86-64), Power, System Z, and ARM (32-bit)
  • a robust, tree-based intermediate representation (or IL) and support code for producing IL from different method representations
  • expressive tracing and logging infrastructure for problem determination
  • JitBuilder technology to simplify the effort to integrate a JIT compiler into an existing language interpreter
  • a framework for constructing language-agnostic unit tests for compiler technology

What's next?

While this is a significant initial contribution of compiler technology, more is on the way, including:

  • integration of conformance tests with the Eclipse OMR makefiles, also demonstrating a sample hookup of the technology
  • lots more documentation, in source and in the doc/compiler directory describing the compiler technology architecture and its inner workings
  • further code refactoring and design consistency enhancements
  • additional optimizations and code generation technology upon refactoring from within the IBM Testarossa code

A Tour of the Source Code

The compiler technology is written largely in C++, but there is a handful of support functions written in C and assembler.

The structure of the codebase and the design of the class hierarchy reflects this technology's heritage and the requirement to adapt to a wide variety of compilation environments (or projects as they are often referred).

Many of the classes in Testarossa use a design pattern we call extensible classes. This is a pattern to achieve extension through composition and static polymorphism. Extensible classes are an efficient and useful means to extend and specialize the core technology provided by Eclipse OMR for a particular project and for a particular processor architecture. The extensible design is the reason for the shape of the class hierarchy, the layout of directories, and file naming.

The core compiler components are provided under the compiler/ top-level directory and are organized as follows:

Directory Purpose
codegen/ Code for transforming IL trees into machine instructions. This includes pseudo-instruction generation with virtual registers, local register assignment, binary encoding, and relocation processing.
compile/ Logic managing the compilation of a method.
control/ Generic logic to decide on when and how to compile a method.
cs2/ A legacy collection of utilities providing functionality such as container classes and lexical timers. The functions within this directory are deprecated and are actively being replaced with C++ STL equivalents or new implementations based on STL.
env/ Generic interface to the environment that is requesting the compilation. In most cases this is the interface to the VM or compiler frontend that is incorporating the Eclipse OMR compiler technology. For example, it can be used to answer questions about the VM configuration, object model, classes, floating point semantics, etc.
il/ Intermediate language definition and utilities.
ilgen/ Utilities to help with the generation of intermediate language from some external representation.
infra/ Support infrastructure.
optimizer/ High-level, IL tree-based optimizations and utilities.
ras/ Debug and servicability utilities, including tracing and logging.
runtime/ Post-compilation services available to compiled code at runtime.
arm/ ARM processor specializations
x/ X86 processor specializations
p/ Power processor specializations
z/ System Z processor specializations

Other resources can be found in the Eclipse OMR project as follows:

Directory Purpose
jitbuilder/ JitBuilder technology extending Eclipse OMR technology
fvtest/compilertest Unit tests for compiler technology
doc/compiler Additional documentation

Namespaces

The OMR::, TestCompiler::, and JitBuilder:: namespaces are used to isolate compiler technology for those particular environments. Processor architecture specialized namespaces (e.g., X86::, Power::, Z::, and ARM::) can be nested within them. If you extend the Eclipse OMR technology you should choose a unique namespace for your project.

In general, the TR:: namespace (short for Testarossa) is the canonical namespace for all of the compiler technology that is visible across multiple projects.

You may encounter references that are in the global namespace but whose identifiers are preceeded simply with TR_. This is inconsistent with namespace convention just described and they are being moved to the TR:: namespace as refactoring continues.

XXX_PROJECT_SPECIFIC macros

Throughout the codebase you may find code guarded with #ifdef XXX_PROJECT_SPECIFIC directives. Project-specific macros are an artifact of the refactoring process that produced this code. The initial code contribution originated from a much larger codebase of compiler technology used in several compiler products and compilation scenarios. In order to contribute this code sooner we eliminated most, but not all, of code that has a tighter coupling to a particular environment. For example, guarded code may require data structures or header files not present in the initial contribution.

Generally there should not be a need to enable these macros. Our expecatation is that either that guarded code will be enabled over time as it is made more general purpose for other language environments, or it will be removed outright as that code is refactored as part of the Eclipse OMR project.

We recognize the presence of these macros is far from ideal, and IBM will be working to eliminate them over time so that the codebase is self-contained and fully testable. These macros should not be imitated in newer code commits except where absolutely necessary.

Build Info

The compiler technology has been built successfully with the following compilers:

OS Architecture Build Compiler Version
Linux x86 g++ 4.4.7
Linux s390x g++ 4.4.7
Linux ppc64le XLC 12.1
Linux ppc64le g++ 4.4.7
AIX ppc64 XLC 12.1
z/OS s390x XLC v2r2

Older compilers may not support all the C++11 features required, and while we endeavour to build with newer compilers, incompatibilities have occurred previously. Issues are welcome where forward compatibility is broken.

We are completely aware not everyone will have access to all these build compilers, so we will help with any compatability fixes required to make a pull request mergeable. To minimize issues, see the table below containing a summary of our ability to support C++ language features.

C++11 Core Language Features Supported
Variadic templates Some
static_assert Yes
auto Yes
decltype Yes
Delegating constructors No
Inheriting constructors No
Extended friend declarations No
Range-based for-loop No