This is a brief tutorial overview of JIT compilation for the SPARC instruction set.
The instruction set is generally pretty straightforward. I assume you can handle most of it. A couple of weirdnesses to get out of the way:
Register zero, known as %r0 or %g0, always reads as zero; writing to it has no effect.
No single instruction can set a register to a large constant value directly. Two choices: Use sethi to set the high-order 22 bits, and add to set the rest; or use a global register to point to a data block containing constants, and use ld instructions with offsets from that register to grab the value. The former is faster, the latter is smaller.
There are delay slots after branches: the instruction that appears after the branch in the instruction stream executes before the branch is taken.
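To make the first option for large constants concrete, here is a minimal sketch in C of how a code generator might split a 32-bit value and build the two instruction words. The helper names (hi22, lo10, enc_sethi, enc_or_imm, emit_load_const) are made up for illustration; the bit layouts are the SPARC V8 format 2 and format 3 encodings. The sketch uses or for the low part; add gives the same result here, because sethi clears the low 10 bits.

```c
#include <assert.h>
#include <stdint.h>

/* High-order 22 bits of a constant: what the assembler writes as %hi(x). */
static uint32_t hi22(uint32_t x) { return x >> 10; }

/* Low-order 10 bits of a constant: what the assembler writes as %lo(x). */
static uint32_t lo10(uint32_t x) { return x & 0x3ffu; }

/* sethi imm22, rd  (format 2: op=00, op2=100). */
static uint32_t enc_sethi(uint32_t imm22, unsigned rd)
{
    assert(rd < 32 && imm22 < (1u << 22));
    return ((uint32_t)rd << 25) | (4u << 22) | imm22;
}

/* or rs1, simm13, rd  (format 3: op=10, op3=0x02, i=1). */
static uint32_t enc_or_imm(unsigned rs1, int32_t simm13, unsigned rd)
{
    assert(rs1 < 32 && rd < 32 && simm13 >= -4096 && simm13 <= 4095);
    return (2u << 30) | ((uint32_t)rd << 25) | (0x02u << 19) |
           ((uint32_t)rs1 << 14) | (1u << 13) |
           ((uint32_t)simm13 & 0x1fffu);
}

/* Write the two-instruction sequence that loads `value` into register rd.
   Returns the number of words written (always 2 in this sketch). */
static int emit_load_const(uint32_t *code, uint32_t value, unsigned rd)
{
    code[0] = enc_sethi(hi22(value), rd);                /* sethi %hi(value), rd   */
    code[1] = enc_or_imm(rd, (int32_t)lo10(value), rd);  /* or rd, %lo(value), rd  */
    return 2;
}
```

A useful sanity check is that enc_sethi(0, 0) yields 0x01000000, which is the word the assembler emits for nop.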
Beware: the instruction set and the assembly language are surprisingly different in this architecture. There are lots of pseudo-instructions in the assembler and the register convention is confusing!
We continue our discussion of bad engineering. Today's example: register windows.
The idea is to provide hardware support for register allocation in the compiler. Each subroutine call pops up 16 new registers; each subroutine return restores the previous. 8 registers are global, 8 are shared between caller and callee.
8 global registers, %r0-%r7, also named %g0-%g7. %g0 is always zero. %g1 can always be used as a temporary. %g2-%g4 can be used as true global registers if you can figure out how to get the compiler either to set them up or leave them alone for you.
8 out registers, %r8-%r15, also named %o0-%o7. These are outgoing parameters, except that %o6 (%sp) is the stack pointer and %o7 is the saved program counter.
8 local registers, %r16-%r23, also named %l0-%l7. Used for automatics.
8 in registers, %r24-%r31, also named %i0-%i7. These are incoming parameters, except that %i0 is also the return value, %i6 (%fp) is the frame pointer (the saved stack pointer), and %i7 is the return pc, minus 8.
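The naming scheme and the overlap between windows can be sketched in C as follows. This is a toy model, not a description of any particular chip: the helper names, the choice of 8 windows, and the flat layout of the register file are all assumptions made for illustration, and window overflow and underflow are ignored. It relies on the fact that moving to a new window decrements the current window pointer.

```c
#include <assert.h>

/* Map the assembler's names onto the architectural numbers %r0..%r31. */
static unsigned reg_g(unsigned n) { assert(n < 8); return n; }       /* %g0-%g7 */
static unsigned reg_o(unsigned n) { assert(n < 8); return 8 + n; }   /* %o0-%o7 */
static unsigned reg_l(unsigned n) { assert(n < 8); return 16 + n; }  /* %l0-%l7 */
static unsigned reg_i(unsigned n) { assert(n < 8); return 24 + n; }  /* %i0-%i7 */

enum { NWINDOWS = 8 };  /* assumed; the real count depends on the implementation */

/* Slot of register r (8..31) of window cwp in a toy register file of
   NWINDOWS * 16 slots.  Each window owns 8 outs and 8 locals; its ins are
   the outs of window cwp + 1. */
static unsigned phys_index(unsigned cwp, unsigned r)
{
    assert(cwp < NWINDOWS && r >= 8 && r < 32);
    if (r < 24)
        return cwp * 16 + (r - 8);                   /* outs, then locals */
    return ((cwp + 1) % NWINDOWS) * 16 + (r - 24);   /* ins */
}

/* Window selected after the window moves down for a new procedure. */
static unsigned next_window(unsigned cwp)
{
    return (cwp + NWINDOWS - 1) % NWINDOWS;
}
```

In this model, phys_index(next_window(w), reg_i(k)) equals phys_index(w, reg_o(k)) for every k, which is the sense in which 8 registers are shared between caller and callee.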
Subroutine call is either a call instruction with an immediate or a jump-and-link using %o7 as the link register (saved pc). This just does the linkage; it doesn't move the register windows.
Subroutine return is a jump-and-link using %i7+8 as the return address and %g0 as the link register, so the link value is discarded. The +8 skips the call instruction and its delay slot. Again, this doesn't move the windows.
The SPARC assembler has synthetic instructions call and ret that hide the details.
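A JIT has no assembler to hide these details, so it has to build the words itself. The following C sketch shows one way to encode call and jump-and-link, assuming 32-bit SPARC V8 addresses; the function and constant names are hypothetical. The synthetic ret corresponds to jmpl %i7+8, %g0.

```c
#include <assert.h>
#include <stdint.h>

enum { REG_G0 = 0, REG_O7 = 15, REG_I7 = 31 };

/* call target  (format 1: op=01, 30-bit word displacement relative to the
   call itself).  pc is the address the call instruction will occupy. */
static uint32_t enc_call(uint32_t pc, uint32_t target)
{
    assert((pc & 3u) == 0 && (target & 3u) == 0);
    return (1u << 30) | (((target - pc) >> 2) & 0x3fffffffu);
}

/* jmpl rs1 + simm13, rd  (format 3: op=10, op3=0x38, i=1). */
static uint32_t enc_jmpl_imm(unsigned rs1, int32_t simm13, unsigned rd)
{
    assert(rs1 < 32 && rd < 32 && simm13 >= -4096 && simm13 <= 4095);
    return (2u << 30) | ((uint32_t)rd << 25) | (0x38u << 19) |
           ((uint32_t)rs1 << 14) | (1u << 13) |
           ((uint32_t)simm13 & 0x1fffu);
}

/* Indirect call through a register: jmpl reg+0, %o7. */
static uint32_t enc_call_reg(unsigned reg)
{
    return enc_jmpl_imm(reg, 0, REG_O7);
}

/* What the synthetic ret stands for: jmpl %i7+8, %g0. */
static uint32_t enc_ret(void)
{
    return enc_jmpl_imm(REG_I7, 8, REG_G0);
}
```

Both call and jmpl have a delay slot, so the generator must always emit one more instruction (a nop if nothing useful fits) after either of them.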
The save instruction actually pushes down a set of registers, moving the caller's out registers into the current procedure's in registers and exposing a new set of local and out registers. It's actually a funny form of add instruction, and it's usually used like this:
sethi %hi(-100), %g1
add %g1, %lo(-100), %g1
save %sp, %g1, %sp
The restore instruction inverts a save. It is also a form of add, but because of the automatic relationship between %sp and %fp, the magic form is never needed; we just say
restore %g0, %g0, %g0
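For a code generator, save and restore are ordinary format 3 instructions. The C sketch below encodes both the register form shown above and an immediate form, which avoids the sethi/add pair whenever the frame size fits in a signed 13-bit field. The helper names are hypothetical, and the frame size is whatever the caller of the helper chooses.

```c
#include <assert.h>
#include <stdint.h>

enum { REG_G0 = 0, REG_G1 = 1, REG_SP = 14 };   /* %sp is %o6, i.e. %r14 */
enum { OP3_SAVE = 0x3c, OP3_RESTORE = 0x3d };

/* Format 3 with an immediate second operand: op rs1, simm13, rd. */
static uint32_t enc_fmt3_imm(unsigned op3, unsigned rs1, int32_t simm13,
                             unsigned rd)
{
    assert(op3 < 64 && rs1 < 32 && rd < 32);
    assert(simm13 >= -4096 && simm13 <= 4095);
    return (2u << 30) | ((uint32_t)rd << 25) | ((uint32_t)op3 << 19) |
           ((uint32_t)rs1 << 14) | (1u << 13) |
           ((uint32_t)simm13 & 0x1fffu);
}

/* Format 3 with a register second operand: op rs1, rs2, rd. */
static uint32_t enc_fmt3_reg(unsigned op3, unsigned rs1, unsigned rs2,
                             unsigned rd)
{
    assert(op3 < 64 && rs1 < 32 && rs2 < 32 && rd < 32);
    return (2u << 30) | ((uint32_t)rd << 25) | ((uint32_t)op3 << 19) |
           ((uint32_t)rs1 << 14) | rs2;
}

/* save %sp, -frame_bytes, %sp */
static uint32_t enc_save_frame(int32_t frame_bytes)
{
    assert(frame_bytes > 0 && frame_bytes <= 4096);
    return enc_fmt3_imm(OP3_SAVE, REG_SP, -frame_bytes, REG_SP);
}

/* save %sp, %g1, %sp : the register form, with the size already in %g1. */
static uint32_t enc_save_g1(void)
{
    return enc_fmt3_reg(OP3_SAVE, REG_SP, REG_G1, REG_SP);
}

/* restore %g0, %g0, %g0 */
static uint32_t enc_restore(void)
{
    return enc_fmt3_reg(OP3_RESTORE, REG_G0, REG_G0, REG_G0);
}
```

For example, enc_save_frame(96) produces 0x9DE3BFA0, the word for save %sp, -96, %sp.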
For leaf functions - functions that don't call any other function - the register window movement may be unnecessary. As long as the function doesn't need many registers, it can be written to work entirely within the caller's register set and still obey the rules of register lifetime across a procedure call.
Don't use save and restore, and instead of ret use retl, which uses %o7+8 as the return address instead of %i7+8.
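As an illustration, here is a C sketch that generates about the smallest possible leaf routine: one that returns a small constant. It assumes the constant fits in a signed 13-bit immediate, and the helper names are made up. The result goes in %o0 because, with no window movement, the leaf is still using the caller's registers. The instruction that sets %o0 sits in the delay slot of the retl, so it executes before control reaches the caller.

```c
#include <assert.h>
#include <stdint.h>

enum { REG_G0 = 0, REG_O0 = 8, REG_O7 = 15 };
enum { OP3_OR = 0x02, OP3_JMPL = 0x38 };

/* Format 3 with an immediate second operand: op rs1, simm13, rd. */
static uint32_t enc_fmt3_imm(unsigned op3, unsigned rs1, int32_t simm13,
                             unsigned rd)
{
    assert(op3 < 64 && rs1 < 32 && rd < 32);
    assert(simm13 >= -4096 && simm13 <= 4095);
    return (2u << 30) | ((uint32_t)rd << 25) | ((uint32_t)op3 << 19) |
           ((uint32_t)rs1 << 14) | (1u << 13) |
           ((uint32_t)simm13 & 0x1fffu);
}

/* Emit a complete leaf routine that returns `value`:
       retl                  (jmpl %o7+8, %g0)
       or %g0, value, %o0    (delay slot)
   Returns the number of words written. */
static int emit_leaf_return_const(uint32_t *code, int32_t value)
{
    code[0] = enc_fmt3_imm(OP3_JMPL, REG_O7, 8, REG_G0);
    code[1] = enc_fmt3_imm(OP3_OR, REG_G0, value, REG_O0);
    return 2;
}
```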
I won't go through a full worked example here, but you will need to. The easiest way is to write a simple C program, say test.c , then say
$ cc -S test.c
$ cat test.s
This test.s file is the assembly-language source that the compiler generated. Look at the way the C compiler manages things; it'll teach you a lot.
Also have a look at ~cs3/se/proj/jit/callerpc.s . You should be able to understand it now.
You need to maintain all these conventions when you enter and leave the on-the-fly compiler. You don't need to obey them within the compiled code, provided you honour the rules about the stack pointer. But you need to get the parameters into the code and the return result back using the windows.
Moreover, since you will need to call C procedures from within the compiled code, it may be best to bite the bullet and do it their way. But you can instead decide to use the register set as you choose, and save and restore the conventional register contents yourself as required (in the ordinary sense of the words, not with the save and restore instructions).
Register windows were designed for an earlier era of compilers that were bad at optimisation (although never, it turns out, bad at register allocation).
Therefore they provide hardware assistance for a solved software problem: always a bad idea.
Too many gates in the critical path, too much memory traffic (most procedures don't need 16 registers), etc. etc.
Result: slow processor.
There are instructions in the SPARC architecture to help manage the caches across a dynamic compilation.
The file ~cs3/se/proj/jit/cacheflush.s contains the SPARC assembly language for a routine to help.
Call cacheflush after generating the code but before you call the code.
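The interface of the course's cacheflush routine is not shown here, so the following C sketch does not use it. It shows the same step, under the assumption that you are compiling with GCC or Clang, using the compiler builtin __builtin___clear_cache instead. The wrapper name sync_generated_code is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Make freshly written instruction words visible to instruction fetch.
   This uses the GCC/Clang builtin, NOT the course's cacheflush routine. */
static void sync_generated_code(uint32_t *code, size_t nwords)
{
    assert(code != NULL);
    char *begin = (char *)code;
    char *end = (char *)(code + nwords);
    __builtin___clear_cache(begin, end);
}
```

The ordering is the point: write all the instruction words, synchronise the whole range once, and only then jump into the buffer.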
To write the compiler, you'll need lots of debugging etc. You know all that.
Here's some advice you might not think of.
Even when the compiler is done, there will probably be a few instructions that are best done by the regular interpreter. Complicated ones that do memory allocation, such as the string routines, come to mind. So you will need compiled code that calls the interpreter again.
Here's the tip: get that working first.
Then you can write a complete working compiler by having every instruction call the interpreter instead.
Then, one at a time, pick off the instructions you can compile directly. If you ever have a bug, you can always return to the dumb version to see if the bug is in the compiled code or somewhere else.
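The strategy can be sketched in C roughly like this. Everything here is hypothetical: the bytecode names, the CodeBuf type, and the emitters are placeholders, and the emitted words are stand-ins (a call with its displacement left unpatched, and a nop where real code would go). The shape is what matters: a table of direct emitters that starts out empty, a fallback that calls the interpreter, and a switch to force the dumb version.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { OP_PUSH, OP_ADD, OP_CONCAT, NUM_OPS };   /* hypothetical bytecodes */

typedef struct {
    uint32_t words[64];   /* generated instruction words */
    size_t   len;
    size_t   fallbacks;   /* bytecodes handed back to the interpreter */
} CodeBuf;

typedef void (*Emitter)(CodeBuf *cb);

static void emit_word(CodeBuf *cb, uint32_t w)
{
    assert(cb->len < sizeof cb->words / sizeof cb->words[0]);
    cb->words[cb->len++] = w;
}

/* The dumb translation: call the interpreter for this bytecode.  The
   displacement is a placeholder; a real compiler would patch it and pass
   the bytecode to interpret as an argument. */
static void emit_call_interpreter(CodeBuf *cb)
{
    emit_word(cb, 0x40000000u);   /* call <interpreter>, to be patched */
    emit_word(cb, 0x01000000u);   /* nop in the delay slot */
    cb->fallbacks++;
}

/* An opcode that has been "picked off".  The body is a stand-in. */
static void emit_add(CodeBuf *cb)
{
    emit_word(cb, 0x01000000u);   /* placeholder for directly compiled code */
}

/* Direct emitters; a missing entry means "use the interpreter". */
static Emitter direct[NUM_OPS] = { [OP_ADD] = emit_add };

/* With dumb != 0, every bytecode goes through the interpreter. */
static void compile(CodeBuf *cb, const unsigned char *ops, size_t n, int dumb)
{
    for (size_t i = 0; i < n; i++) {
        assert(ops[i] < NUM_OPS);
        Emitter e = dumb ? NULL : direct[ops[i]];
        if (e)
            e(cb);
        else
            emit_call_interpreter(cb);
    }
}
```

Running the same program through compile with dumb set and with dumb clear gives two versions to compare when hunting a bug.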