For several components of Self, it is necessary to have a few bits of assembly. For example, executing a method compiled by the JIT compiler requires a small stub, named EnterSelf. About 5% of Self’s codebase is written in assembler, mostly AT&T style.
As you may remember, the two main compilers I use to build Self are gcc and clang. gcc is no problem; it has been tested and used with Self for years. However, clang turned out to be troublesome.
Background: Assembling with gcc
Until version 4.2, assembling with gcc was a (relatively) cross-platform affair: it always relied on platform-specific assemblers or the GNU assembler, as.
If you happen to use a Linux distribution, try running
$ as -v </dev/null
For the Ubuntu box I use I get
GNU assembler version 2.22 (x86_64-linux-gnu) using BFD version (GNU Binutils for Ubuntu) 2.22
and for the Fedora box it is
GNU assembler version 2.22.52.0.1 (x86_64-redhat-linux) using BFD version version 2.22.52.0.1-10.fc17 20120131
Now go and try it on your Mac. You will get something like
Apple Inc version cctools-822, GNU assembler version 1.38
with varying values for the cctools version. The “GNU assembler version”, however, will stay the same: 1.38.¹
You see, there is at least one major version of difference between the assemblers. The Apple assembler isn’t even an actual GNU assembler; it merely supports the version 1 syntax of GNU as.
This divergence has existed for a long time, and in Self’s codebase you find abstractions to cope with exactly these differences:
- On Linux, use syntax features of GNU as version 2.
- On Mac OS X, use syntax features of GNU as version 1.38.
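To make the version split concrete, here is a minimal Python sketch that reads the major GNU assembler version out of banners like the ones shown above. The function name `gnu_as_major_version` and the regular expression are illustrative assumptions; nothing like this is claimed to exist in Self's build system.

```python
import re

# Banner lines as printed by `as -v </dev/null` on the machines mentioned above.
UBUNTU_BANNER = (
    "GNU assembler version 2.22 (x86_64-linux-gnu) "
    "using BFD version (GNU Binutils for Ubuntu) 2.22"
)
APPLE_BANNER = "Apple Inc version cctools-822, GNU assembler version 1.38"


def gnu_as_major_version(banner):
    """Return the major version following 'GNU assembler version', or None.

    Hypothetical helper for illustration only.
    """
    match = re.search(r"GNU assembler version (\d+)\.", banner)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    for banner in (UBUNTU_BANNER, APPLE_BANNER):
        print(gnu_as_major_version(banner), "<-", banner)
```

A build script could branch on this number to choose between the two syntax flavors, although a check on the platform is probably the simpler route.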
Self Assembler Code
There is one very important assembly function already mentioned: EnterSelf. Besides entering compiled code, this function also serves as data: it provides references to certain addresses of code in the Self VM²,³:
    start_exported_function firstSelfFrame_returnPC
        jmp_label send_desc_end
        .long 0
        jmp_label contNLR
        // …
        .long 0
        .long 0
        .long 0
        .long 20
Without going too much into detail, the Self VM code expects the 0x14 (the .long 20) at the 26th byte after the label. Now the difficulty: the send_desc_end label follows immediately after the code just given. Hence, most assemblers, when given a normal jmp instruction, will generate a short jump, resulting in our 0x14 being placed at the 23rd instead of the 26th byte. The effect is that our compiled VM crashes as soon as it tries to enter Self compiled code.
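The byte arithmetic behind those two offsets can be sketched in a few lines of Python. This is only an illustration under two assumptions that the listing does not confirm: the part elided as `// …` contributes no bytes, and the jump to contNLR stays a five-byte near jump in both cases. The helper name `offset_of_marker` is made up for this sketch.

```python
# Encoded sizes in bytes on x86.
JMP_NEAR = 5   # opcode E9 plus a 32-bit displacement
JMP_SHORT = 2  # opcode EB plus an 8-bit displacement
LONG = 4       # one .long directive


def offset_of_marker(first_jump_size):
    """Offset of `.long 20` from the firstSelfFrame_returnPC label.

    Assumptions (not confirmed by the source): the elided `// …` part
    adds no bytes, and the jump to contNLR is always a near jump.
    """
    layout = [
        first_jump_size,   # jmp_label send_desc_end
        LONG,              # .long 0
        JMP_NEAR,          # jmp_label contNLR
        LONG, LONG, LONG,  # three times .long 0
    ]
    return sum(layout)


if __name__ == "__main__":
    print("near jump: ", offset_of_marker(JMP_NEAR))
    print("short jump:", offset_of_marker(JMP_SHORT))
```

Under these assumptions the near jump puts the marker at offset 26 and the short jump at offset 23, which matches the numbers in the text: the three missing bytes are exactly the difference between the two jump encodings.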
To alleviate this, the jmp_label macro is defined in such a way that the resulting jump always uses a 32-bit address. Disregarding a syntax change in Apple’s assembler around 2007, jmp_label on OS X is a simple jump, which is always handled with 32-bit addresses.
Original Self:

    .macro jmp_label
    jmp $0
    .endmacro

New version:

    MACRO(jmp_label, label) // jump to label with four bytes for label
    jmp $0
    ENDMACRO
On Linux, however, we work around the generated “short” jump by introducing a global label, which forces relocation.
Original Self:

    .macro jmp_label label
    .globl \label // force 32-bits for the label
    jmp \label
    .endm

New version:

    MACRO(jmp_label, label)
    .globl \label // force 32-bits for the label
    jmp \label
    ENDMACRO
That way, the structure of firstSelfFrame_returnPC is retained on both platforms.
Clang’s Integrated Assembler
Clang aims to be a drop-in replacement for gcc under nearly all circumstances. It is, however, supported most by the FreeBSD and Apple communities, and the latter fact seems to leak through.
On OS X, you won’t notice any difference. Using the integrated assembler of clang is just the same as using assembly with gcc. This also holds for the generation of jumps. However, running clang on Linux turns out to be surprising. After some investigation, it turns out that the internal assembler of clang mimics just the Apple assembler with its GNU as 1.38 compatibility.
The effect of this is that the abstractions in the Self codebase for different assemblers had to be adjusted to differentiate between OS X, Linux with gcc, and Linux with clang. Yet this did not suffice. Our jmp_label macro did not seem to work for firstSelfFrame_returnPC on Linux with clang:
- neither using a simple jmp as on OS X,
- nor forcing the label to be global as on Linux with gcc.
clang insisted on generating a “short” jump, which is encoded in only two bytes instead of the five bytes used in the other cases. I tried other ways of forcing a 32-bit address jump, but to no avail.
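For readers who do not have the x86 encodings in mind, the following Python sketch builds both forms of an unconditional relative jump by hand. The opcodes 0xEB (jmp rel8) and 0xE9 (jmp rel32) are the standard x86 encodings; the function names are made up for this illustration.

```python
import struct


def encode_jmp_short(displacement):
    """jmp rel8: opcode 0xEB followed by a signed 8-bit displacement."""
    return b"\xEB" + struct.pack("<b", displacement)


def encode_jmp_near(displacement):
    """jmp rel32: opcode 0xE9 followed by a signed 32-bit little-endian displacement."""
    return b"\xE9" + struct.pack("<i", displacement)


if __name__ == "__main__":
    # The displacement counts from the end of the jump instruction itself.
    print(encode_jmp_short(21).hex(), len(encode_jmp_short(21)))
    print(encode_jmp_near(21).hex(), len(encode_jmp_near(21)))
```

An assembler is free to pick the two-byte form whenever the target is known and within reach of an 8-bit displacement, which is why a label placed right after the data block invites the short encoding.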
In the end, I disabled the integrated assembler of clang on Linux. clang now (happily?) uses the system’s GNU as, just as if it were gcc, and our jmp_label again encodes to a 32-bit address jump, a five-byte instruction.
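The resulting three-way split can be summarized in a small Python sketch. Everything here is hypothetical: the helper name, the dialect labels, and the way flags are returned are only for illustration. `-no-integrated-as` is a clang driver option that hands assembly to an external assembler; whether the Self build uses exactly this option is an assumption.

```python
def assembler_setup(os_name, compiler):
    """Return (assembler dialect, extra compiler flags) for a build.

    Hypothetical helper; names and return values are illustrative only.
    """
    if os_name == "darwin":
        # Apple's assembler and clang's integrated assembler behave alike here.
        return ("gnu-as-1.38-syntax", [])
    if os_name == "linux" and compiler == "gcc":
        return ("gnu-as-2-syntax", [])
    if os_name == "linux" and compiler == "clang":
        # Assumption: bypass the integrated assembler and use the system's GNU as.
        return ("gnu-as-2-syntax", ["-no-integrated-as"])
    raise ValueError(f"unsupported combination: {os_name}/{compiler}")


if __name__ == "__main__":
    for combo in (("darwin", "clang"), ("linux", "gcc"), ("linux", "clang")):
        print(combo, "->", assembler_setup(*combo))
```

With the integrated assembler out of the picture, Linux with clang falls back to the same dialect as Linux with gcc, so the extra case only survives as a compiler flag.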
Arguably, it is possible to adjust the Self C++ code to expect the 0x14 not at the 26th byte but earlier, and then to use short jumps. After all, this code still dates from the days when SPARC and PPC, and not Intel CPUs, were the main supported platforms within Self. Moreover, the assembly code in Self is not position-independent, and assembly will be an issue once Mac OS X 10.8 is there. But again, this is another story.
1 This holds at least for Xcode 3.2 to Xcode 4.2.
2 start_exported_function is an assembler macro to ensure that a certain label is global and linkable.
3 I did a refactoring of the assembler code in my Self VM fork for better portability. Hence, whenever they diverge, the assembler code is given in two ways, the original form and my “portable” version.