For several components of Self, it is necessary to have a few bits of assembly. For example, executing a method compiled by the JIT compiler requires a small stub named EnterSelf. About 5 % of Self’s codebase is written in assembler, mostly AT&T style.
As you remember, the two main compilers I use to build Self are gcc and clang. Using gcc is no problem; it has been tested and used with Self for years. However, clang turned out to be troublesome.
Background: Assembling with gcc
While gcc (until version 4.2) was a relatively cross-platform tool, it always relied on platform-specific assemblers or the GNU assembler gas (or as).
If you happen to use a Linux distribution, try running
$ as -v </dev/null
For the Ubuntu box I use I get
GNU assembler version 2.22 (x86_64-linux-gnu) using BFD ↩
version (GNU Binutils for Ubuntu) 2.22
and for the Fedora box it is
GNU assembler version 2.22.52.0.1 (x86_64-redhat-linux) ↩
using BFD version version 2.22.52.0.1-10.fc17 20120131
Now go and try it on your Mac. You will get something like
Apple Inc version cctools-822, GNU assembler version 1.38
with varying values for the cctools version. The “GNU assembler version”, however, will stay the same: 1.38 [1].
You see, there is at least one major version difference between the assemblers. The Apple assembler isn’t even an actual GNU assembler but merely supports the version 1 syntax of GNU as.
This divergence has existed for a long time, and Self’s codebase contains abstractions to cope with exactly these differences:
- On Linux, use syntax features of GNU as v2
- On Mac OS X, use syntax features of GNU as v1
Self Assembler Code
There is one very important assembly function already mentioned: EnterSelf. Besides entering compiled code, this function also has to serve as data: it provides references to certain code addresses in the Self VM [2,3]:
start_exported_function firstSelfFrame_returnPC
jmp_label send_desc_end
.long 0
jmp_label contNLR
// …
.long 0
.long 0
.long 0
.long 20
Without going into too much detail, the Self VM code expects the 0x14 (the .long 20) at the 26th byte after the label. Now the difficulty: the send_desc_end label follows immediately after the code just given. Hence most assemblers, when given a normal jmp instruction, will generate a short jump, which places our 0x14 at the 23rd instead of the 26th byte. As a result, the compiled VM crashes as soon as it tries to enter compiled Self code.
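The byte arithmetic behind those two offsets can be sketched as follows. This is only a model in Python, not Self code: it assumes that the elided part of the listing (the // …) emits no bytes and that only the first jump, the one to the nearby send_desc_end, gets shortened. The constant and function names are made up for illustration.

```python
# x86 instruction sizes: a near jmp (opcode 0xE9 + rel32) takes 5 bytes,
# a short jmp (opcode 0xEB + rel8) takes 2 bytes.
NEAR_JMP = 5
SHORT_JMP = 2
LONG = 4  # size of one .long directive


def offset_of_marker(first_jmp_size, second_jmp_size):
    """Byte offset of the `.long 20` marker from firstSelfFrame_returnPC.

    Assumption: the elided part of the listing emits no bytes.
    """
    layout = [
        first_jmp_size,    # jmp_label send_desc_end
        LONG,              # .long 0
        second_jmp_size,   # jmp_label contNLR
        LONG, LONG, LONG,  # .long 0, three times
    ]
    return sum(layout)


print(offset_of_marker(NEAR_JMP, NEAR_JMP))   # 26
print(offset_of_marker(SHORT_JMP, NEAR_JMP))  # 23
```

The three missing bytes are exactly the difference between the five-byte and the two-byte encoding of the first jump.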
To alleviate this, the jmp_label macro is defined so that the resulting jump always uses a 32-bit address. Disregarding a syntax change in Apple’s assembler around 2007, jmp_label on OS X is a simple jump, which is always encoded with a 32-bit address.
Original Self:
.macro jmp_label
jmp $0
.endmacro
New version:
MACRO(jmp_label, label) // jump to label with four bytes for label
jmp $0
ENDMACRO
On Linux, however, we work around the generated “short” jump by introducing a global label, which forces relocation.
Original Self:
.macro jmp_label label
.globl \label // force 32-bits for the label
jmp \label
.endm
New version:
MACRO(jmp_label, label)
.globl \label // force 32-bits for the label
jmp \label
ENDMACRO
That way, the structure of firstSelfFrame_returnPC is retained on both platforms.
Clang’s Integrated Assembler
Clang aims to be a drop-in replacement for gcc under nearly all circumstances. It is, however, supported most strongly by the FreeBSD and Apple communities, and the latter seems to leak through.
On OS X, you won’t notice any difference. Using the integrated assembler of clang is just the same as assembling with gcc. This also holds for the generation of jumps. However, running clang on Linux turns out to be surprising. After some investigation, it turns out that the internal assembler of clang just mimics the Apple assembler with its GNU as 1.38 compatibility.
As a result, the abstractions in the Self codebase for different assemblers had to be adjusted to differentiate between OS X, Linux with gcc, and Linux with clang. Yet this did not suffice. Our jmp_label macro did not seem to work for firstSelfFrame_returnPC on Linux with clang,
- neither using a simple jmp as on OS X,
- nor forcing the label to be global as on Linux with gcc.
Either way, clang insisted on generating a “short” jump, which is encoded in only two bytes instead of the five bytes in the other cases. I tried various other ways of forcing a jump with a 32-bit address, but to no avail.
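To illustrate the two encodings, here is a rough Python model of how an assembler might choose between them. It is a sketch, not the actual relaxation logic of clang or GNU as; assemble_jmp and its force_near parameter are hypothetical names. On x86, the short form is opcode 0xEB with an 8-bit displacement and the near form is opcode 0xE9 with a 32-bit displacement, both relative to the end of the instruction.

```python
import struct


def assemble_jmp(jmp_addr, target_addr, force_near=False):
    """Return the machine code for a jmp located at jmp_addr to target_addr."""
    # Displacement as seen from the end of the 2-byte short form.
    short_disp = target_addr - (jmp_addr + 2)
    if not force_near and -128 <= short_disp <= 127:
        return b"\xeb" + struct.pack("<b", short_disp)
    # Displacement as seen from the end of the 5-byte near form.
    near_disp = target_addr - (jmp_addr + 5)
    return b"\xe9" + struct.pack("<i", near_disp)


short = assemble_jmp(0, 26)
near = assemble_jmp(0, 26, force_near=True)
print(len(short), short.hex())  # 2 eb18
print(len(near), near.hex())    # 5 e915000000
```

A jmp_label macro that works is, in effect, a way of setting force_near for one particular jump.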
Solution?
In the end, I disabled the integrated assembler of clang on Linux. clang now (happily?) uses the system’s GNU as, just as gcc does, and our jmp_label again encodes to a jump with a 32-bit address, a five-byte instruction.
Aftermath
Arguably, it is possible to adjust the Self C++ code to expect the 0x14 not at the 26th byte but earlier, and then to use short jumps. After all, this code still dates from the days when SPARC and PPC, not Intel CPUs, were the main platforms supported by Self. Moreover, the assembly code in Self is not position-independent, and assembly will be an issue once Mac OS X 10.8 is there. But again, this is another story.
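What such an adjustment would amount to can be sketched as follows, again as a Python model rather than the actual Self C++ code. The helper names, the dummy jump displacements, and the trailing nop bytes are placeholders, and the elided part of the listing is assumed to emit no bytes. The point is only that a reader hard-coded to offset 26 finds the marker in the near-jump layout, while a layout with a short first jump needs the expected offset changed to 23.

```python
import struct


def build_block(first_jmp):
    """Bytes following firstSelfFrame_returnPC, given the encoded first jump."""
    zero = struct.pack("<I", 0)                # .long 0
    near_jmp = b"\xe9" + struct.pack("<i", 0)  # jmp_label contNLR (dummy target)
    marker = struct.pack("<I", 20)             # .long 20, i.e. 0x14
    trailing = b"\x90" * 8                     # stand-in for the code that follows
    return first_jmp + zero + near_jmp + zero * 3 + marker + trailing


def read_marker(block, offset):
    """Read a little-endian 32-bit value at the given byte offset."""
    return struct.unpack_from("<I", block, offset)[0]


near_layout = build_block(b"\xe9" + struct.pack("<i", 0))  # 5-byte first jump
short_layout = build_block(b"\xeb\x00")                    # 2-byte first jump

print(hex(read_marker(near_layout, 26)))      # 0x14
print(hex(read_marker(short_layout, 23)))     # 0x14
print(read_marker(short_layout, 26) == 0x14)  # False
```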
1 This holds at least for Xcode 3.2 to Xcode 4.2
2 start_exported_function is an assembler macro that ensures a certain label is global and linkable.
3 I did a refactoring of the assembler code in my Self VM fork for better portability. Hence, whenever they diverge, the assembler code is given in two ways, the original form and my “portable” version.
Like you, I faced a situation where I needed my jump instruction always to be five bytes long (a ‘near’ jump, with a 32-bit displacement) even if the jump could conceivably be squeezed into an instruction that was only two bytes long (a ‘short’ jump, with an 8-bit displacement). I solved it by defining a ‘jmp32’ instruction with macros:
clang:
.macro jmp32
.byte 0xe9
.long $0 - (.+4)
.endmacro
Linux/gcc:
.macro jmp32 label
.byte 0xe9
.long \label - (.+4)
.endm
Use in your code on any platform:
jmp32 contNLR
Cheers,
Eric
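The displacement arithmetic in the jmp32 macros above can be checked with a small Python model. This is a sketch under the assumption that "." denotes the address of the .long directive itself, as it does in GNU as; the function names and addresses are invented for the example.

```python
import struct


def jmp32(instr_addr, label_addr):
    """Mimic the jmp32 macro: .byte 0xe9 followed by .long label - (. + 4)."""
    dot = instr_addr + 1  # value of "." when the .long directive is assembled
    return b"\xe9" + struct.pack("<i", label_addr - (dot + 4))


def jump_target(instr_addr, encoded):
    """Decode a near jmp the way the CPU does: next instruction + rel32."""
    rel32 = struct.unpack_from("<i", encoded, 1)[0]
    return instr_addr + len(encoded) + rel32


code = jmp32(0x1000, 0x1003)  # a target that a short jump could also reach
print(len(code), code.hex())            # 5 e9feffffff
print(hex(jump_target(0x1000, code)))   # 0x1003
```

Because the opcode byte is emitted by hand, the assembler never gets the chance to pick the two-byte form.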