For several components of Self, it is necessary to have a few bits of assembly. For example, executing a method compiled by the JIT-compiler requires a small stub, named EnterSelf. About 5 % of Self’s codebase is written in assembler, mostly AT&T style.

As you remember, the two main compilers I use to build Self are gcc and clang. Using gcc is no problem, it also has been tested and used for years with Self. However, clang turned out to be troublesome.

Background: Assembling with gcc

While gcc until 4.2 was a (relatively) cross-platform operating thing, it always relayed on platform-specific assemblers or the GNU assembler gas (or as).

If you happen to use a Linux Distribution, try to run

$ as -v </dev/null

For the Ubuntu box I use I get

GNU assembler version 2.22 (x86_64-linux-gnu) using BFD ↩ 
 version (GNU Binutils for Ubuntu) 2.22 

and for the Fedora box it is

GNU assembler version 2.22.52.0.1 (x86_64-redhat-linux) ↩ 
 using BFD version version 2.22.52.0.1-10.fc17 20120131

Now go and try it on your Mac. You will get something like

Apple Inc version cctools-822, GNU assembler version 1.38

with varying values for the cctools version. The “GNU assembler version”, however, will stay the same: 1.381

You see, there is at least one major version difference between the assemblers. The Apple assembler even isn’t an actual GNU assembler but merely supports the version 1 syntax of GNU as.

This divergence has been the case for a long time and in Self’s codebase you find abstractions to cope for exactly these divergences:

  • On Linux, use syntax features of GNU as v2
  • On Mac OS X, use syntax features of GNU as v1

Self Assembler Code

There is one very important assembly function already mentioned: EnterSelf. This function, besides entering compiled code, has the responsibility to serve as data; to provide references to certain addresses of code in the Self VM2,3:

start_exported_function firstSelfFrame_returnPC
   jmp_label send_desc_end     
  .long 0                      
   jmp_label contNLR           
  // …
  .long 0                      
  .long 0                      
  .long 0                      
  .long 20

Without going too much into details, the Self VM code expects the 0x14 (the .long 20) at the 26th byte after the label. Now the difficulty: the send_desc_end label follows immediately after the code just given. Hence, most assemblers—when using a normal jmp instruction—will generate a short jump, resulting in our 0x14 being placed at the 23rd instead of the 26th byte. The effect is that our compiled VM crashes as soon as it tries to enter Self compiled code.

To alleviate this, the jmp_label macro is defined in a way that the resulting jump is always to a 32bit address. Disregarding a syntax change in Apples assembler around 2007, jmp_label on OS X is a simple jump, which always is handled with 32bit addresses.

Original Self                New version

.macro jmp_label             MACRO(jmp_label, label) ↩
                               // jump to label with four bytes for label
    jmp $0                       jmp $0
.endmacro                    ENDMACRO

On Linux, however, we work around the generated “short” jump by introducing a global label, which forces relocation.

Original Self                        New version

.macro jmp_label label               MACRO(jmp_label, label)
    .globl \label ↩                    .globl \label ↩
      // force 32-bits for the label     // force 32-bits for the label
    jmp \label                         jmp \label
.endm                                ENDMACRO

That way, the structure of firstSelfFrame_returnPC is retained on both platforms.

Clang’s Integrated Assembler

Clang aims to be a drop-in replacement for gcc under nearly any circumstances. It is, however, supported the most by the FreeBSD and Apple communities. And the latter fact seems to leak through.

On OS X, you won’t notice any difference. Using the integrated assembler of clang is just the same as using assembly with gcc. This also holds for the generation of jumps. However, running clang on Linux turns out to be surprising. After some elaboration, it turns out that the internal assembler of clang mimics just the Apple assembler with its GNU as 1.38 compatibility.

The effect of this is that the abstractions in the Self codebase for different assemblers had to be adjusted to differentiate between OS X, Linux with gcc and Linux with clang. Yet, this did not suffice. Our jmp_label macro seemed not to work for firstSelfFrame_returnPC on Linux with clang,

  • neither using a simple jmp as on OS X,
  • nor forcing the label to be global as on Linux with gcc.

Either way, clang insisted on generating a “short” jump, which is encoded in only two bytes instead of five bytes for the other cases. I tried other different ways of forcing a 32 bit address jump, but to no avail.

Solution?

In the end, I disabled using the integrated assembler of clang on Linux. clang now—happily?—uses the GNU as of the system as if it were gcc and our jmp_label again encodes to a 32 bit address jump, a five byte instruction.

Aftermath

Arguably, it is possible to adjust the Self C++ code to not expect the 0x14 at the 26th byte but earlier and then using short jumps. After all, this code still dates from days where SPARC and PPC, and not Intel CPUs, where the main supported platforms within Self. After all, the assembly code in Self is not position independent and Assembly will be an issue once Mac OS X 10.8 is there. But again, this is another story.


1 This holds at least for Xcode 3.2 to Xcode 4.2
2 start_exported_function is an Assembler macro to ensure that a certain label is global and linkable.
3 I did a refactoring of the assembler code in my Self VM fork for better portability. Hence, whenever they diverge, the assembler code is given in two ways, the original form and my “portable” version.