David Ungar and Harold Ossher’s talk at SPLASH 2013, called “Dancing with Symmetry to Harness the Power of Complexity: Subjective Programming in Context” is now online with slides and video at InfoQ.

David and Harold discuss a new programming language and environment they have been researching, which allows contextual, multidimensional dispatch of messages so that the behaviour of objects depends on the context of the method invocation. They also give a quick demonstration of a development environment they have built to explore these ideas in Self.

Check it out!

Named after the majestic, awe-inspiring Anas platyrhynchos, the latest Self release 4.5.0 is now available for download.

What’s new since 4.4?

  • Build system redone by Tobias Pape. Now based on cmake, with a single modern build process for both Linux and OS X. The VM can be built with both GCC and Clang on the latest versions of both operating systems.
  • New Self Control.app on OS X to manage your running worlds as a more robust and featured replacement for the older ‘Self Droplet’. Use of this app is optional and you can still access the Self VM through the command line.
  • New look for standard world, with better fonts, colours and greater use of space.
  • Various fixes to the standard world, including a new build script ‘worldBuilder.self’ replacing several ad hoc build scripts.
  • Updated Self Handbook at handbook.selflanguage.org

Sources for the VM and for the basic Self world are available as always from the GitHub repository. Although Self, like Smalltalk-80, can use an image (in Self called a snapshot of a world), this can be built from text sources.

Self Mallard is available for OS X and Linux. You can download binaries:

  • For OS X, a disk image containing the Self Control.app and a prebuilt clean snapshot (“Clean.snap”). Copy the app to your Applications folder, run it and click the Choose snapshot… button.
  • For Linux, a gzipped tar file containing a prebuilt 32-bit binary and Clean.snap. Unpack, then run ./Self -s Clean.snap

Following on from his Self VM build for Android x86, Chris Double has fixed up some bitrot in an old Self application, which allows sharing a Self world through a Java applet (without requiring an X11 server).

Check out his writeup here, including a screencast of how to make it work. His changes have been merged into the Self tree.

Chris Double has announced a build of the Self VM for x86 Android. No GUI yet, and only x86.

For several components of Self, it is necessary to have a few bits of assembly. For example, executing a method compiled by the JIT-compiler requires a small stub, named EnterSelf. About 5 % of Self’s codebase is written in assembler, mostly AT&T style.

As you remember, the two main compilers I use to build Self are gcc and clang. Using gcc is no problem; it has been tested and used with Self for years. However, clang turned out to be troublesome.

Background: Assembling with gcc

While gcc (at least up to 4.2) was a relatively cross-platform tool, it always relied on platform-specific assemblers or the GNU assembler gas (or as).

If you happen to use a Linux distribution, try to run

$ as -v </dev/null

For the Ubuntu box I use I get

GNU assembler version 2.22 (x86_64-linux-gnu) using BFD ↩ 
 version (GNU Binutils for Ubuntu) 2.22 

and for the Fedora box it is

GNU assembler version 2.22.52.0.1 (x86_64-redhat-linux) ↩ 
 using BFD version version 2.22.52.0.1-10.fc17 20120131

Now go and try it on your Mac. You will get something like

Apple Inc version cctools-822, GNU assembler version 1.38

with varying values for the cctools version. The “GNU assembler version”, however, will stay the same: 1.38.¹

You see, there is at least one major version difference between the assemblers. The Apple assembler isn’t even an actual GNU assembler but merely supports the version 1 syntax of GNU as.

This divergence has existed for a long time, and in Self’s codebase you find abstractions to cope with exactly these differences:

  • On Linux, use syntax features of GNU as v2
  • On Mac OS X, use syntax features of GNU as v1

Self Assembler Code

There is one very important assembly function already mentioned: EnterSelf. Besides entering compiled code, this function also serves as data: it provides references to certain code addresses in the Self VM²,³:

start_exported_function firstSelfFrame_returnPC
   jmp_label send_desc_end     
  .long 0                      
   jmp_label contNLR           
  // …
  .long 0                      
  .long 0                      
  .long 0                      
  .long 20

Without going too much into detail, the Self VM code expects the 0x14 (the .long 20) at the 26th byte after the label. Now the difficulty: the send_desc_end label follows immediately after the code just given. Hence, most assemblers, when given a normal jmp instruction, will generate a short jump, resulting in our 0x14 being placed at the 23rd instead of the 26th byte. The effect is that our compiled VM crashes as soon as it tries to enter Self compiled code.
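The three-byte shift follows directly from the x86 encodings: a short relative jump is one opcode byte plus an 8-bit displacement, while a near relative jump is one opcode byte plus a 32-bit displacement. A minimal C++ sketch of that arithmetic follows. The names are illustrative and not taken from the Self VM, and it assumes only the one jump to send_desc_end shrinks.

```cpp
#include <cstddef>

// Encoded sizes of the two x86 relative jump forms.
constexpr std::size_t kShortJmpSize = 2;  // opcode 0xEB + 8-bit displacement
constexpr std::size_t kNearJmpSize  = 5;  // opcode 0xE9 + 32-bit displacement

// Where a datum ends up if a single near jump in front of it is assembled
// as a short jump instead (hypothetical helper, for illustration only).
constexpr std::size_t offset_with_short_jump(std::size_t expected_offset) {
    return expected_offset - (kNearJmpSize - kShortJmpSize);
}
```

With the expected position of 26, the helper yields 23, which matches the misplacement described above.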

To alleviate this, the jmp_label macro is defined so that the resulting jump always uses a 32-bit address. Disregarding a syntax change in Apple’s assembler around 2007, jmp_label on OS X is a simple jump, which is always handled with 32-bit addresses.

Original Self                New version

.macro jmp_label             MACRO(jmp_label, label) ↩
                               // jump to label with four bytes for label
    jmp $0                       jmp $0
.endmacro                    ENDMACRO

On Linux, however, we work around the generated “short” jump by introducing a global label, which forces relocation.

Original Self                        New version

.macro jmp_label label               MACRO(jmp_label, label)
    .globl \label ↩                    .globl \label ↩
      // force 32-bits for the label     // force 32-bits for the label
    jmp \label                         jmp \label
.endm                                ENDMACRO

That way, the structure of firstSelfFrame_returnPC is retained on both platforms.

Clang’s Integrated Assembler

Clang aims to be a drop-in replacement for gcc under nearly any circumstances. It is, however, supported the most by the FreeBSD and Apple communities. And the latter fact seems to leak through.

On OS X, you won’t notice any difference. Using the integrated assembler of clang is just the same as using assembly with gcc. This also holds for the generation of jumps. However, running clang on Linux turns out to be surprising. After some investigation, it turns out that the internal assembler of clang mimics the Apple assembler with its GNU as 1.38 compatibility.

The effect of this is that the abstractions in the Self codebase for different assemblers had to be adjusted to differentiate between OS X, Linux with gcc and Linux with clang. Yet, this did not suffice. Our jmp_label macro seemed not to work for firstSelfFrame_returnPC on Linux with clang,

  • neither using a simple jmp as on OS X,
  • nor forcing the label to be global as on Linux with gcc.

Either way, clang insisted on generating a “short” jump, which is encoded in only two bytes instead of five bytes for the other cases. I tried several other ways of forcing a 32-bit address jump, but to no avail.

Solution?

In the end, I disabled using the integrated assembler of clang on Linux. clang now—happily?—uses the GNU as of the system as if it were gcc and our jmp_label again encodes to a 32 bit address jump, a five byte instruction.

Aftermath

Arguably, it is possible to adjust the Self C++ code to expect the 0x14 not at the 26th byte but earlier, and then use short jumps. After all, this code still dates from the days when SPARC and PPC, not Intel CPUs, were the main supported platforms for Self. Moreover, the assembly code in Self is not position independent, and assembly will be an issue once Mac OS X 10.8 arrives. But again, this is another story.


1 This holds at least for Xcode 3.2 to Xcode 4.2
2 start_exported_function is an Assembler macro to ensure that a certain label is global and linkable.
3 I did a refactoring of the assembler code in my Self VM fork for better portability. Hence, whenever they diverge, the assembler code is given in two ways, the original form and my “portable” version.

[The important summary is at the end.]

Three weeks ago I said

I plan to frequently post a similar table showing my progress in building the Self VM with the different OS–compiler combinations.

This is such an update post, although it neither arrives at the frequency I had hoped for nor contains a table in the way promised. Both are easy to explain: onehundredandfiftytwo different ways to build the Self VM.

That much? Well, yes. To quote my earlier post (and extend it):

Make the Self VM build on major operating systems […]

  • Mac OS X 10.6 (Snow Leopard)
  • Mac OS X 10.7 (Lion)
  • Ubuntu 12.04
  • Fedora 17

[…]

  • GCC
    • 4.2 (Apple LLVM-gcc) [and 4.2 non-LLVM]
    • [4.4, 4.5,] 4.6
    • 4.7
  • Clang

There are certainly different configurations for debugging/optimization, and when we look at the Apples, even two different means to carry out the actual compilation of Self. In the default case, Makefiles, this makes two configurations, and three for Apple Xcode.

Not enough, Self itself provides some kind of configurability. It supports a ‘profiled’ build and the possibility to compile with ‘fast floats’, any of which is optional. That makes four configurations.

Are you still with me? Fine. So let’s dissect all possibilities.

Linux: 32 VMs

Linux is easy, there are two of them, both with a gcc and a clang compiler, both using Makefiles to carry out compilation. Ubuntu brings gcc 4.6 by default, Fedora 4.7. For both, the clang version is 3.0. Think of the four Self configurations and the two build configurations, and we get

  • Ubuntu: 16 VMs
    • gcc 4.6: 8 VMs (4 Self configs × 2 build configs)
    • clang 3.0: 8 VMs (4 Self configs × 2 build configs)
  • Fedora: 16 VMs
    • gcc 4.7: 8 VMs (4 Self configs × 2 build configs)
    • clang 3.0: 8 VMs (4 Self configs × 2 build configs)

Mac OS X 10.7 (Lion): 40 VMs

As I already pointed out, on OS X, it is possible to use Xcode as means of building besides using Makefiles. Also, I enabled an ‘optimized with debug symbols’ build in Xcode, yielding three build configurations for Xcode. The four Self configurations stay the same.

  • Xcode 4.3: 24 VMs
    • LLVM-gcc 4.2: 12 VMs (4 Self configs × 3 build configs)
    • Clang 3.1: 12 VMs (4 Self configs × 3 build configs)
  • Makefile build: 16 VMs
    • LLVM-gcc 4.2: 8 VMs (4 Self configs × 2 build configs)
    • Clang 3.1: 8 VMs (4 Self configs × 2 build configs)

Mac OS X 10.6 (Snow Leopard): 80 VMs

This is mostly the same as for Lion, with the exception that I also tried building using the Xcode 3.2 toolchain. Hence twice the VMs.

  • Xcode 4.2: 24 VMs
    • LLVM-gcc 4.2: 12 VMs (4 Self configs × 3 build configs)
    • Clang 3.0: 12 VMs (4 Self configs × 3 build configs)
  • Makefile build (Xcode 4.2 based): 16 VMs
    • LLVM-gcc 4.2: 8 VMs (4 Self configs × 2 build configs)
    • Clang 3.0: 8 VMs (4 Self configs × 2 build configs)
  • Xcode 3.2: 24 VMs
    • LLVM-gcc 4.2: 12 VMs (4 Self configs × 3 build configs)
    • Clang 1.6: 12 VMs (4 Self configs × 3 build configs)
  • Makefile build (Xcode 3.2 based): 16 VMs
    • LLVM-gcc 4.2: 8 VMs (4 Self configs × 2 build configs)
    • Clang 1.6: 8 VMs (4 Self configs × 2 build configs)

Important findings

  1. 132 out of 152 VMs compiled and linked.
  2. 20 VMs (all Clang with Xcode 3.2 on Mac OS X 10.6) did not compile. The reason: surprisingly, Clang 1.6 did not properly support C++.
  3. CMake helps a lot. But this makes another post.
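
The totals above can be re-derived mechanically. The following C++ sketch only restates the counting from this post. The function names are made up for illustration, and the two Makefile build configurations are assumed to be a debug and an optimized build.

```cpp
// Two optional VM features ('profiled', 'fast floats') give 2 * 2 variants.
constexpr int kSelfConfigs = 2 * 2;
constexpr int kMakefileBuildConfigs = 2;  // assumed: debug and optimized
constexpr int kXcodeBuildConfigs = 3;     // plus 'optimized with debug symbols'

// VMs produced by a number of compilers under one build system.
constexpr int vms(int compilers, int build_configs) {
    return compilers * kSelfConfigs * build_configs;
}

// Two distributions, each with gcc and clang, Makefiles only.
constexpr int linux_total() { return 2 * vms(2, kMakefileBuildConfigs); }

// One Xcode toolchain: LLVM-gcc and Clang, via Xcode and via Makefiles.
constexpr int toolchain_total() {
    return vms(2, kXcodeBuildConfigs) + vms(2, kMakefileBuildConfigs);
}

constexpr int lion_total() { return toolchain_total(); }              // Xcode 4.3
constexpr int snow_leopard_total() { return 2 * toolchain_total(); }  // Xcode 4.2 and 3.2

constexpr int grand_total() {
    return linux_total() + lion_total() + snow_leopard_total();
}

// Clang 1.6 (Xcode 3.2) failed under both build systems.
constexpr int clang16_failures() {
    return vms(1, kXcodeBuildConfigs) + vms(1, kMakefileBuildConfigs);
}
```

Subtracting the Clang 1.6 failures from the grand total gives the number of VMs that compiled and linked.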

Do they run?

Well, not all of them, I think. Most non-profiled, non-fastfloat VMs do, and they do load Self worlds. Most clang-compiled, optimized VMs get hiccups when you want to dismiss a debugger in a Self world. If you want, I can provide all 132 VMs for testing purposes.

The Table

Finally, here is the promised table.

Build chart: which Configurations built.

PS: As some of you liked the play-form of the last post, here’s my first try of a haiku:

Looking at three screens,
Hundredfiftytwo VMs,
that takes time to build.

Since the last official VMs for Self were built, some time has gone by. Compilers changed, standard libraries evolved, other software ceased to exist or is not as widely used anymore as in the early days of Self’s history.

As a vivid example, enter a play with me, the play of “Friend or Foe.” Our protagonists:

  • Ego. Our beloved Self VM, which is to be built.
  • Wildebeest. This old friend used to build Self for some ten years or more. He has changed over time. (Also known as gcc)
  • Boing. A young fellow, who recently stepped up to claim the lands of Wildebeest. (You might know him as clang).
  • A Herald.

ACT I
Ten to Fifteen Years Ago.
In light space.

[Enter Ego, Wildebeest.]

Ego:

Hello, young Wildebeest. Your colleagues from Solaris have sent me to you, as I wish to be built on this platform called Linux. Can you do that for me?

Wildebeest:

Although I am younger than you, I think that will work.

[Wildebeest tries to build Ego.]

Wildebeest:

Yes, this should do it. A little rough around the edges, but that will sort out.

Ego:

There we go.

[Exit.]

ACT II
In 2012.
Early summer in Germany.

[Enter Ego, Wildebeest.]

Ego:

Wildebeest, old friend, we haven’t met in a long time. Grant me to be built once again, for that Linux and please for that sixth OS X.

Wildebeest:

Yes, let’s do that.

[Wildebeest tries to build Ego.]

Wildebeest:

Oh, we have to do something, I handle friends differently now.

See, you have lots of code like this:

class mapOopClass: public oopsOopClass  {
 …
 public:
  // constructor
  friend mapOop as_mapOop(void* p) { return mapOop(as_memOop(p)); }
};

Well, and you then use as_mapOop everywhere. I don’t do this anymore.

[Enter Herald.]

Herald:

The -ffriend-injection option.
“Inject friend functions into the enclosing namespace, so that they are visible outside the scope of the class in which they are declared. Friend functions were documented to work this way in the old Annotated C++ Reference Manual, and versions of G++ before 4.1 always worked that way.”

[Exit Herald.]

Ego:

Then let us go on and use that option.

Wildebeest:

But know that I won’t support that option at some point in the future.

Ego:

Beg your pardon?

Wildebeest:

You should manually declare your former friend method in an outer context. Like this:

// Forward-declaration for friend.
mapOop as_mapOop(void* p);
class mapOopClass: public oopsOopClass  {
 …
 public:
  // constructor
  static mapOop as_mapOop(void* p) { return mapOop(as_memOop(p)); }
};

[Ego sighs]

Ego:

Well then, it is just around five dozen files to modify…

[Exit.]

ACT III
In 2012.
Same place, a little later.

[Enter Ego, Boing.]

Boing:

Ah, greetings, Ego. I am the new star on the horizon. Want to try me?

Ego:

And who shall you be?

Boing:

I am the new default compiler on the OS Xen, I am part of that LLVM project, and last but not least, I have influential friends in the FreeBSD world. They even let me build their kernel.

Ego:

That is impressive indeed.

Boing:

I am the future™.

Ego:

If you say so… let’s give it a shot.

[Boing tries to build Ego.]

Boing:

I can certainly compile you but not link you. I am missing many things like as_mapOop. Do you have an implementation for those functions?

Ego:

Well, yes, they are right there in their classes.

Boing:

No way I see them.

Ego [hesitating]:

Are you sure? What if I leave out the forward declaration like in the original code I had.

class mapOopClass: public oopsOopClass  {
…
 public:
  // constructor
  friend mapOop as_mapOop(void* p) { return mapOop(as_memOop(p)); }
};

Boing:

Ok, now I can find the implementation. But you have to somehow declare them in the outer context of the class to still use these functions as before.

Ego:

But that is exactly what I had before that and you weren’t able to find the implementation, were you?

What if I just pass you an -ffriend-injection option?

Boing:

There is no such thing as an -ffriend-injection option with me.

Ego [angry]:

So you are the new rising star and cannot cope with me?

Leave me, I have to think.

[Exit.]

EPILOGUE

[Enter Herald]

Herald:

So Ego had to change again to satisfy both, Boing and Wildebeest. And so it did.

See what it came up with:

class mapOopClass: public oopsOopClass {
…
 public:
  // constructor
  static mapOop as_mapOop(void* p) { return mapOop(as_memOop(p)); }
…
};
static inline mapOop as_mapOop(void* p) { return mapOopClass::as_mapOop(p); }

This worked out for all our three heroes.

[Exit.]

END

Eventually, this peculiarity of friend-declared functions could be resolved by introducing several inline functions that act as mere aliases to their static, class-bound counterparts.
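
The underlying rule, as far as I understand standard C++, is that a friend function defined inside a class is only found through argument-dependent lookup. A function such as as_mapOop takes only a void*, so there is no class-typed argument to trigger that lookup. The sketch below illustrates the difference with a made-up class; Box, peek and from_raw are hypothetical names and not part of the Self VM.

```cpp
// Hypothetical stand-in for a VM class; not one of the real Self VM types.
class Box {
    int value_;

public:
    explicit constexpr Box(int v) : value_(v) {}

    // In-class friend: visible from outside only via argument-dependent
    // lookup. peek(box) works because the argument has type Box.
    friend constexpr int peek(const Box& b) { return b.value_; }

    // A friend taking only void* would have no Box argument, so a conforming
    // compiler could not find it from outside the class. A static member
    // always has a reachable, qualified name instead.
    static Box* from_raw(void* p) { return static_cast<Box*>(p); }
};

// A free inline alias keeps old unqualified call sites compiling.
inline Box* from_raw(void* p) { return Box::from_raw(p); }
```

Call sites can then keep writing from_raw(p) without the class prefix, which mirrors the static member plus inline alias pair shown in the epilogue.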

However, I did not convert every friend function that had its implementation in the class header into a static member/static inline non-member pair.

  • “True” friend functions, i.e., those that are accessible globally and not class-bound but have to access private members of other classes, were left as is. However, their implementation has been moved to a respective implementation (i.e., .cpp) file.
  • Friend functions that were mere shortcuts, so that they could be used without full qualification, have been turned into normal static functions. They now have to be called with their class prefixed.
  • Constructors (like create_objVector) and converters (like as_mapOop) are now static members of their class and also have a static inline non-member function.
Eventually, this also separated the semantically different uses of C++ friends, which I consider a good thing.

I hope this finally ends the tale of friends for Self.

So, why revamp Self’s build process? What could be the outcome of this?

My mini-roadmap for this project is

  1. Make the Self VM build on major operating systems
  2. Make the Self VM build with those OSs’ prevalent compilers
  3. Make building the Self VM more approachable.

As Self is traditionally a Unix VM and there is currently too little code for Windows in the code base, the first point is limited to Linux and Mac OS X at the moment. The versions of both I can get hold of and, therefore, aim for are as follows:

  • Mac OS X 10.6 (Snow Leopard)
  • Mac OS X 10.7 (Lion)
  • Ubuntu 12.04
  • Fedora 17

To my astonishment, there are only two overlaps in the default compilers across the platforms, namely LLVM-GCC for OS Xen and Clang 3.0 for Linuxen and OS X 10.6. By ‘default’ I mean those that are easily installable without manual compiling or adding strange servers to your package manager’s sources. This limits the compiler choice to the following compilers:

  • GCC
    • 4.2 (Apple LLVM-gcc)
    • 4.4, 4.5, 4.6
    • 4.7
  • Clang
    • 3.0
    • 3.1

As this is already confusing, I hope the following table will clarify the compiler–OS relations a little.

        | Linux                                 | Mac OS X
        | Ubuntu 12.04          | Fedora 17     | 10.6 (Snow Leopard)/Xcode 4.2 | 10.7 (Lion)/Xcode 4.3
GCC     | GNU GCC 4.4, 4.5, 4.6 | GNU GCC 4.7   | Apple LLVM-GCC 4.2            | Apple LLVM-GCC 4.2
Clang   | Clang 3.0             | Clang 3.0     | Clang 3.0                     | Clang 3.1

And as you would guess it, every compiler complains about different things within the same code base. But more on this in a later post.

I plan to frequently post a similar table showing my progress in building the Self VM with the different OS–compiler combinations.

So much for points 1 and 2. Quite orthogonal to this is 3, making the build process itself more approachable.
I will have a separate post on this later on.

Allow me to introduce myself.

My name is Tobias, I’m from Germany and in my spare time I enjoy looking at printed matter, meaning I like typography.

While you are reading this, I’ve laid my fingers on the build process of the Self VM. For a few weeks now I have been looking into how a Self VM is built and what can be done to make this process easier to understand and more intuitive to carry out.

I will irregularly report on my progress and tell about my findings here.

For a start, you can watch my progress on github.

Best
—Tobias

PS: You can participate! Fork on github and try it yourself!

This follows on from my previous post on an error handling mechanism. David Ungar prompted me to think further on how that mechanism could be made more general and powerful, and less of a one-off special exception. This is a description of one way in which certain objects can adjust their behaviour based not only on their delegatees but also on a ‘perspective’ – a viewpoint bound to a process. I think it’s a great example of the power of Self as a language.

Subjectivity and Us

Normal objects in Self are objective – that is their behaviour depends only on themselves and will be the same in all contexts.

In 1996, Randall B Smith and David Ungar wrote a paper called “A simple and unifying approach to subjective objects” (download as a pdf) in which they described a system called ‘Us’ where objects were subjective rather than objective. The behaviour of Us objects (in technical terms the lookup mechanism for delegation) depended on what perspective the message was sent with.

Objects were to be built up in two planes – by normal delegation and by layering pieces onto the whole. Each message received by the object would be dispatched into that three dimensional space based upon not only message selector but also on an implicit argument to the message – the perspective object.

So two people could look at an object in two different ways – to me it might be a circle, to you a set of slots. Or it might look like a pie chart to me and a table to you. An object would be not so much a concrete thing like a pebble but a ‘figment of its viewers beliefs’ (as the authors quote Alan Kay as saying)

Randall and David didn’t build a complete system this way. I don’t know of anyone who has, although the work done on Classboxes in Smalltalk-80 and Java is a very interesting application of similar ideas to modularisation and there are a number of other interesting papers which cite the Us paper.

If I understand Randall and David’s paper correctly, a pure Us implementation would require a check at every lookup, so turning Self into Us would take a lot of effort, including changes to the VM lookup code if it were to be efficient.

However it might be a useful exercise to build a mechanism to investigate and play with subjectivity, even if that mechanism doesn’t get us all the benefits of a full-blown Us.

This is a thought experiment on some of the Us principles. It’s not an implementation of Us and neither Randall nor David should be blamed for its inadequacies! In particular, perspectives in Us are not necessarily bound to a process, nor, as I understand it, do they necessarily hold true beyond the initial method – that is, unlike in the mechanism below, a perspective wouldn’t stay fixed for the call stack until the original method exits. As well, Us wasn’t envisioned as being a capability of Self but as a new language which was a superset of the existing Self language.

On the other hand, as you’ll see below, this is a short and simple experiment!

Mechanism

The mechanism I have in mind looks like this:

To make an object “subjective”, we share the subjective mixin:

o: (| 
    m* = mixins subjective 
  |)

Ordinary slots will behave ordinarily:

o: (|
     m* = mixins subjective.
     x = 1. 
  |).
 o x == 1

To create subjective behaviour, we add layers (with overlapping slots) as parent slots:

o: (|
 m* = mixins subjective.
 default* = (| hello = ('Hello') |).
 french* = (| hello = ('Bonjour') |)
|)

These layers shouldn’t themselves have further parent slots, and they should overlap so that their slots are the same. If we want undefined behaviour we need to do something like (| hello = (|l = lobby| l raiseError) |) rather than leaving the hello slot out of our layer.

Once we have done this, sending our object the message ‘hello’ will get us the string ‘Hello’. However, if we do:

[o hello] @ 'french'

then we will get the result ‘Bonjour’.

What’s happening?

Perspective objects must understand a message forObject: o Selector: s, where o is the current self of the object doing the lookup and s is the selector. They return a canonicalString. Strings know to return themselves, so ‘french’ is a shortcut for a perspective object which always returns ‘french’. The system finds the perspective object by sending a message to self, so

[o hello] @ french

will, as you would expect, look for the perspective object by sending the message ‘french’ to self.

Perspectives are placed on the process. They are not placed in a stack; but the @ message on traits block manages reinstalling the old perspective after itself, so if we have:

([o hello] @ 'french'), ' ', o hello

the result is ‘Bonjour Hello’

OK, so how is this done?

(1) We create the mixin:

mixins subjective = (|
  ambiguousSelector: sel
               Type: t 
          Delegatee: d 
       MethodHolder: m 
          Arguments: a = ( | l = lobby | 

    sel    sendTo: self 
     DelegatingTo: ((l process this perspective forObject: self 
                                                 Selector: sel) sendTo: self)
    WithArguments: a

 ) 
|)

This traps the VM send error message ambiguousSelector:Type:Delegatee:MethodHolder:Arguments: and chooses which parent of our object the message is resent to. It does so by sending forObject:Selector: to the perspective object found on process this; the string that comes back is then sent to our original object to get the appropriate delegatee.

(2) globals process is given a new assignable slot called ‘perspective’ with the contents the string ‘default’.

(3) traits block is given a new slot @

@ p = ( | h. r | 
  h: process this perspective. 
  process this perspective: p. 
  r: onNonLocalReturn: [|:v| process this perspective: h. v ] 
               IfFail: [|:e| process this perspective: h. raiseError ]. 
  process this perspective: h. r 
)

which handles installing a new perspective and cleaning up after itself, and

(4) traits string is given a new slot

forObject: o Selector: s = (self)

so that we can just use a string if we find it easier.

That’s it.
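
For readers who don’t speak Self, here is a loose C++ analogue of the same idea, written as a sketch under heavy assumptions. It is not a translation of the Self code. A thread-local string stands in for the perspective slot on the process, a small helper plays the role of @ by restoring the old perspective afterwards, and a map stands in for the layer parents. All names (current_perspective, with_perspective, Greeter) are invented for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Stand-in for the 'perspective' slot on the current process.
thread_local std::string current_perspective = "default";

// Stand-in for '@': run a block under perspective p, then reinstall the
// previous perspective, even if the block throws.
template <typename Block>
auto with_perspective(const std::string& p, Block block) -> decltype(block()) {
    struct Restore {
        std::string saved;
        ~Restore() { current_perspective = saved; }
    } restore{current_perspective};
    current_perspective = p;
    return block();
}

// Stand-in for an object with one layer per perspective for 'hello'.
class Greeter {
    std::map<std::string, std::string> layers_{
        {"default", "Hello"}, {"french", "Bonjour"}};

public:
    std::string hello() const {
        auto it = layers_.find(current_perspective);
        // As in the Self version, every layer must define the slot.
        assert(it != layers_.end() && "no layer for the current perspective");
        return it->second;
    }
};
```

Calling hello() on a Greeter normally yields the default layer, while wrapping the call in with_perspective("french", ...) selects the French layer for the duration of the block only. Unlike the Self version, this sketch needs the dispatch written into each method by hand, which is exactly the boilerplate the ambiguousSelector trap avoids.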

What can we do with it?

Well, first off let’s implement the error handling style mechanism from last post.

defaultBehavior = "Apart from all the other slots of course" (| 
    m* = mixins subjective. 
    default* = (| error: x = ( "Normal existing error code") |). 
    logErrors* = (| error: x = ( "Some code to log our error but not bring up debugger") |). 
    logErrorsPerspective = 'logErrors' 
|)

Now we’re done. If we try to get

9 / 0

we get a debugger, whereas if we do

[ 9 / 0 ] @ logErrorsPerspective

then any error is silently logged without a debugger being opened.

This isn’t perfect, and there are I’m sure lots of interestingly dangerous corner cases and unexpected behaviours lurking, but it’s a great example of the way in which a simple but well thought out base like Self gives us great power. There are many languages where this wouldn’t be nearly as nice to play with.
