Thursday, March 8, 2007

A Tale of Two CPUs

Exploring and comparing "Accelerated Objective-C Dispatch" on PPC and x86

Links to Darwin source in this post require an Apple ID. They're free, so get one.

I learned about the Obj-C runtime pages last year, and was reminded by this post to one of Apple's mailing lists. Mr. Maebe did a great job of explaining things, but there's always more to the story.

Who 0xfffeff00 is

Mr. Feffoo is a copy of the _objc_msgSend assembly routine defined in objc-msg-ppc.s. As Mr. Maebe pointed out, the only documentation is the source code. The easiest way to find it is to google 0xfffeff00. Apple's policy of requiring a login to view their source code seems to keep Google's spider at bay, but our old dead friend, DarwinSource, is still cached. The first occurrence of 0xfffeff00 that turns up is in objc-rtp.h.

<side note>
While we're looking at _objc_msgSend, I'd like to remind everyone who says "messaging nil objects returns nil" that our friend Peter Ammon showed us that this is not always the case. If your code makes that assumption, it wouldn't hurt to fix it now.
</side note>

Notice that this is the PPC version of Darwin. 0xfffeff00 is the value assigned to kRTAddress_objc_msgSend, but only for PPC. While this header file is very interesting, its associated source file is the real fun:

Copyright 2004, Apple Computer, Inc.
Author: Jim Laskey

Implementation of the "objc runtime pages", an fixed area
in high memory that can be reached via an absolute branch.

Hats off to Mr. Laskey for this file. In my opinion, what happens herein is some of the most interesting code in Darwin. I'll leave it to the reader to enjoy the code that actually copies the _objc_msgSend routine to high memory. Code that treats code like the data it is always makes me smile.

PPC Limits Set

The magic number here is 0xffff8000. Above this address lie the "comm pages", the documentation for which eludes me now (anyone?). This address is important because a PPC absolute conditional branch can reach any address at or above it, which allows the kernel to copy some commonly used functions into those virtual memory pages.

In the Programming Environments Manual, the 'bc', 'bca', 'bcl', and 'bcla' instructions are defined as having a 14-bit address field concatenated with two low zero bits, so that the destination address is guaranteed to be a multiple of four. That makes it effectively a 16-bit address field. While 'bc' and 'bcl' have their charms, we're more interested in the absolute variants.

If we set the high bit, or sign bit, of the address field and clear the lower bits, the CPU's sign extension produces the number 0xffff8000. And if we use the absolute variants 'bca' and 'bcla', we can branch to that address directly (as opposed to our current address minus 0x8000).
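For the skeptics, here's a rough C sketch of that sign extension. The function name and its parameters are mine, not anything from the PEM or Darwin; it only models the arithmetic described above (append two zero bits, then sign-extend to 32 bits).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: model how a PPC branch turns its address field
 * into a 32-bit value.
 *   field      - raw contents of the address field
 *   field_bits - width of that field (14 for the conditional branches)
 * The field is concatenated with two low zero bits and sign-extended. */
static uint32_t branch_target(uint32_t field, unsigned field_bits)
{
    assert(field_bits >= 1 && field_bits <= 24);

    unsigned total_bits = field_bits + 2;                 /* field || 0b00 */
    uint32_t value = (field << 2) & ((UINT32_C(1) << total_bits) - 1);
    uint32_t sign  = UINT32_C(1) << (total_bits - 1);

    /* Standard sign-extension idiom: flip the sign bit, then subtract it. */
    return (value ^ sign) - sign;
}
```

Feeding it a 14-bit field with only the top bit set, branch_target(0x2000, 14), gives 0xffff8000, and a field of all ones, branch_target(0x3fff, 14), gives 0xfffffffc, the last word of the address space.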

PPC Limits Broken

The comm pages are nice, but they're already used for standard ANSI and/or Unix C functions, not Apple's Obj-C stuff. So how does Apple implement similar behavior for common Obj-C runtime routines like _objc_msgSend?

Simple: build more comm pages below the ones that already exist, and use a branch instruction with a broader range that can reach those addresses.

Assuming you're viewing the conditional branch documentation in the PEM, scroll up a bit to the unconditional branch instructions, specifically 'ba' and 'bla'. Notice the 24 bits available in the destination address field. This field is also concatenated with 2 low zeroes, which effectively gives us 26 bits. (Sure, we could use 'bcctr' to get a full 32 bits, but that would require 2 extra instructions to load the address and would clobber the count register.)

So, if we sacrifice the ability to branch conditionally, we gain 10 bits, or 1024 bytes of addressability*. If we use the unconditional branch instructions, we can branch to anywhere at or above 0xfe000000, which is well below the addresses of the routines defined in objc-rtp.m.
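If you'd rather let the compiler do the arithmetic, here's a small C sketch. The helper names are made up for this post; the only inputs taken from above are the two field widths (14 and 24 bits) and Mr. Feffoo's address.

```c
#include <stdint.h>

/* Hypothetical helper: lowest high-memory address an absolute branch can
 * reach, given the width of its address field. The field gains two low
 * zero bits; setting only the sign bit and sign-extending gives the floor. */
static uint32_t absolute_branch_floor(unsigned field_bits)
{
    unsigned total_bits = field_bits + 2;
    /* Ones from the sign bit upward, zeros below it. */
    return (uint32_t)~((UINT32_C(1) << (total_bits - 1)) - 1);
}

/* Hypothetical helper: nonzero if a word-aligned address sits in the
 * high-memory range that such a branch can reach. */
static int reachable_in_high_memory(uint32_t address, unsigned field_bits)
{
    return (address & 3) == 0 && address >= absolute_branch_floor(field_bits);
}
```

absolute_branch_floor(14) is 0xffff8000 and absolute_branch_floor(24) is 0xfe000000. The gap between them is 0x01ff8000, or 33,521,664 bytes. And 0xfffeff00 passes reachable_in_high_memory() with a 24-bit field but fails with a 14-bit one, which is the whole point of the exercise.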


The _objc_msgSend assembly routine is copied into high RAM at launch time, and PPC code that was compiled with Xcode's "Accelerated Objective-C Dispatch" option uses the high-RAM version of _objc_msgSend instead of the usual symbol stub.
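To make that concrete, here's a sketch of what a single branch-and-link to Mr. Feffoo looks like as a machine word. I'm assuming the call is emitted as a 'bla' (I-form: primary opcode 18, the 24-bit address field, AA=1, LK=1); the encoder below is my own illustration, not anything lifted from the compiler or from Darwin.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical encoder for a PPC 'bla target' instruction.
 * Layout (most significant bit first):
 *   6 bits  primary opcode (18)
 *   24 bits address field (target with its two low zero bits dropped)
 *   1 bit   AA = 1, absolute address
 *   1 bit   LK = 1, save the return address in the link register */
static uint32_t encode_bla(uint32_t target)
{
    /* Must be word-aligned and within reach of a sign-extended 26-bit value. */
    assert((target & 3) == 0);
    assert(target >= UINT32_C(0xfe000000) || target < UINT32_C(0x02000000));

    return (UINT32_C(18) << 26)               /* opcode            */
         | (target & UINT32_C(0x03fffffc))    /* address field     */
         | UINT32_C(2)                        /* AA                */
         | UINT32_C(1);                       /* LK                */
}
```

encode_bla(0xfffeff00) comes out as 0x4bfeff03: one instruction, no stub, no count register clobbered.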

Intel Wins For Once

I don't like x86 assembly and machine code any more than the next guy, but here's something that doesn't suck about the Intel chips:

Darwin x86 has no runtime pages, because it doesn't need them. Not being able to load a 32-bit number in a single instruction is a limitation of the PPC ISA that Intel doesn't share. x86 Obj-C code is "accelerated" by design, and the "Accelerated Objective-C Dispatch" option has no effect whatsoever. Since x86 binaries don't need symbol stubs, they don't need the "accelerated" workaround.

It's not inconceivable that the "Accelerated Objective-C Dispatch" option will only exist for as long as Apple supports PPC machines.

One More Thing™

Though it's only loosely related to this article, it's interesting to note the following code from objc-rtp.m:

// initialize code in ObjC runtime pages
rtp_set_up_objc_msgSend(kRTAddress_objc_msgSend, kRTSize_objc_msgSend);

rtp_set_up_other(kRTAddress_objc_assign_ivar, kRTSize_objc_assign_ivar,
"objc_assign_ivar", objc_assign_ivar_gc, objc_assign_ivar_non_gc);

An entire function to assign a value to an instance variable? And an optimized version at that? Hmm.

While I currently have no NDA obligation with Apple, I've been told that the information at which this code hints is hush-hush for now. So, out of respect for The Masters, I will only direct the reader to Xcode's "Project Settings" window. Specifically, the description of the "-fobjc-gc" flag.

For those who don't want to fish around for details, "gc" stands for "Great Code".

Many thanks to Mr. Laskey, Mr. Ammon, and all of the Apple coders working on the Obj-C runtime. Our applications only execute correctly and easily because you have all applied an enormous effort in designing a crucial part of the OS X underbelly. We all appreciate your work.

If any of you attend C4[1], the drinks are on me.

* I meant to say 33,521,664 bytes, not 1024 bytes. Minor oversight, sorry about that.
