Sunday, December 24, 2006

Checksum, Please

I was playing around with file comparisons recently, and I thought I could have some fun with crc32(). It's a common function, so it's probably in the system somewhere already. I asked man about it and there it is, in the 'n' section of the man pages:

DESCRIPTION
This package provides a Tcl-only implementation of the CRC-32 algorithm based upon information provided at http://www.naaccr.org/standard/crc32/document.html If the Trf package is available then the crc-zlib command is used to perform the calculation.

Tcl-only, great. And searching the Xcode docs just led to the same man page. But I couldn't believe there's not a C function in the system somewhere, so I asked Google to have a look at the Darwin source, and found this page. Ah, it's part of zlib, as mentioned in the man page description. No wonder Apple has no docs describing it.

So next time you need a 32-bit checksum, add -lz to your linker flags and send me a postcard.
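For the curious, here's a rough sketch of what that checksum computes, written as a plain bit-at-a-time loop so it builds without zlib. The name crc32_bitwise is made up for illustration and is not zlib's API; as far as I know it uses the same CRC-32 variant as zlib (reflected polynomial 0xEDB88320), so it should agree with the real thing, just far more slowly.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of zlib: bit-at-a-time CRC-32 using the
   reflected polynomial 0xEDB88320. */
uint32_t crc32_bitwise(const unsigned char *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;              /* standard initial value */

    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];                       /* fold in the next byte */
        for (int bit = 0; bit < 8; bit++) {
            /* If the low bit is set, shift and XOR in the polynomial. */
            if (crc & 1u)
                crc = (crc >> 1) ^ 0xEDB88320u;
            else
                crc >>= 1;
        }
    }
    return crc ^ 0xFFFFFFFFu;                /* standard final inversion */
}
```

With zlib itself, the equivalent call after including <zlib.h> and linking with -lz is roughly crc32(crc32(0L, Z_NULL, 0), buf, len).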

Friday, December 22, 2006

MacMob

I propose a new Mac software promotion.

For the next few days, participating developers will set aside 20% of their profits and pool the cash. Using the cash, we will procure an assortment of small arms and blunt objects, and hire a P.I. to find the jackass who shat on Aaron's holidays.

Monday, December 18, 2006

Of Balls and Whiners

I've never been paid to write code, so I'm ill-qualified to form an opinion on the whole MacHeist thing. I do however have a keen eye for idiocy, and one group in particular seems in need of calling-out. To paraphrase:

"Screw you, Gus! You just lost a potential and/or current customer!"

What? Let's try to understand the reasoning behind such a statement, starting with the most illogical.

Possibility the First:
These people actually think there is some correlation between a developer's opinions or attitude and his products. It could happen I guess. If a developer can maintain a list of pirated serial numbers, and make their app not work with them, it stands to reason that they could just as easily maintain a list of apps that have been included in the MacHeist bundle, and make their apps not work when those other apps are installed. And if the developer has balls enough to speak his mind without trying to kiss anyone's ass, he's certainly capable of making his app trash your whole system! Therefore, as cautious consumers and MacHeist fans, we would all do well to avoid apps from that developer.

Possibility the Second:
These people actually think that the revenue lost by their secession will have an impact great enough to make a developer see the error of his ways. That sounds reasonable, maybe. Assuming Flying Meat has fewer than ten paying customers, that's a sizeable chunk of income to lose.

Possibility the Third:
These people actually think that although a single user's threat would have little impact, a concerted effort by many organized users will have the desired effect. So, if we all band together and sign the petition, we can force him to change his ways.

Possibility the Fourth:
These people subconsciously realize that they're bringing nothing to the table and feel embarrassed to burden Gus any longer with all their support issues. The only sensible thing to do is to remove themselves from the picture in a flaming puff of glory.

It never occurred to me before all this, but now I'm wondering: When I'm shopping for software, how much do I care about the developer's personal life? Say the guy sacrifices goats and small children on weekends. Can he write code that makes me do that? No? Ok, then, yeah, I'll buy his apps. On the other hand, say he keeps his mouth shut when he takes issue with something, to avoid possibly stepping on anyone's toes. I wouldn't respect him quite as much, but I'd still buy his apps.

I guess I shouldn't complain, really. All of the informed opinions on both sides of the issue have given me things to think about that otherwise wouldn't have occurred to me. And the whiners balanced it out with some much needed comedy.

Sunday, December 17, 2006

Implicit memcpy(3) calls

Peter Hosey was kind enough to document the speed difference between malloc() followed by bzero() and calloc(). They are two different ways to allocate and zero a block of memory, but it turns out that calloc() is much faster.
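For anyone who hasn't seen the two side by side, here's a minimal sketch of both approaches. The function names are made up for illustration, and I've used memset() as a stand-in for bzero(), which does the same job here.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate first, then zero in a second step. */
void *alloc_then_zero(size_t count, size_t size)
{
    /* The caller is assumed to pass values whose product doesn't overflow. */
    void *p = malloc(count * size);
    if (p != NULL)
        memset(p, 0, count * size);   /* same effect as bzero(p, count * size) */
    return p;
}

/* Hypothetical helper: let calloc() hand back already-zeroed memory. */
void *alloc_zeroed(size_t count, size_t size)
{
    return calloc(count, size);
}
```

Both return a zero-filled block (or NULL on failure) that the caller releases with free(); the only difference is who does the zeroing.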

I was wondering how much of that speed difference is the result of having one fewer function call in your code, and that reminded me of a lesson I learned a few months ago...

Some time ago, I picked up the practice of initializing all local variables at their declarations, like:

double aDouble = 0.0;

In many cases, it's not really necessary, and the coders who disagreed with the practice correctly pointed out that I was simply adding extra code and extending the execution time. My reasoning was that not having garbage data in a local variable is worth the extra few cycles it takes to zero an integer or even a double.

At the time, I was a PowerPlant junkie, and would initialize LStrings with Str_Empty or whatever it was called. And all was good.

Since I've switched to Xcode, I've been using C strings more than ever before. NSMutableString is great when it's possible to use it, but creating, modifying and releasing a few hundred million of them gets annoying. So I was using lots of C strings and initializing them to zeroed bytes like so:

char aString[1000] = {0};

I knew that only the first byte needed to be zero to qualify the local variable as a null string, but I didn't like garbage data. In addition, if any of my subsequent string handling code accidentally left out the null terminator, it would still be there. It was a win/win situation, and it worked flawlessly.

As it turned out, my adoption of the initialize-blindly-at-declaration practice slowed things down considerably when dealing with C strings. After optimizing all I could, Shark told me that about 55% of my code's running time was spent in memcpy(). At first I thought that was a good sign. After all, I was not calling memcpy() explicitly in my code, so it must have been invoked by printf, scanf, and the like. And I figured that when more than half the time is spent in low level system functions, I've done all I can.

Out of curiosity, I disassembled the app I was working on with otool and searched for "memcpy". Holy shit, it's everywhere. It's not only being invoked from printf/scanf (which Shark would have told me if I had asked). This is when the lesson sank in: zeroing an entire C string of arbitrary length is a Bad Idea™. gcc translated my harmless {0} into a memcpy call. Combine that with 10 or 20 C strings per method, with each method being called a few million times, and I began to see why memcpy was the latest culprit according to Shark. Although I was not calling memcpy() explicitly, gcc was. A lot.

So I still initialize atomic variables in their declarations, but I'm much more careful when using composite data types. Before typing this post I just used my instinct to decide which way to go. But since Peter was generous enough to provide some empirical data, I may as well follow suit.

Not surprisingly, PowerPC and Intel code behave quite differently. Here's what I found when using the initialize-at-declaration approach for C strings. The following disassembly is modified output from otx.

PowerPC
If the char array is less than or equal to 32 bytes, gcc produces inline load/store instructions. For example, a 16 byte string:

3c400004  lis   r2,0x4
3842ef54  addi  r2,r2,0xef54
80020000  lwz   r0,0x0(r2)
81220004  lwz   r9,0x4(r2)
81620008  lwz   r11,0x8(r2)
8042000c  lwz   r2,0xc(r2)
901e0018  stw   r0,0x18(r30)
913e001c  stw   r9,0x1c(r30)
917e0020  stw   r11,0x20(r30)
905e0024  stw   r2,0x24(r30)

where the data pointed to by 0x3ef54 is a bunch of zeroes, and r30 is a copy of the stack pointer.

If the char array is greater than 32 bytes, gcc inserts a call to memcpy().

Intel
If the char array is less than or equal to 64 bytes, gcc produces inline move instructions. For example, the same 16 byte string:

c745e800000000  movl  $0x00000000,0xffffffe8(%ebp)
c745ec00000000  movl  $0x00000000,0xffffffec(%ebp)
c745f000000000  movl  $0x00000000,0xfffffff0(%ebp)
c745f400000000  movl  $0x00000000,0xfffffff4(%ebp)

If the char array is greater than 64 bytes, gcc inserts a call to either memset() or memcpy().

So apparently, once zeroing your C string would take more than about 16 inline load/store or move instructions (32 bytes on PowerPC, 64 bytes on Intel), gcc will insert a call to memcpy() or memset() instead. Moral of the story: only zero the first byte of a C string, unless it's absolutely necessary, or speed doesn't matter.
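To make the two styles concrete, here's a small sketch. The function names and BUF_LEN are made up for illustration, and whether the first version really turns into a library call depends on your compiler, flags, and architecture.

```c
#include <stdio.h>
#include <string.h>

#define BUF_LEN 1000   /* hypothetical buffer size, as in the example above */

/* Zeroes all BUF_LEN bytes on every call; for a buffer this large the
   compiler may emit a memset()/memcpy() call to do it. */
size_t format_full_zero(int value)
{
    char aString[BUF_LEN] = {0};
    snprintf(aString, sizeof aString, "value=%d", value);
    return strlen(aString);
}

/* Writes only the terminator, which is enough to make it an empty string.
   The rest of the buffer holds garbage until something overwrites it. */
size_t format_first_byte(int value)
{
    char aString[BUF_LEN];
    aString[0] = '\0';
    snprintf(aString, sizeof aString, "value=%d", value);
    return strlen(aString);
}
```

Both produce the same string; the second just skips clearing the 999 bytes that nothing reads. The trade-off is the one mentioned earlier: any later code that forgets its null terminator no longer has a zeroed buffer to fall back on.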

Saturday, December 16, 2006

Belated C4 words

0xC4 == 196, which is roughly the number of Mac laptops I saw there. I gotta get me one of those. Jokes aside, there were actually 98 attendees, which seems somehow related to 196...

Summaries of the presentations have been posted all over, so I'll just relay my personal experiences. One thing that I noticed was how accessible everyone was. Nobody cared that I don't have any apps out there, or that I asked a lot of stupid questions. A good example would be Wolf Rentzsch. Here's a successful consultant type who's forgotten more than I've learned, and is still totally easy to talk to. I hope I'm that nice when I'm rich and famous. Thanks again, Wolf, for the whole show!

Bob Frank is another good example. Snakebite himself. I chatted with him several times, but he's more into Java, so we didn't have a lot in common. Still, he was more than happy to answer stupid questions.

I also got to chat a bit with Gus Mueller. I had to raise my voice a little because he's approximately 8 feet tall. I think the people insulting him on his blog lately don't know just how big this guy is. Seriously, don't mess with Gus. Nice guy though, and it must have been a good feeling to see at least half of the presenters using VoodooPad.

Speaking of tall, I met Aaron Hillegass. WTF! I was looking up at him even when he was on the other side of the room. Another nice guy. I had fun telling him all about how I hadn't read any of his books, and never heard of the ranch and boot camp, etc.

Actually, that was a recurring theme for me. Hopefully by next year I'll have more experience with some of the awesome apps these guys write. One app that caught my eye was Knox. Marko Karppinen was sitting at my table on day 2, and he showed me all the cool stuff it does. If you need encrypted disk images, give it a shot.

Another guy at my table was Travis Cripps. They say you gotta watch out for the quiet types, and that's him. He knows his stuff, keep an eye out for his apps.

I talked quite a bit with Chris Patterson. Sadly, Chris is currently writing non-Mac code. We met up with one of his coworkers, Daron something, at Jak's Tap after day 1. That was awesome, because we're all former PowerPlant geeks. Beer, multiple inheritance, stack based classes, beer, polymorphism, RTTI, beer. The chicks were diggin it.

That reminds me of the only thing I regret about C4. On day 2 there was a guy sitting behind me who spoke up after every single presentation. Normally I wouldn't have remembered him, but the thing is, he didn't ask a single question. It was always some comment or suggestion or polite disagreement. I kept wondering who this guy was, who seemed to know just as much or more about every presenter's topic. So I glanced at his name tag and googled him after it was all over: Ben Artin...

Hey, that's funny- his pgp key is right next to meeroh's key on the keyserver. I haven't heard from meeroh in a long time. If you've written more than 3 lines of PowerPlant code, you probably remember meeroh from comp.sys.mac.oop.powerplant. Most of the questions I had over the years were answered by him, sometimes in a previous post, sometimes my own. That dude was smart as hell, wouldn't it be great to meet him...

Oh, look- meeroh changed his name to Ben Artin.

Dammit! That was meeroh! 10 feet away from me all day. No wonder he seemed to know so much. He did.

So, I didn't meet meeroh/Ben. Maybe next time. But I did talk to lots of excellent coders, and had a blast. Quick shouts to Paul Kafasis, John Gruber, Steve Dekorte, DB, Daniel Jalkut, Brent Simmons, and Ed, Mike and Kevin from Apple. Hopefully I'll see all of you again next year...

Friday, December 15, 2006

numMacDevBlogs++;

Welcome, all, to YAMDB. I write Mac code and I have opinions, so it's only fitting that I should have a blog. Also, Daniel Jalkut told me to start one, and I don't want to disappoint.

I've been writing Mac code for about 8 years, and only recently took the Obj-C/Cocoa plunge. So far, no regrets other than the message dispatch overhead (flame at will). It does seem curious, though, how "Mac development" seems to be synonymous with "Cocoa development" these days. Maybe it's just me, but now that I have a reason to learn Unix, I find the underlying system just as fascinating as Apple's latest API. Before the flames engulf me: I am not insulting Cocoa, and I am enjoying learning more about Cocoa every day from all the other Mac devs. It's simply easier for me to learn the Unix stuff, with good ole C.

With that in mind, I hope to balance out the Cocoa-heavy blogosphere with a bit of the old-fashioned, procedural Unix goodness. I reserve the right to also post Cocoa stuff occasionally.

I also hope to keep things purely technical and avoid politics. Given the current climate of MacHeist, phantom exploits, and general Wintards, we'll see how that pans out.