This is a tale of compacting garbage collectors, buggy Just-In-Time compilers, and phone firmware that should never have been released.
For the past few weeks I've been working on getting our astronomy application working smoothly on the Nokia 6600. Alas, the last firmware glitch I've run across is the nail in the coffin. First, a little background. As you would expect, this application is mathematically intensive, as it provides a visual display of what stars, planets and constellations are visible in the customer's sky for any particular time and location on Earth. When I first got past the general firmware bugs (see below), I noticed a peculiar crash. Randomly, the text labels on the screen showing objects in the sky would disappear, or the horizon line would disappear and then the labels, and then the horizon line might come back. After a while the application would simply freeze up.
I suspected many things at first. It seemed like my arrays were changing values, so I suspected the garbage collector and I removed all calls to System.gc(). Then, as per Nokia's recommendation in their Known Issues document, I added one call to System.gc() along with a 100ms delay to let the garbage collector do its work (why doesn't the system lock until it's done compacting?). No change, the application still lost labels and crashed randomly. I had been using threads and I had heard the Monty VM handled threads differently. I removed all threads (no "new Thread" allocations) but found that on the 6600 Thread.activeCount() returned 5, whereas on the N-Gage it returned 1. Hmm. Still, with no threads I figured something would change. Nope. So then I thought perhaps the VM has one thread for input, one for painting, etc. and perhaps my data was being overwritten. I tried doing some synchronization, and then I tried some more severe synchronization that guaranteed a lock on the data while it was either being read or being created. No change, the application still lost its labels and horizon, and then locked up.
I had to narrow the scope of this problem down, so I started displaying some of my variables on the screen in the crazy hope that I could see where and why it was crashing. This is when I found the last straw that broke the back of my porting hopes. First I noticed that my labels were disappearing because the application truly believed they were all off screen. I suspected memory corruption, but noticed something decidedly more deceptive. When I displayed a particular Y variable, just when it lost the labels, the value of this Y variable changed from -308 to -65228. I thought, dang, that shouldn't be that large (or rather small)! Having lost the ability to do decimal to hexadecimal conversion in my head long ago, I whipped out my calculator. -308 is 0xFECC as a 16-bit short value, while -65228 is -(0xFECC) as an int. Whoa, what was going on here?
Apparently what is happening is my most used code is being translated from bytecode to native instructions by the CLDC HI virtual machine. This is the VM that is in the Nokia 6600, known as "HotSpot" or "Monty". The "HI" stands for "HotSpot Implementation". Anyway, this VM is supposed to analyze running applications and take the sections of code that are most often used and convert them into native instructions. Done correctly, this will produce a tremendous increase in performance. However, I now feel that at least the promotion of a negative short value to an int is translated incorrectly into native instructions on the 6600. What seems to be happening is the native instructions are taking the 16 bits from the short, checking to see if it's negative (high bit set, I presume) and then negating it. Don't ask me why, it should just extend the high bit to the upper 16 bits, but what do I know? Knowing that I use "short" 922 times in my application, I started to lose hope that I could work around this problem.
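The arithmetic behind the -308 versus -65228 observation can be reproduced on any desktop JVM. The following is only a sketch: the class and method names are hypothetical, and suspectedBuggyPromotion() is a model of the behaviour guessed at above, not code recovered from the 6600's JIT compiler.

```java
// Hypothetical demo class; none of these names come from the application itself.
final class ShortPromotionDemo {

    // What the Java language requires: widening a short to an int
    // sign-extends it, so the numeric value is unchanged.
    static int correctPromotion(short s) {
        return s;
    }

    // A model of the suspected faulty translation: read the 16 bits as an
    // unsigned quantity and, if the high bit is set, negate that quantity
    // instead of copying the high bit into the upper 16 bits.
    static int suspectedBuggyPromotion(short s) {
        int bits = s & 0xFFFF;
        return (bits & 0x8000) != 0 ? -bits : bits;
    }

    public static void main(String[] args) {
        short y = -308;
        System.out.println(Integer.toHexString(y & 0xFFFF)); // fecc
        System.out.println(correctPromotion(y));             // -308
        System.out.println(suspectedBuggyPromotion(y));      // -65228
    }
}
```

Positive shorts pass through the model untouched, which would fit with the failure showing up only for some values and only some of the time.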
Concerned that my code was being changed under my feet, and not in an improved way, I attempted several other workarounds. First I inlined the code at risk everywhere I use it. This changed the random crash, but not completely; it would only sometimes lose labels before locking up. I had also heard that someone had success by compiling with the Java SDK version 1.3 instead of 1.4, so I tried that. My code did end up different (200 bytes smaller), but now the randomness had moved and actually disabled my Settings dialog. I tried removing all uses of streams (I only used them for application settings saved into RMS) as I had heard they leak memory on the 6600, and the random crash moved yet again.
Since I do not have any true debugging capability (just text on the display or saved information in RMS), I can't be completely certain what the HotSpot JIT compiler is doing to my code. All I know is that it apparently isn't doing it correctly and I'm tired of trying to work around all of the failures of this particular phone.
I wrote this summary to give back to the community of developers who have unknowingly helped me in my quest to port our product onto the 6600. More than anyone (including the latest Nokia Known Issues for the 6600 document), the developers on this forum have pointed out the oddities of this phone and potential workarounds. For the record, here are the ones that bit our application:
A) The first firmware anomaly we ran into was the Form.append() problem, which is really a problem with StringItems that have no labels. Form.append() works fine, but on the Nokia 6600 you can't append a StringItem if it has a null as its label, which is what happens when you do something like Form.append("Hello World");. Do Form.append(new StringItem("Hello", "World")); instead.
B) The second firmware idiosyncrasy I found was that the Calendar class appears, well, useless. Calendar.getInstance() just does not work, it appears, for time zones that are negative. But since that only includes the entire western hemisphere, I figured big deal right? Anyway, I could only work around this problem by removing all uses of the Calendar class and then I used System.currentTimeMillis() which I fudged into a Julian Day value and then converted that into Gregorian calendar values (year, month, day, etc.). Also note that TimeZone.getDefault() does not work either, which may be the root cause of the Calendar.getInstance() problem. Good thing I already had a database that included the time zones for our customers, or it would've been all over.
C) Number three was the problem with ChoiceGroups. This was a simple problem: when creating a ChoiceGroup with a label, you will need to call setLabel() on the ChoiceGroup to make the label stick on the 6600. I find it odd that StringItems with no labels crash the phone, yet the phone throws away the labels given to ChoiceGroups. Go figure.
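For item B, the Calendar-free route can be sketched roughly as follows. This is not the application's actual code: the class name, the offsetMinutes parameter (standing in for a time zone looked up from a database) and the choice of the Richards integer algorithm for the Julian-Day-to-Gregorian step are all assumptions made for illustration. It uses only long and int arithmetic, and it assumes the local time falls after January 1, 1970 so that plain integer division behaves as a floor.

```java
// Hypothetical helper; a sketch of a Calendar-free date conversion.
final class JulianDayClock {
    private static final long MILLIS_PER_DAY = 86400000L;
    // Julian Day Number of the civil day 1970-01-01.
    private static final long JDN_OF_UNIX_EPOCH = 2440588L;

    // Returns {year, month, day, hour, minute, second} in local time.
    // utcMillis is what System.currentTimeMillis() returns; offsetMinutes is
    // the zone offset from UTC (negative in the western hemisphere).
    static int[] toCivil(long utcMillis, int offsetMinutes) {
        long localMillis = utcMillis + offsetMinutes * 60000L;
        long jdn = localMillis / MILLIS_PER_DAY + JDN_OF_UNIX_EPOCH;
        int msOfDay = (int) (localMillis % MILLIS_PER_DAY);

        // Richards' integer algorithm: Julian Day Number to Gregorian date.
        long f = jdn + 1401 + (((4 * jdn + 274277) / 146097) * 3) / 4 - 38;
        long e = 4 * f + 3;
        long g = (e % 1461) / 4;
        long h = 5 * g + 2;
        int day = (int) ((h % 153) / 5 + 1);
        int month = (int) ((h / 153 + 2) % 12 + 1);
        int year = (int) (e / 1461 - 4716 + (12 + 2 - month) / 12);

        return new int[] {
            year, month, day,
            msOfDay / 3600000, (msOfDay / 60000) % 60, (msOfDay / 1000) % 60
        };
    }

    public static void main(String[] args) {
        // 2000-03-01 00:00:00 UTC viewed from UTC-8.
        int[] t = toCivil(951868800000L, -480);
        System.out.println(t[0] + "-" + t[1] + "-" + t[2] + " " + t[3] + ":" + t[4] + ":" + t[5]);
        // prints 2000-2-29 16:0:0
    }
}
```

On the phone, the first argument would be System.currentTimeMillis() and the second would come from whatever time zone data the application already carries, since TimeZone.getDefault() cannot be trusted there.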
The firmware for the phone I'm using is version 4.09.1, which sounds like the most recent. At least it should be since I just purchased this phone last week for $400 from T-Mobile. Kind of a waste of $400, but a frustrating waste as well.
Best of luck to all of you in your porting endeavors!