Ugh, stop twitching
object lesson
coding Posted 2007-11-07 15:45:04 by Jim Crawford
Chris Lamont has produced a PDF explaining the Excel formatting bug that led to certain values near 65,536 being displayed as 100,000. On the second page, he writes: “The bug seems to be introduced when the formatting routine was updated from older 16-bit assembly code used in previous versions of Excel to a presumably faster 32-bit version in Excel 2007. It is surprising such a bug slipped through, but to anyone thinking they can write an IEEE 754 floating-point to text routine using only bit twiddling and integer math with no 'sprintf' cheating, please try to write one and see how hard it is to get right!”

The Excel team deserves no sympathy here. The real mistake wasn't the bug: it was that they made a tradeoff that drastically increased the likelihood of bugs -- and drastically decreased the maintainability of their code -- for a negligible speed gain. The mistake was exactly their decision to implement a custom, assembly-optimized decimal print routine rather than just calling sprintf. Or “cheating,” as Lamont puts it.

He later goes on to offer the rationale that “converting floating-point values to text needs to be high performance for Excel.” I'm sorry, but you're not going to sit there and tell me they ran a profile and sprintf came up near the top. The people who write standard library implementations are not chumps. It would take tens of thousands of calls during a single screen update to generate a noticeable hitch on CPUs manufactured in this millennium, and I would be stunned if Excel peaked at a tenth of that. And if it turns out the standard library was implemented by chumps, you're Microsoft and you can afford to switch library providers.
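
If you'd rather measure than take my word for it, here's a rough sketch. It's in Python rather than C, so it exercises Python's printf-style "%" formatting, not the C library's sprintf itself. The names format_cell and time_batch are made up for illustration, the 15-significant-digit format is my assumption about what a spreadsheet cell would want, and 850 * 77.1 is one of the widely reported inputs that triggered the Excel bug (that detail is not from Lamont's quote above). Timings will vary by machine, so treat the numbers as a ballpark only.

```python
import time


def format_cell(value: float) -> str:
    """Render a double with 15 significant digits using printf-style formatting."""
    return "%.15g" % value


def time_batch(values, repeats=5):
    """Return the best wall-clock time, in seconds, to format every value once."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for v in values:
            format_cell(v)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Assumption: 850 * 77.1 is a commonly cited trigger for the display bug.
    # The product is a double sitting just below 65535.
    tricky = 850 * 77.1
    print(repr(tricky), "->", format_cell(tricky))

    # "Tens of thousands of calls" per screen update: time 20,000 conversions
    # of values clustered around the troublesome range.
    cells = [65535.0 - i * 1e-11 for i in range(20_000)]
    print("20,000 conversions: %.2f ms" % (time_batch(cells) * 1e3))
```

The library formatter renders the tricky value as 65535 without any special handling. And if an interpreted language can get through a screenful of conversions without breaking a sweat, a compiled sprintf call isn't going to be your bottleneck either.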

But that's just a basic premature-optimization lesson. As evidenced by the involvement of 16-bit assembly language, this decision was very old. Maybe they hadn't invented profilers back then, who knows? There's another, more interesting way to look at this mistake, and a subtler lesson to learn. As the master put it: “byzantine code paths extract costs as long as they exist, not just as they are written.”

Case in point: a coder on the Excel team took a look at this particular 16-bit assembly routine sometime during the past year or so, failed to realize that it was fucking 2007 already, and decided it was high time to rewrite the code in 1986-era assembly language rather than 1982-era assembly language.

