Thanks for sharing! It looks like a single-pass dispatching parser (unless I missed something on a cursory glance). The bit about constructing a huge descending-length regex was interesting. I had forgotten about the WeeCodes, since I never use 'em. ;) They account for a good chunk of the handlers.
Reading parts of it makes me sad to be stuck on 1.5.2, however. I can see some significant speedups if one could just use newer libs*, which of course a full rewrite would address. Plus the logic is already built; that's 90% of the battle.
* _Perhaps_ with HTMLParser, but I wouldn't be stuck on that.