Gee. I'd think a Perl guru like Ben should be able to whip
something together pretty easily...
:-)
I've done a little browsing around with Google.
[link|http://xml.apache.org/fop/index.html|FOP] is a Java XSL to PDF tool. Since HTML can be viewed as a subset of XML, perhaps something like this could be useful. The Apache site above also has Perl tools for XML, but none of them seem to be related to PDF output (that I could find).
Adobe has tools for PDF to HTML, but not the other way around (that I've found), and they're limited to Mac and Windows.
[link|http://www.pdfzone.com/products/software/tool_activepdfwebgrabber.html|activePDF WebGrabber] is a Windows tool which does HTML to PDF.
[link|http://www.pdfzone.com/products/software/tool_html2ps.html|html2ps] is a Perl HTML to PS script. It claims to support much of the HTML4 spec, and its documentation says "incidentally, the PostScript and PDF versions of the HTML 4.0 draft, were generated using html2ps" and "When converting the PostScript document to PDF - using some other program such as version 5.0 or later of Aladdin Ghostscript, or Adobe Acrobat Distiller - the original hyperlinks in the HTML documents will be retained in the PDF document."
[link|http://www.pdfzone.com/products/software/tool_HTMLDOC.html|HTMLDOC] has similar claims about some HTML 4 support.
Looks like the last two links above (html2ps and HTMLDOC) are worth investigating.
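For what it's worth, the html2ps route could be scripted as a two-step pipeline: HTML to PostScript, then PostScript to PDF. Below is a rough sketch only, and I haven't tried it on real pages. It assumes `html2ps` and Ghostscript's `ps2pdf` wrapper are both installed and on the PATH, and that `html2ps` accepts `-o` for the output file. The function names `build_commands` and `html_to_pdf` are made up for illustration.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def build_commands(html_path, pdf_path):
    """Return the two command lines: HTML -> PS, then PS -> PDF.

    The intermediate PostScript file is placed next to the PDF,
    using the same name with a .ps suffix.
    """
    ps_path = str(Path(pdf_path).with_suffix(".ps"))
    return [
        # Assumption: html2ps takes "-o <file>" for its output.
        ["html2ps", "-o", ps_path, str(html_path)],
        # ps2pdf is the Ghostscript wrapper: ps2pdf input.ps output.pdf
        ["ps2pdf", ps_path, str(pdf_path)],
    ]


def html_to_pdf(html_path, pdf_path):
    """Run both steps, stopping at the first failure."""
    for cmd in build_commands(html_path, pdf_path):
        if shutil.which(cmd[0]) is None:
            raise RuntimeError(f"{cmd[0]} not found on PATH")
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Usage: python html_to_pdf.py page.html page.pdf
    if len(sys.argv) == 3:
        html_to_pdf(sys.argv[1], sys.argv[2])
```

Per the html2ps claim quoted above, the hyperlinks should survive the second step if the Ghostscript version is recent enough, but that would need checking.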
HTH.
Cheers,
Scott.