Intel of course - like I can afford a real machine :) It's gcc 3.3 with default options.

I'm sure the math is using a hybrid of the FPU and software emulation. The extra bytes must surely play the same padding role in the 64-bit-mantissa/96-bit-storage format as the extra bits do in the 53-bit-mantissa/80-bit case.

(You may already know this, but just in case: in numerical computing, accuracy is worthless if you lose it all at once in a calculation due to overflow or underflow, so the thing to do is to underflow gradually, so you can halt a calculation before it loses significance. The FPU has a "condition" or "status" register with a flag that indicates when a calculation has underflowed. The underflow itself can often be absorbed by switching to a "denormalized" number format that carries fewer bits of precision but can represent smaller magnitudes. So the math code can grind away right up to this point of "gradual underflow" and then stop, with assurance of full precision up to that point.)