Keeping pools of objects around so that you can reuse existing ones is an old optimization for avoiding expensive object creation. The very fact that thread creation was noted as a bottleneck, and that this optimization was applied to it, is evidence that thread creation is expensive. Whether pooling works for you depends on your usage pattern, but if your process hangs around for a while, both asking for and returning objects, then the odds are very good that your usage pattern fits.
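
To make the pooling idea concrete, here is a minimal sketch in Python. It is not taken from any particular library: `ObjectPool`, `ExpensiveThing`, and the `max_idle` parameter are names I made up for illustration, and the one-megabyte buffer merely stands in for whatever makes your real objects (threads, connections, and so on) costly to create.

```python
import queue


class ObjectPool:
    """Hand out idle objects; build a new one only when none are idle."""

    def __init__(self, factory, max_idle=8):
        self._factory = factory                      # callable that builds a new object
        self._idle = queue.LifoQueue(maxsize=max_idle)
        self.created = 0                             # illustration only, not synchronized

    def acquire(self):
        try:
            # Reuse the most recently returned object if there is one.
            return self._idle.get_nowait()
        except queue.Empty:
            self.created += 1
            return self._factory()

    def release(self, obj):
        try:
            self._idle.put_nowait(obj)
        except queue.Full:
            pass  # pool already holds max_idle objects; drop this one


class ExpensiveThing:
    """Hypothetical stand-in for something costly to construct."""

    def __init__(self):
        self.buffer = bytearray(1024 * 1024)


pool = ObjectPool(ExpensiveThing)
for _ in range(1000):
    thing = pool.acquire()
    pool.release(thing)
print(pool.created)
```

With this ask-then-return pattern, the loop constructs only one `ExpensiveThing` for its thousand requests, which is exactly the case where pooling pays off. If callers held on to objects and never returned them, the pool would buy nothing.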

The fact that this can beat a select solution means little. As I mentioned, select has its own performance drawbacks, such as busy waits. Note also that the performance of heavily threaded code is strongly OS-dependent. It is no accident that you used NT and Solaris as your examples, but not Linux or HP-UX.

For more, a lot more, I find [link|http://www.kegel.com/c10k.html|The C10K Problem] a good read. I am fairly sure that you have seen it before, but there have probably been updates since you last read it, and it has a lot of links that are just plain good. Among the relatively recent additions are links (which I have admittedly not read) on how non-blocking I/O can be a performance win for Java.

Cheers,
Ben