NCR and others had 32+ CPU boxes ~10 years ago, IIRC. OS/2 has had threading for many, many years. There is not going to be a quick transition to multithreaded applications if the past is any guide.
I read the article as saying that the past isn't any guide; that is, that the environment has changed. Ten years ago, the ROI wasn't there for threading (compared to the ROI of simply waiting for Moore's Law to catch up).
What I do see happening, potentially, is some small company coming up with a plug-in VM-like system that sits on the OS or on the hardware and lets existing applications take better advantage of parallelism in the hardware.
I see mainstream programming languages growing libraries that make concurrent applications easier to write. There has been some talk about this recently for Java and Python. If C# can do it without overcomplicating things (check out COM threading sometime ;) then we might get it at the language level industry-wide.
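To make "libraries that make concurrent applications easier to write" a bit more concrete, here is a rough sketch using Python's standard `concurrent.futures` module. That is just one present-day example of this kind of library, not necessarily what was being discussed at the time, and the function names `count_words` and `parallel_word_counts` are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def count_words(text):
    # Ordinary sequential function; it knows nothing about threads.
    return len(text.split())


def parallel_word_counts(texts, max_workers=4):
    # The library owns thread creation, scheduling, and shutdown.
    # pool.map returns results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(count_words, texts))


if __name__ == "__main__":
    print(parallel_word_counts(["one two", "three four five", ""]))
```

The point is that the calling code never touches a lock or a thread object. One caveat: in CPython, threads mostly help with I/O-bound work, so for CPU-bound jobs you would probably swap in `ProcessPoolExecutor` from the same module.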