At PARC we estimated that we could get
about a factor of 5 from special low-level (hardware plus firmware)
design. If Moore's Law is doubling every 18 months, then this is about
3.5 years of being ahead if you can compete with ordinary silicon (your
factor of 8 would be 3 turns of Moore's Law, or about 4.5 years). The
Alto was a big success because it took Chuck Thacker and two
technicians only about 3.5 *months* to make the first machine, and it
only took another month to move the first Smalltalk over from the
NOVA to the Alto. So we were off and running.
If we believe Chuck's estimate that we've lost about a factor of a
thousand in efficiency from the poor design choices (in many
dimensions) of Intel and Motorola (and Chuck is a very conservative
estimator), then this is 10 doublings lost, or about 180 months, or
about *15 years* for Moore's Law to catch up to a really good scheme.
This is a good argument for trying very different architectures that
directly target highly efficient VHLL (very-high-level language) *system* execution.
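
The Moore's Law arithmetic above can be checked with a short sketch. It assumes only what the text states, an 18-month doubling period; the function name `lead_in_years` is made up for illustration.

```python
import math

DOUBLING_MONTHS = 18  # doubling period assumed in the text


def lead_in_years(speedup, doubling_months=DOUBLING_MONTHS):
    """Years of Moore's Law growth needed to match a given speedup factor."""
    doublings = math.log2(speedup)  # how many doublings the factor represents
    return doublings * doubling_months / 12


# The three factors mentioned in the text: 5 (PARC estimate),
# 8 (the correspondent's figure), 1000 (Chuck Thacker's estimate).
for factor in (5, 8, 1000):
    print(f"factor {factor}: {math.log2(factor):.2f} doublings, "
          f"{lead_in_years(factor):.1f} years")
```

This gives roughly 3.5 years for a factor of 5, 4.5 years for a factor of 8, and just under 15 years for a factor of 1000, in line with the rough figures quoted above.
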
A small group approaching this should try to do everything with
modern CAD and avoid getting messed up with intricate packaging
problems of many different types. So I would look at one of the
modern processes that allows CPU and memory logic to be on the same
die and try to make what is essentially an entire machine on that die
(especially with regard to how processing, memories and switching
relate). Just how the various parallelisms trade off these days and
what is on and off chip would be interesting to explore. A good
motivator from some years ago (so it would be done a little
differently today) is Henry Fuchs's "Pixel-Planes" architecture for
making a renderer as a smart memory system that has a processor for
each pixel. Such a system can have a slower clock and still beat
the pants off a faster single processor von Neumann type architecture.
-Alan Kay
[link|http://lists.squeakfoundation.org/pipermail/squeak-dev/2003-March/055371.html|http://lists.squeakf...March/055371.html]