Maximally Scalable Computing
via Physics-Based Models & Architectures
(working title)
ABSTRACT
for SGER and/or CAREER proposal
Michael P. Frank
CISE Department
University of Florida
June 6, 2000
The following is a draft abstract describing a line of research to be proposed
for NSF support under SGER, CAREER, and/or possibly other programs.
Motivation and Background
We are all aware of the great economic and social benefits that our increasingly
cost-effective computing technologies have been contributing to the nation
and to humankind, benefits that have been especially apparent during the
record-setting U.S. economic expansion of the last decade. How, as
scientists, can we best help to ensure that rapid improvements in computing
technology will continue for as long as possible?
Over the next few decades, assuming that Moore's Law [] continues to
hold, the sizes of our lowest-level computational devices (initially MOSFET
transistors, then whatever replaces them) will rapidly approach the nanometer
scale, and affordable computational systems will contain exponentially increasing
numbers of such devices. As a result, the question of whether those
devices are being operated and organized in the most asymptotically efficient
manner becomes increasingly important. With the number of devices
per machine approaching the trillions and beyond, even a small polynomial
deficit in asymptotic efficiency can easily overwhelm typical constant-factor
differences between architectures, and translate into a significant real
shortfall in computer performance relative to what could potentially be achieved.
To gain insight into the most asymptotically efficient and most scalable
possible ways of organizing computation, my colleagues and I have been
studying models of computation based directly on the (known) computational
capabilities and limits of physics itself [,]. Our models incorporate
all known fundamental physical constraints on information processing, such
as the speed-of-light limit, fundamental quantum limits on information
density and processing rates, and the very important constraints posed
by physical reversibility and the second law of thermodynamics. (Our
models are also perfectly compatible with the possibility of quantum-coherent
computing [,], though it is not yet clear whether that particular feature
will become practical to implement within the foreseeable future.)
The key feature of all our models is that they permit computations to
be performed in a manner that can be made both logically and thermodynamically
reversible
to an arbitrarily high degree of perfection (which must be scaled up modestly
with machine size). These models are thermodynamically efficient,
continually recycling nearly all of the energy used to encode their computational
states. This allows the devices to be effectively packed into dense,
three-dimensional configurations that decrease communication delays and
thereby speed up many parallel computations.
We have posed the very strong conjecture, which we call the "Tight Church's
Thesis" [,], that our preferred model (or some minor variation of it) is,
for all classes of computational tasks, asymptotically exactly as
efficient (to within only a constant factor, not a polynomial one) as the most
efficient possible physical mechanism for performing the task. In
fact, we have proven that our model, in its physical instantiation, is
strictly more efficient, by polynomial factors, than all bounded-temperature
physical implementations of traditional (irreversible) models of computation,
for a broad class of problems.
The major attraction for theoretical computer science of having such
a tight model is that it permits developing exact predictions of
the asymptotic performance of aggressive algorithms, and studying asymptotically
exact complexity classes, without the worry that the model's predictions
will fail to scale well to very large machine sizes, or that some more
efficient model will come along later to undercut one's lower bounds by
some polynomial factor. In contrast, the sort of model that we propose
could only be unseated if there were surprising new developments in fundamental
physics.
For computer engineers, the attraction of these physical models is that
they provide a very direct and explicit guide to the design of efficient
computer architectures, especially in the nano-scale regime. We have
calculated that, for a wide variety of proposed nano-scale computing
mechanisms and reasonable thermal constraints, any densely-packed machine
larger than a few microns or millimeters across will operate faster if
it uses our reversible model than if it uses any physically possible
implementation of any irreversible model of computation.
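To convey the flavor of this calculation, here is a rough, simplified sketch
of the underlying heat-removal argument (idealized assumptions: a cube-shaped
machine of edge length L, a fixed density of devices, a fixed maximum heat
flux through the machine's surface, and a fixed operating temperature T).
The total power dissipated, which grows as

    (number of devices) x (energy dissipated per operation) x (clock frequency)  ~  L^3 * E_op * f,

must fit through the surface, whose heat-removal capacity grows only as L^2.
For irreversible operation, E_op has a fixed lower bound (at least k_B T ln 2
per bit erased), so the clock frequency f must fall off as 1/L as the machine
grows. For asymptotically reversible (adiabatic) operation, E_op itself shrinks
in proportion to f, so E_op * f ~ f^2 and f need only fall off as 1/sqrt(L).
Beyond some crossover size, the reversible machine therefore runs faster,
both per device and in total.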
In the course of our research to date, we have actually implemented
and tested physical instantiations of our model using standard silicon
VLSI processes, including a simple universal parallel architecture, and
a more conventional RISC-style CPU. With present technology, the
number of devices in an inexpensive machine is not yet so great as to yield
benefits in raw performance per unit cost from the use of reversible technology.
However, even today, reversible technology can yield enormous increases
in performance per unit power, which may have near-term practical
applications in contexts such as mobile, embedded, or space-based systems
where power limitations are critical.
Finally, an interesting long-term insight gained from our work on reversible
computing is that the underlying physical fact of reversibility cannot
be efficiently hidden away at a low level, and instead, for optimal scalability
of computation to ever-larger numbers of processing elements, reversibility
must eventually be propagated all the way up to the level of the instruction-set
architecture, programming language, and application algorithms. So,
a large fraction of the field of computer science will eventually need
to evolve to accommodate a significant concern with reversibility, if we
are truly to approach the ultimate physical limits of computing.
Proposed Plan of Research
The aim of the proposed research is fourfold:
-
To continue to refine our theoretical understanding of the fundamental
physical limits of computation;
-
To continue our theoretical and design work on models of computation, computer
architectures, programming languages, and algorithms that are explicitly
designed with the intent to attain the maximum asymptotic performance that
is compatible with physical law, within competitive constant factors;
-
To continue our work on developing implementations of our designs that
are cost-effective for near-term specialized applications in low-power
computing and densely-packed supercomputing;
-
To use our insights from the physics of computation to help guide the widespread,
longer-term efforts to develop a cost-effective, general-purpose nano-scale
computing technology to replace the MOSFET transistor, once that device
has reached its physical limits.
Here are my plans for working towards these four goals, in more detail.
1. Fundamental physical limits.
This part of the work will involve an ongoing personal effort, with assistance
from my colleagues, to understand more, and at a deeper level, about the
fundamental implications that physics has for computation (and vice-versa).
This is a long-term effort. Currently evident aspects of this effort
include:
-
Quantum computing. It is important for this effort to continue
to keep up with the latest developments in quantum computing (theoretical
& experimental). If and when practical, scalable quantum computing
starts to look feasible, it should be included in our model, and we should
focus more effort on developing quantum-coherent algorithms.
-
Entropy density. We would like to learn more about the fundamental
quantum limits on entropy/information densities. Presently understood
bounds [,] are rather loose and approximate (one illustrative example is
recalled at the end of this list). We would like to know more accurately
how high an information density can realistically be achieved, as a function
of energy density.
-
Relationship between physical entropy and computational information.
We think we have gained significant new insights into the meaning of physical
entropy as a result of our work in reversible computation. Specifically,
physical entropy can be simply defined, for all practical purposes, as the amount
of physically expressed information in a system that (for one reason or
another) cannot be uncomputed. (Here, "uncomputing" refers
to any process that clears some redundant, encoded information by performing,
in reverse, a series of physical steps that would have generated that information.)
We have been invited by the journal Entropy to contribute a paper
to discuss this viewpoint in greater depth.
-
Relativistic quantum models. I would like to better understand
the computational implications, if any, of relativistic quantum physics
(QED, QCD, etc.). To this end I will educate myself further about
these fields.
-
General relativity. We would also like to attain a better
understanding of the (very long-term) computational implications of general
relativity. Gravitational limits are the only fundamental physical
constraint on computation that we know of that is not yet well reflected
in our models. (Of course, gravitational effects are not expected
to become important any time soon, so this task has a relatively low priority.)
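As one illustrative point of reference for the entropy-density item above
(quoted here only for orientation, not as a result of this project), the
Bekenstein bound limits the entropy S of a system with total energy E confined
within a radius R to roughly

    S  <=  2 pi k_B E R / (hbar c).

Bounds of this general form are far from tight for ordinary matter, which is
precisely why sharper, technology-relevant estimates of achievable information
density are worth pursuing.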
2. Physics-based computing models and architectures.
The second major goal of this research is to design and develop optimally
efficient and effective computing architectures, at all levels from theoretical
models through processor architectures, languages, and algorithms.
This work includes:
-
Model building and dissemination. Although the basic structure
of our physical models of computing has already been defined [], more work
remains to be done in formalizing the models more thoroughly, and especially
in disseminating what we have learned about the advantages of these types
of models to the wider theoretical computer science community, with more
publications in more widely-read journals.
-
Asymptotic advantages of reversible models. Though we have
already published two papers and a thesis chapter [,] dealing with this topic,
some of our latest results in this area have not yet appeared in journal form,
and all of these results need to be cleaned up and presented more formally
and rigorously. Some further extensions of the scaling theory are
also possible.
-
Theoretical bounds on overhead from reversibility. The polynomial
asymptotic performance advantages of reversible models, which arise from
thermal constraints, are attenuated by small algorithmic overheads that
are generally incurred in reversible computing (a well-known point of
reference is recalled after this list). We have made significant
progress [,] towards establishing tight theoretical lower bounds on the
worst-case amount of this overhead, and we believe this work can be
completed fairly readily. This will enable a more precise and definitive
characterization of the maximum asymptotic speedup that is attainable through
reversibility in the general case.
-
Reversible algorithms. Beyond the general case, reversibility
yields even greater speedups for many specific classes of problems for
which highly efficient reversible algorithms exist []. This line
of work will develop efficient reversible equivalents for a wide variety
of important conventional algorithms (a toy illustration of the reversible
programming style appears after this list).
-
Reversible hardware architectures. There is of course the
issue of coming up with an efficient hardware architecture, with a convenient
programming model. Conventional irreversible architectures are still
evolving rapidly, with many interesting recent innovations such as reconfigurable
[] and processor-in-memory [] architectures. We foresee an ongoing
effort to develop efficient partially- to fully-reversible versions of
the best ideas in conventional irreversible architectures.
-
Reversible programming languages. Similarly, we would like
to develop asymptotically efficient reversible variants of the best new
ideas in programming languages. We note that a large part of the
challenge of developing an asymptotically efficient programming language
is to develop an efficient & convenient language for expressing parallel
computations. This is a problem that to our knowledge the parallel
computing community has not yet solved in a fully satisfactory way.
Our work will aim to integrate our insights in the physics of computing
with the parallel programming challenge. (For example, perhaps an
effective reversible, parallel programming language would be similar to
a language for specifying an assembly-line process in a physical factory.
This makes sense because information, like material objects, cannot just
appear and disappear, but rather can be moved from place to place, assembled
(computed), and disassembled (uncomputed).)
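As a point of reference for the item on theoretical bounds on overhead above
(a standard known result, recalled here only for orientation): Bennett's
reversible-simulation construction shows that any irreversible computation
taking time T and space S can be simulated reversibly in roughly time
O(T^(1+ε)) and space O(S log T), for any fixed ε > 0; that is, the algorithmic
overhead of reversibility is at most a small polynomial. The proposed work
aims to pin down how small this overhead must be in the worst case.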
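To give a concrete (if toy) flavor of the reversible-algorithms and
reversible-language items above, the following sketch, written in ordinary
Python purely for illustration (the function names are ours and are not part
of any proposed system), uses only invertible updates together with the
standard compute / copy-out / uncompute pattern:

    def fib_forward(n):
        """Compute (F(n), F(n+1)) using only invertible update steps."""
        a, b = 0, 1
        for _ in range(n):
            a, b = b, a + b      # each step (a, b) <- (b, a+b) has an exact inverse
        return a, b

    def fib_backward(n, a, b):
        """Run the same steps in reverse, recovering the initial state (0, 1)."""
        for _ in range(n):
            a, b = b - a, a      # exact inverse of the forward step
        return a, b

    def reversible_fib(n):
        a, b = fib_forward(n)    # compute (in general, this phase creates temporary data)
        result = a               # copy out the answer (in hardware: XOR into a cleared register)
        assert fib_backward(n, a, b) == (0, 1)   # uncompute, restoring the starting state
        return result

Because every step has an inverse, no information is thrown away until we
explicitly choose to discard it, which is exactly the property that reversible
hardware can exploit to avoid dissipating the corresponding energy; it also
matches the assembly-line analogy above, in which intermediate results are
assembled and later disassembled rather than simply discarded.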
3. Development of Practical Near-Term Applications.
There are a number of interdisciplinary challenges that remain for bringing
reversible computing ideas to practical fruition, even for the more near-term
applications such as ultra-low-power computing in energy-limited environments,
and for densely-configured supercomputing. These challenges include:
-
Power supplies. Develop a high-Q resonant oscillator
with the right characteristics for driving MOSFET-based reversible ("adiabatic")
circuits (the basic energy-scaling relation that such supplies exploit is
sketched at the end of this list). Currently we think that MEMS-based
mechanical oscillators look like the best approach, as they can attain Q's
in the tens of thousands, and sufficiently high frequencies, and are not too
large. I am collaborating with MEMS experts Toshi Nishida (in UF's ECE
department) and Sam Miller (at Sandia National Labs) to work out the details
of this approach, and I am also looking at analog designs with power electronics
expert Khai Ngo, also in ECE.
-
Leakage bounds. More accurately characterize the expected leakage
of future MOSFET technology generations, in order to determine upper
bounds on the feasible performance/power ratio of adiabatic technology.
(Leakage is an important limiting factor on all low-power MOSFET technologies.)
Current estimates are that adiabatic technology will enable substantial
low-power advantages throughout at least the next 10 years of MOSFET technology,
but for a more precise forecast, I am talking with researchers who are
involved in producing the ITRS (International Technology Roadmap for Semiconductors).
-
Packaging. For dense supercomputing applications, better 3-D
packaging and SOI (silicon-on-insulator) technologies need to be developed
to enable large numbers (hundreds to thousands) of layers of reversible
circuitry to be cost-effectively integrated into modestly-sized packages.
Sandia National Labs also has experience in that area, and I am talking
with Thomas Fischer, head of their packaging division, about working together
on this problem.
-
RAMs. Good, dense adiabatic RAMs with long-term data retention
need to be developed. Adiabatic SRAM and DRAM designs exist, but
the SRAMs are not very dense, and the DRAMs may have problems with refresh
rates. I am talking with Joe Brewer, an expert in nonvolatile RAMs,
about alternative ideas in this area.
-
Functional elements. Work needs to be done on developing more
efficient adiabatic versions of critical functional circuit elements such
as adders, multipliers, etc. This will involve developing hardware
equivalents of known efficient reversible algorithms. This work will
be done in collaboration with Bill Eisenstadt, a VLSI circuits expert in
UF's ECE department.
-
Architecture & algorithms. The reversible architecture
and algorithms work mentioned in the previous section will of course also
contribute to this effort, although practical near-term reversible architectures
need not be fully reversible.
-
Applications. Last but not least, more work is needed to identify
specific, suitable near-term applications for this technology. In
small systems, the greater energy efficiency of the adiabatic approach
comes with a price tag of somewhat lower logic-hardware efficiency.
Therefore, it is not cost-effective except in applications where total
costs are limited more by energy-related costs than by the cost of the
logic hardware itself. Identifying such applications and quantifying
the improvements in cost-efficiency that can be attained is of course critical
to the development of reversible computing as a practical near-term technology
for the low-power niche. I am drawing upon a number of university
and industry contacts for more information about application requirements
in this area.
-
Experimental validation. In addition to creating the above-mentioned
designs & characterizing them theoretically & through simulation,
we have a strong desire to thoroughly validate our work with physical experiments
as we go along. This will ensure that our understanding of the relevant
physical issues does not miss anything critical. All of the above
design elements, except for the new packaging technology, can be implemented
straightforwardly using academically available design tools and lithographic
processes. (For advanced packaging, we will rely more on collaborations
with industrial partners such as Sandia.) Testing of components and
systems can primarily proceed in a typical electronics workbench environment,
although we would like to augment the electronic testing with thermal imaging
and high-precision thermoelectrics-based calorimetry (such as that described
in []) to enable us to isolate parts of our system that are dissipating
too much power, and also to accurately measure the dissipation in subsystems
that are operating in an extremely low-power regime.
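As background for the power-supply item above (a standard first-order relation,
stated here only as a sketch): charging a node capacitance C through an
effective resistance R by ramping the supply voltage V over a time t_ramp >> RC
dissipates roughly

    E_adiabatic  ~  (RC / t_ramp) * C V^2    per transition,

compared with roughly (1/2) C V^2 for conventional abrupt switching. The
dissipation can thus be made arbitrarily small by slowing the ramp, provided
the ramp energy itself is recovered rather than discarded - which is exactly
what a sufficiently high-Q resonant supply makes possible.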
4. Help guide efforts towards nano-scale computing technologies.
Part of this work will involve keeping up with the latest research developments
in all the promising areas dealing with potential future nano-scale computing
mechanisms, for example:
-
Nanoelectronic structures (quantum dots, resonant tunneling transistors,
hybrid structures) []
-
Superconducting electronics []
-
Molecular/nanotube electronics []
-
Nano-mechanical computing []
-
Various advanced cooling technologies.
Of course, this list will change as promising new technologies arise.
The purpose of studying these emerging technologies is to enable accurate
forecasting of their computationally relevant parameters, which
will allow us to determine more accurately the size/cost scale of machines
above which our new asymptotically optimal architectures will begin to
dominate. Thus we will be better able to anticipate the best time to focus
our maximum effort on pushing these novel architectures into the mainstream.
In addition to influencing computer architecture, there is an important
need to disseminate to researchers who are working on new low-level device
technologies some of the important insights about device design that have
been learned from studies of physical models of computing. These
insights include the following:
-
Any nano-scale computational device must be designed to support
an asymptotically thermodynamically reversible ("adiabatic") mode of operation;
otherwise, it will not be competitive with devices that do, except in very
small computing systems.
-
A key parameter determining the overall computational efficiency of a reversible
device is something which we call the entropy coefficient - the
ratio between the device's entropy generation per operation and its frequency
of operation (this ratio is roughly constant with frequency in many asymptotically
reversible mechanisms; it is written out as a formula after this list). This
parameter should be minimized. We know of no fundamental lower bound on how
small it may be.
-
The entropy generation of the device when it is operated irreversibly should
also be minimized. (Though there is an unavoidable lower bound on
this parameter, namely k_B ln 2 per bit erased.)
-
The minimum rate of leakage of a device's energy out of its intended state
space is another critical technological parameter. If this parameter
cannot be made insignificantly small, then it sets an upper bound on the
scalability of 3-D architectures, even if they are reversible. We
don't know of any fundamental lower bounds on this parameter. In
fact, we know that in both large MOSFET devices and nano-mechanical molecular
devices [], it can indeed be made insignificantly small.
critical that the developers of other candidate nano-scale technologies
also pay attention to this parameter, or they will find themselves uncompetitive
for massively parallel computing. Also, this parameter must be kept
below a certain bound [] in order for a technology to be a potential candidate
for supporting scalable quantum-coherent computing.
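To make the entropy-coefficient and erasure-bound items above concrete (the
notation here is introduced purely for illustration): if a device generates
entropy dS_op per operation when clocked at frequency f, then in many
asymptotically reversible mechanisms

    dS_op  ~  c_S * f,

where c_S is the entropy coefficient; the entropy generated per operation
therefore vanishes as the device is operated more slowly, and c_S should be
made as small as the technology allows. Irreversible (bit-erasing) operation,
by contrast, is subject to Landauer's bound of at least k_B ln 2 of entropy
per bit erased, corresponding to roughly 3 x 10^-21 J of heat per bit at
room temperature.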
We believe there is a valuable role to be played in making sure that applied
physicists and device technologists who are working on future nano-scale
computing mechanisms are well aware of these issues, as it will help enable
machines based on these mechanisms to be as scalable as possible.
Conclusion
In summary, I am planning an ambitious, multifaceted research career, pursuing
an interdisciplinary but unified set of scientific and engineering studies
that are expected to have a significant impact across many areas of computer
engineering and computer science, and that ultimately will help computing
performance to continue to improve as rapidly as possible, for as long
as possible, and thereby help to foster continuing national and international
prosperity.
I am preparing now to submit NSF proposals to continue the significant
start that I have already made towards this goal during the course of my
Ph.D. work and in my first year as a faculty member at UF.
I anticipate making even more rapid progress in the above-outlined areas
over the next few years, as I recruit graduate students, continue to build
my industry & university collaborations, and publish a variety of journal
articles describing this work (I have already been invited to submit articles
relating to this work to four different journals). Over time,
I hope that the continuation of this project, together with its network
of collaborators, will become an important nexus in the worldwide research
effort pushing boldly onwards towards the ultimate physical limits of computation.
I hope that the NSF and its directorates will agree with me that the
proposed research program represents a significant potential long-term
scientific contribution towards the future of our technology and of our
civilization, and that the time is now ripe to pursue it.