Project objective. Data transfers account for a significant fraction of the overall energy and power of today’s multicores, and greatly impact their performance. While an arithmetic operation consumes about 20pJ at the 28nm technology node, transferring the operands from the other corner of a 400mm² chip requires 1nJ and tens of processor cycles. At the same time, power and bandwidth constraints impede the performance of multicore chips and limit their scalability, as exemplified by the advent of dark silicon. This project takes a fresh look at the management and transfer of data across on-chip storage locations, and proposes a hardware-software co-design that eliminates unnecessary communication and increases energy efficiency. The design is enhanced through the introduction of nanophotonics, which enables the implementation of a manycore “virtual chip” by splitting a large single-chip design into multiple smaller chiplets and connecting them via optical fibers. The chiplets are spread far apart in space to allow for efficient cooling with conventional forced-air technology, while the low latency and high bandwidth density of optical signaling maintain the tight coupling of cores characteristic of single-chip designs. Thus, the system architecture pushes back the power and bandwidth walls to create virtual chips at a scale and energy efficiency impossible to realize with conventional technology.

Intellectual merit. Our preliminary investigation in energy-aware cache management shows that we can eliminate half of the messages sent on an on-chip interconnect, minimize the electrical distance of the remaining ones, and minimize directory capacity, overall saving more than 20% of the processor energy. Utilizing nanophotonics allows for even higher energy efficiency, and pushes back the power and bandwidth walls to create virtual chips at a scale impossible to realize with conventional technology. Our results indicate that our design attains an average speedup of 1.8x (2.2x maximum) over the best single- and multichip alternatives and a 41% (61% maximum) smaller energy-delay product, while scaling beyond 4K cores.

Integrated educational plan. The advent of multicores has brought large-scale parallel systems to the masses for the first time. As such, understanding their architectural trade-offs and design considerations has become an educational requirement for future computer scientists and engineers. We address this concern by integrating a rigorous educational plan into our research agenda. This plan (a) strongly connects research to education by integrating the research tools we’ll develop into teaching activities, incorporating research findings into the curriculum, and developing new graduate and undergraduate courses, (b) fosters collaborative student projects and broadens the participation of undergraduates, women, and minorities in research, (c) expands our educational activities through tutorials and new workshops at major conferences, and (d) brings together the computer architecture and photonics communities through focused workshops.

Broader impact. Energy consumption is a limiter of growth and big science, and negatively impacts both the environment and the economics of computing. Worldwide computer energy consumption reached 408TWh in 2010, with 150TWh in the U.S. alone, accounting for 3.8% of U.S. energy production for a total of $15B. Energy consumption hampers big science, as realizing an exascale machine requires a 200x reduction in energy per instruction. At the same time, multicore chips cannot realize their full potentia
Effective start/end date: 5/1/15 → 4/30/21
- National Science Foundation (CCF-1453853)