Moderator/Panelists: James Hoe (Carnegie Mellon University), Joel Emer (Intel), Doug Burger (Microsoft and University of Texas at Austin)
Microprocessors are simulated at every development stage to evaluate their functionality, performance, and lately, power and temperature. Because microprocessor complexity generally grows faster than microprocessor performance, some believe the performance of accurate simulators has been effectively slowing down relative to the simulated system. This panel will discuss (i) whether this problem really exists and, if so, (ii) techniques to significantly improve simulation performance, specifically through various simulator parallelization techniques.
Joel Emer's position statement
Recalling that the purpose of architecture research is to provide a sufficiently compelling case for an idea to proceed toward design, I believe that there are an increasing number of cases where our current methodologies are ineffective. That is, they are incapable of providing compelling evidence of the merit of a design. Examples include large multiprocessor systems and large caches (especially shared caches). In each of these cases, simulating at an adequate level of fidelity (which I believe is quite high) results in simulation lengths measured in days and weeks. Such simulation lengths are impractical for wide exploration of a design space. I would further argue that the more radical a proposal is, the more insufficient our current approaches are, because more and longer benchmarks are needed to be convincing.
At this time, the most promising approach that I am aware of is to use FPGAs as a platform for performance modeling. Such an approach has the appeal of running simulations more rapidly than a pure software simulator and being more flexible for design space exploration than a prototype. Unfortunately, it has the disadvantage of turning a software programming exercise into a hardware design process. Thus, to be practical, I believe we need to develop a more systematic and easier-to-program approach to hardware design. Some of the attributes of such an approach include more modularity, well-defined simulation primitives, debugging aids, and a higher-level representation than traditional hardware design languages provide.
Doug Burger's position statement
Our evaluation methodologies to date have relied on shared research into general-purpose computing, where the community works together to innovate on a shared substrate. The success of infrastructures such as SimpleScalar, M5, SimOS, and others results from this shared model, which many researchers use to advance the shared state of the art. The exponential growth in transistors will continue for a few more generations at least. However, the shift away from devoting the bulk of the transistors to more powerful single cores, coupled with the ongoing changes in the computing industry, is making traditional microarchitectural simulation increasingly irrelevant. The on-chip real estate is going toward SoC functions (such as the inclusion of network interfaces and memory controllers), accelerators (such as graphics units), and additional cores. My view is that advances in general-purpose computing, while still important, have slowed appreciably, and that the era of such advances may be coming to an end. Our simulation infrastructures, and in fact our methodologies, are not capable of simulating the rapid changes in workloads and system requirements, and in particular are not ready for a world in which systems optimize for a domain of workloads or even specific workloads. As the community fragments, we need higher-level system models that permit rapid, high-level estimation of the capability of a specific innovation. This model is quite different from what researchers currently use, and is perhaps different from some of the shared infrastructure projects in flight today.