Source: EEWeb
By: Lauro Rizzatti and Jean-Marie Brunet
This is part 1 of a 3-part series called Hardware Emulation: Realizing its Potential. The series looks at the progression of hardware emulation from its early days to well into the new millennium. Part 1 covers the first decade, from the technology's inception around 1986 to roughly 2000, when it found only limited use. Part 2 traces the changes to hardware emulation as it became the hub of system-on-chip (SoC) design verification methodology. Part 3 envisions what to expect from hardware emulation in the future, given the dramatic changes in the semiconductor and electronic system design process.
Historical Perspective
In the second half of the 1980s, a new verification technology was conceived. Called hardware emulation, it was promoted as being able to verify and debug the largest chip designs being built.
The driving notion was to substitute the logic simulator with an array of Field Programmable Gate Arrays (FPGAs) configured to mimic, or “emulate,” the behavior of the design-under-test (DUT). This ambitious objective required a vast collection of FPGAs, numbering in the hundreds, mounted on boards, installed in cabinets, and interconnected via active backplanes populated by yet another array of FPGAs.
Beyond re-programmability, the main advantage was speed of execution, which was much faster than that of the typical software simulator. The high speed allowed the emulator to be connected to a physical target system, a deployment scheme known as in-circuit emulation (ICE).
Early emulators were difficult and time-consuming to deploy, and the industry coined the expression “time-to-emulation” (TTE) to measure the time required to bring up the DUT to the point where emulation could begin. TTE was typically measured in months, often long enough that the emulator was not ready for use until after the foundry released first silicon, defeating its purpose.
These emulators were single-user resources, which contributed to their high cost of ownership (COO). In fact, emulators were more expensive than most engineering budgets could manage. As a result, only the most difficult designs, such as graphics chips and processors, benefited from hardware emulation.
Table I summarizes the characteristics of the Quickturn System Realizer launched in 1994. Back then, it was the state of the art in emulation.
Table I: Specifications of a popular hardware emulator circa 1995, considered state of the art at the time. (Source: 1992 Quickturn news release)

| Quickturn System Realizer | |
| --- | --- |
| Programmable Device | Xilinx XC4013 |
| Architecture | Array of commercial FPGAs |
| Design Capacity | Up to ~2.2 million gates |
| Deployment Mode | ICE |
| Speed of Emulation | Up to 8 MHz |
| Time-to-Emulation | Months |
| Compilation Time | Days |
| Deployment | Very difficult |
| Support | Intense |
| Concurrent Users | Single user only |
| Remote Access | No |
| Cost | ~$3 per gate |
Early Hardware Emulator Deployment
Deployment of the early emulation platform proceeded in steps. First, an engineer had to compile the DUT for emulation on an off-line server. Once the design binary object was finalized, it had to be loaded onto the array of FPGAs inside the emulator. At this stage, emulation and debugging could begin.
DUT Compilation
The compiler had to convert a DUT gate-level netlist or, later, a DUT register transfer level (RTL) description into a timing-correct, fully placed and routed system of tens or even hundreds of FPGAs.
One of the toughest issues in an array of FPGAs was, and still is, the limited number of input/output pins, which prevents full use of each FPGA's internal resources. This created a bottleneck that the FPGA partitioner had to overcome.
Different companies experimented with different schemes to alleviate it. Among the solutions, Quickturn invented the partial crossbar, while Virtual Machine Works devised an FPGA compiler technology called Virtual Wire that transmitted several I/O signals on a single pin under the control of a timing-scheduling process, a synchronous multiplexing method.
In general, asynchronous pin multiplexing was the most common approach. A less-than-perfect partitioner could assign overly large blocks of logic to one or more FPGAs, increasing interconnectivity to the point of requiring a pin-multiplexing factor of 32x, 64x, or higher, which produced a dramatic drop in emulation speed. Likewise, a partitioner that did not handle timing (in those days, none did) could introduce long critical paths by routing combinational signals through multiple FPGAs, called hops, dramatically limiting the maximum speed of emulation.
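To get a feel for the arithmetic, consider a deliberately simplified, hypothetical model (an illustration only, not any vendor's actual scheme): the signals cut by the partitioner at an FPGA boundary are divided by the available pins to obtain a multiplexing factor, and the effective emulation frequency scales down with that factor and with the number of hops on the longest path.

```python
# A minimal, illustrative sketch (not any specific emulator's algorithm) of how
# the pin-multiplexing factor and inter-FPGA hops erode emulation speed.

def required_mux_factor(cut_signals: int, physical_pins: int) -> int:
    """Signals crossing an FPGA boundary divided by available pins,
    rounded up to the next power of two (a common hardware choice)."""
    factor = -(-cut_signals // physical_pins)  # ceiling division
    power = 1
    while power < factor:
        power *= 2
    return power

def effective_emulation_mhz(fpga_clock_mhz: float, mux_factor: int, hops: int) -> float:
    """Rough model: each design-clock cycle waits for the longest combinational
    path to cross `hops` FPGA boundaries, each crossing serialized over
    `mux_factor` minor cycles."""
    return fpga_clock_mhz / (mux_factor * max(hops, 1))

# Hypothetical numbers: 4,000 nets cut at one boundary, 256 usable I/O pins,
# a 30 MHz FPGA fabric clock, and a worst-case path spanning 3 FPGAs.
mux = required_mux_factor(cut_signals=4000, physical_pins=256)   # -> 16
print(mux, effective_emulation_mhz(30.0, mux, hops=3))           # -> 16 0.625
```

Even this toy model shows how a partition that doubles the cut size can halve the achievable frequency, which is why so much engineering effort went into the partitioner and the interconnect scheme.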
Another source of serious trouble was the routing of clock trees. The unpredictable layout of the DUT onto the FPGA fabric typically yielded a large number of timing violations, and countless experts devised a myriad of clever ideas to tame the issue.
DUT memories were implemented via memory models that configured the on-board standard memory chips to act as specialized memory devices. Creating memory models was not rocket science, but it added one more task to the deployment of the emulator.
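Conceptually, a memory model is an adapter: it makes a generic on-board RAM respond like the DUT-specific memory the design expects. The sketch below is purely illustrative, written in Python rather than the hardware actually used at the time, and every interface in it is hypothetical.

```python
# Illustrative only: a conceptual "memory model" that makes a plain on-board
# RAM behave like a DUT-specific memory device. Real emulators did this in
# hardware; all interfaces here are hypothetical.

class OnBoardRAM:
    """Generic storage, standing in for the emulator's standard memory chips."""
    def __init__(self, depth: int):
        self.data = [0] * depth

class DualPortModel:
    """Adapts the generic RAM to look like a two-port DUT memory by
    time-sharing the single physical port across both logical ports."""
    def __init__(self, ram: OnBoardRAM):
        self.ram = ram

    def cycle(self, port_a=None, port_b=None):
        """Each emulation cycle services port A first, then port B.
        A port request is a tuple (write_enable, address, write_data)."""
        results = []
        for req in (port_a, port_b):
            if req is None:
                results.append(None)
                continue
            we, addr, wdata = req
            if we:
                self.ram.data[addr] = wdata
            results.append(self.ram.data[addr])
        return results

mem = DualPortModel(OnBoardRAM(depth=1024))
mem.cycle(port_a=(True, 0x10, 0xABCD))        # write via port A
print(mem.cycle(port_b=(False, 0x10, 0)))     # read back via port B -> [None, 43981]
```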
Only after all of the above was done could the FPGAs be placed and routed.
Run-time Control and Design Debug
The early emulators did not have an operating system (OS). Instead, they came with a runtime control program that performed two main functions.
First, it loaded the design object, a bitstream used to configure the FPGAs, into the box and configured its hardware to mimic the DUT behavior. Once this mapping process succeeded, the runtime control connected the physical target system to the DUT and waited in stand-by mode for the user to start or stop the run.
The second function was a rather rudimentary design debugger. The lack of visibility into commercial FPGAs was addressed by compiling a limited number of probes into the design, providing only a small space window into the DUT. Every time a missing probe was needed, the design had to be fully recompiled, stretching out design iteration times (DIT). The DUT's internal activity was captured in on-board memories with limited storage, which constrained the time window into the DUT. Figure 1 captures these restricted space/time windows graphically.
Figure 1: Three trace windows were restricted in time and space due to limited storage. (Source: Authors)
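The sketch below makes the two limits concrete in hypothetical form (illustrative only, not how any emulator was actually implemented): only signals compiled in as probes are visible at all, and a fixed-depth capture buffer keeps just the most recent samples.

```python
# Hypothetical sketch of the space/time debug limitation: visibility is bounded
# by the probes compiled into the design (space window) and by the depth of the
# on-board capture memory (time window).
from collections import deque

class TraceBuffer:
    def __init__(self, compiled_probes, depth):
        self.probes = list(compiled_probes)      # fixed at compile time
        self.samples = deque(maxlen=depth)       # older samples fall off the end

    def capture(self, cycle, dut_state):
        # Only compiled-in probes are recorded; everything else is invisible.
        visible = {name: dut_state.get(name) for name in self.probes}
        self.samples.append((cycle, visible))

    def waveform(self, signal):
        if signal not in self.probes:
            raise KeyError(f"'{signal}' was not compiled as a probe; "
                           "adding it means a full recompile.")
        return [(cycle, values[signal]) for cycle, values in self.samples]

trace = TraceBuffer(compiled_probes=["fifo_full", "state"], depth=4)
for cycle in range(10):
    trace.capture(cycle, {"fifo_full": cycle % 3 == 0, "state": cycle % 4, "hidden": cycle})
print(trace.waveform("state"))     # only the last 4 cycles survive
# trace.waveform("hidden")         # would raise: not a compiled-in probe
```

Asking for a signal outside the probe list corresponds, in the real flow, to going all the way back through compilation, which is what stretched design iteration times.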
Hardware Emulation versus FPGA Prototyping
The idea of configuring a programmable device to behave like a digital design before committing it to silicon pre-dated hardware emulation. That idea gave rise to a technology called FPGA prototyping. Commercial FPGA prototyping offerings arrived at the end of the 1990s from two pioneers: Dini and Aptix Corporation.
Differences between Hardware Emulation and FPGA Prototyping
From the perspective of the hardware architecture, an emulation system is more complex than a commercial FPGA prototyping board. Off-the-shelf FPGA prototyping boards comprise a few FPGA devices, although boards custom-built for internal use could reach into the hundreds of chips.
By contrast, emulation systems must be highly scalable, encompassing up to several hundred programmable devices, and they sacrifice some performance for the sake of that scalability. They offer the flexibility to model any type of DUT memory via pre-configured memory models, but the approach limits the maximum speed of the emulator.
Conversely, on an FPGA prototyping board, DUT memories are implemented with physical memory chips of equivalent functionality, allowing higher execution speed than equivalent memory models.
Further, emulators must ensure that sufficient signals, gates, clocks, I/Os, and routing options are available to handle a wide range of designs at a non-trivial performance point.
From the perspective of software support, design bring-up and compilation on an emulator is an automatic process with minimal manual intervention, although in the early days this was a challenging proposition, as noted above. On an FPGA prototype, design bring-up and compilation are manual processes used to fine-tune the DUT mapping, trading automation for higher execution speed.
When the DUT is coded in RTL, the compiler must synthesize it into a gate-level netlist whether the target is an emulator or an FPGA prototyping board. The difference is that in an emulation system, synthesis optimizations are sacrificed in the interest of producing a netlist as quickly as possible, whereas in an FPGA prototype they are used to reduce the size of the FPGA binary stream.
From the user perspective, different performance and flexibility tradeoffs differentiate the two verification tools. A prototyping board targets higher performance with less emphasis on debugging and bring-up time. An emulation system excels at debugging hardware and integrating hardware and embedded software.
Table 2 summarizes the main differences between early FPGA prototyping boards and early emulation systems.
Table 2: Differences between early emulators and FPGA prototypes highlight why emulators were easier to deploy and use. (Source: Authors)

| Criteria | Emulator | FPGA Prototype |
| --- | --- | --- |
| Design Capacity | >1M gates | <<1M gates |
| Design Scalability | Scalable | Not scalable |
| Design Setup | Weeks | Months |
| Design Debug | Good | Very poor |
| Execution Speed | 1-5 MHz | >10 MHz |
| ICE Support | Yes | Yes |
| Price | >$1M | $50K – $200K |
Hardware Emulation versus FPGA Prototyping Applications
Roughly speaking, until the end of the 1990s, simulation was used on all designs for hardware verification. Emulation was limited to hardware verification of processor and graphics designs. FPGA prototyping became popular for system-level validation and early embedded software validation. Table 3 captures the rough percentages of the three applications.
Table 3: Verification tool deployment in percentages at the end of the 1990s shows the popularity of simulation. (Source: Authors)

| Verification Technology | Year 2000 |
| --- | --- |
| Simulation | 100% |
| Emulation | <<1% |
| In-house FPGA Prototyping | <2% |
| Commercial FPGA Prototyping | <<10% |
Conclusion
In the early days, hardware emulation was used for only the most sophisticated and challenging designs of its time.
In Part 2 of this series, which will be live on EEWeb next Tuesday, we will look at how hardware emulation moved into the new millennium with a new outlook on chip design verification.
About Lauro Rizzatti (Verification Consultant, Rizzatti LLC)
Dr. Lauro Rizzatti is a verification consultant and industry expert on hardware emulation. Previously, Dr. Rizzatti held positions in management, product marketing, technical marketing and engineering. (www.rizzatti.com)
About Jean-Marie Brunet (Director of Marketing, Mentor Emulation Division, Mentor, a Siemens Business)
Jean-Marie Brunet is the senior marketing director for the Emulation Division at Mentor, a Siemens business. He has served for over 20 years in application engineering, marketing and management roles in the EDA industry, and has held IC design and design management positions at STMicroelectronics, Cadence, and Micron among others. Brunet holds a Master of Science degree in Electrical Engineering from I.S.E.N Electronic Engineering School in Lille, France. (www.mentor.com)