Presenter : Ching-Hua Huang 2013/11/4 Temporal Parallel Simulation: A Fast Gate-level HDL Simulation...


Page 1

Presenter : Ching-Hua Huang

2013/11/4

Temporal Parallel Simulation: A Fast Gate-level HDL Simulation Using Higher Level Models

Cited count : 3

Dusung Kim ; Ciesielski, M. ; Dept. of Electr. & Comput. Eng., Univ. of Massachusetts, Amherst, MA, USA

Kyuho Shim ; Seiyang Yang ; Dept. of Comput. Eng., Pusan National Univ., Busan, Korea

Design, Automation & Test in Europe Conference & Exhibition (DATE), 2011

National Sun Yat-sen University

Embedded System Laboratory

Page 2

Abstract

Simulation speedup offered by distributed parallel event-driven simulation is known to be seriously limited by the synchronization and communication overhead. These limiting factors are particularly severe in gate-level timing simulation. This paper describes a radically different approach to gate-level simulation based on a concept of temporal rather than conventional spatial parallelism. The proposed method partitions the entire simulation run into simulation slices in temporal domain and each slice is simulated separately. With each slice being independent from each other, an almost linear speedup is achievable with a large number of simulation nodes.

Page 3

Abstract (Cont.)

This concept naturally enables “correct by simulation” methodology that explicitly maintains the consistency between the reference and the target specifications. Experimental results clearly show a significant simulation speed-up.

Page 4


What’s the problem?

Hardware simulation performance becomes prohibitively low for complex designs.

The speedup of distributed parallel simulation is limited by the synchronization and communication overhead.

Proposed method to solve the above problem

A radically different approach to gate-level simulation, based on a concept of temporal parallelism.

Page 5

Related Work

[7] SimCluster

[This paper] Temporal Parallel Simulation: A Fast Gate-level HDL Simulation Using Higher Level Models

[2] TPSim – GL timing simulation

The basic idea of this approach and preliminary results for special cases were introduced.

[6] Parallel Discrete Event Simulation (PDES)

[9] Principles of conservative parallel simulation

Lock-step based synchronization

Partitions the design into separate modules and performs concurrent simulation

Rollback-based synchronization

[12] Performance improvement

[13] Speed up

Developed the first Verilog distributed simulator

A large gate-level decoder design improvement

Page 6


Proposed method – TPSim

TPSim (Temporal Parallel Simulation)

(1) Partitions the entire simulation into slices in the temporal domain.

(2) Each slice is simulated separately.

It consists of two major steps:

。Fast reference simulation: performed on a high-level abstraction of the design, to store essential state information.

。Detailed, fine-grained target simulation: performed on a lower-level (gate-level) model, applied in parallel to each simulation slice.

(1) State checking (2) State matching
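The two-step flow above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the "design" is a toy 8-bit accumulator, `reference_step` stands in for the fast high-level model, `target_step` stands in for the detailed gate-level model, and a thread pool stands in for the separate simulation nodes.

```python
from concurrent.futures import ThreadPoolExecutor

def reference_step(state, stimulus):
    """Hypothetical fast high-level model: an 8-bit accumulator."""
    return (state + stimulus) & 0xFF

def target_step(state, stimulus):
    """Hypothetical detailed model: the same adder, evaluated bit by bit."""
    result, carry = 0, 0
    for bit in range(8):
        a = (state >> bit) & 1
        b = (stimulus >> bit) & 1
        result |= (a ^ b ^ carry) << bit
        carry = (a & b) | (carry & (a ^ b))
    return result

def reference_simulation(stimuli, slice_len, initial_state=0):
    """Step 1: run the fast model once, saving the state at each slice start."""
    checkpoints, state = [], initial_state
    for cycle, stimulus in enumerate(stimuli):
        if cycle % slice_len == 0:
            checkpoints.append(state)
        state = reference_step(state, stimulus)
    return checkpoints

def simulate_slice(start_state, slice_stimuli):
    """Step 2: run the detailed model on one slice from a restored state."""
    trace, state = [], start_state
    for stimulus in slice_stimuli:
        state = target_step(state, stimulus)
        trace.append(state)
    return trace

def tpsim(stimuli, slice_len):
    """Partition the run in time and simulate every slice independently."""
    checkpoints = reference_simulation(stimuli, slice_len)
    slices = [stimuli[i:i + slice_len] for i in range(0, len(stimuli), slice_len)]
    with ThreadPoolExecutor() as pool:
        traces = list(pool.map(simulate_slice, checkpoints, slices))
    return [value for trace in traces for value in trace]
```

Because each slice depends only on its saved start state, the concatenated slice traces should match a single monolithic run of the detailed model. The thread pool only demonstrates that independence; a real flow would presumably launch separate simulator processes on different nodes.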

Page 7


Difficulties in Generalization of Temporal Parallelism (1) Multiple Asynchronous Clocks

GL timing simulation of a multiple-clock design may not be 100% cycle-by-cycle consistent with the RTL simulation.

Proposed solution : Abstract delay annotation method

Slices are allowed to overlap by a value equal to the longest delay in the design.

[Figure: signals DataA[N-1:0], ReqB, ClkB]
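One way to picture the slice overlap is the sketch below. It is only an illustration under stated assumptions: the function name `slice_windows`, the integer time units, and the choice to realize the overlap by starting each slice early (rather than ending it late) are all hypothetical and not confirmed by the slides.

```python
def slice_windows(total_time, num_slices, longest_delay):
    """Return a (start, end) time window for each slice.

    Assumption: every slice except the first starts `longest_delay`
    time units before its nominal start, so adjacent slices overlap.
    """
    base = total_time // num_slices
    windows = []
    for i in range(num_slices):
        nominal_start = i * base
        start = max(0, nominal_start - longest_delay)
        # The last slice absorbs any remainder of the integer division.
        end = total_time if i == num_slices - 1 else nominal_start + base
        windows.append((start, end))
    return windows
```

For example, a 1000-unit run split into 4 slices with a longest delay of 30 units gives nominal boundaries at 250, 500, and 750, with the later slices starting at 220, 470, and 720.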

Page 8


Difficulties in Generalization of Temporal Parallelism (2) State Checkpointing in Event-driven Simulation

Finding the correct placement for checkpoints is more difficult because of the arbitrary delay between event edges.

Proposed solution : Checkpoint window

The size of the checkpoint window is one clock-cycle equivalent.

The correct value for Q can be reliably obtained at the end of each window.

The overlap period must be increased accordingly so that it contains the entire target checkpoint window.
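The checkpoint window idea can be sketched as follows. This is a hedged illustration, not the paper's implementation: `checkpoint_values` and its parameters are hypothetical names, the events are assumed to be a time-sorted list of value changes on a single register output Q, and a change landing exactly on a window end is assumed to belong to the next window.

```python
from bisect import bisect_left

def checkpoint_values(events, clock_period, num_windows, initial=None):
    """Sample Q at the end of each checkpoint window.

    events: time-sorted list of (time, value) changes on Q, with
            arbitrary delays relative to the clock edges.
    Each window is one clock period long; the sampled value is the
    last change strictly before the window end.
    """
    times = [t for t, _ in events]
    values = []
    for w in range(1, num_windows + 1):
        window_end = w * clock_period
        idx = bisect_left(times, window_end)  # number of changes before window_end
        values.append(events[idx - 1][1] if idx else initial)
    return values
```

With changes at times 3, 12, and 27 and a clock period of 10, each window contains exactly one change, so the value read at each window end is stable no matter where inside the window the change occurred.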

Page 9


Difficulties in Generalization of Temporal Parallelism (3) State Matching

Maintain functional correctness of the restored target state.

During synthesis the design undergoes a number of logic transformations:

。Combinational and sequential logic optimization, retiming, and algebraic transformations

Proposed solution : A promising preliminary work in state matching has recently been published in [17].

Handling testbench

The testbench is a sequential process.

。It has no hardware “states”, so it cannot be restarted at an arbitrary point in time.

Proposed solution : Testbench forwarding

Saved continuously during the reference simulation.

Page 10

Before the experiment…

How much can TPSim improve performance?

。Slices

。Multiple clock issue?

Tool selection

。Synthesis : Design Compiler

。Cell library : 65nm technology library

。Simulator : NC-Sim 8.2

Page 11

Experiment 1 – JPEG Encoder

This design is from OpenCores.

Total gate count of the GL design is 0.9M.

Page 12

Experiment 2 – AES (Advanced Encryption Standard)

This design is from OpenCores.

Total gate count of the GL design is 25K.

Page 13

Conclusions and My comments

Conclusions

The simulation speedup is accomplished by performing temporal partitioning of the simulation period.

This paper provides not only a significant performance improvement but also a smarter method for simulation-based verification.

My comments

I am facing a problem with the performance gap between RTL and GL timing simulation, and this paper gives me another reference in this area.