Hongtao Du

Synthesis Structures for Blind Source Separation. Hongtao Du, AICIP Research, ECE Department, University of Tennessee. Feb 23, 2005


Page 1: Hongtao Du

1

Synthesis Structures for Blind Source Separation

Hongtao Du

AICIP Research

ECE Department, University of Tennessee

Feb 23, 2005

Page 2: Hongtao Du

2

Background

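• Blind Source Separation (BSS)
– Motivation: “cocktail party problem”

• BSS Model:

$X = AS$ (Mixing), $S = WX$ (Unmixing)

$$\begin{bmatrix} s_1 \\ \vdots \\ s_m \end{bmatrix} = \begin{bmatrix} w_{11} & \cdots & w_{1n} \\ \vdots & \ddots & \vdots \\ w_{m1} & \cdots & w_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}$$

– $X$: the observed signal (pixel)
– $S$: the source signal (pure pixel or noise)
– $W$: weight matrix or unmixing matrix, $W = A^{-1}$

• BSS Algorithms
– ICA
– LCNN

• Pixel level processing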
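As a minimal sketch of the mixing and unmixing model on this slide (X = AS, S = WX, W = A^{-1}), the following pure-Python example mixes two sources and recovers them. The 2x2 mixing matrix, the sample values, and the helper names `matmul` and `inverse_2x2` are assumptions made for illustration. In a real BSS problem A is unknown, and W would be estimated by an algorithm such as ICA rather than computed by inverting A.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse_2x2(a):
    """Invert a 2x2 matrix; raises if it is singular."""
    (p, q), (r, s) = a
    det = p * s - q * r
    if det == 0:
        raise ValueError("mixing matrix is singular")
    return [[s / det, -q / det], [-r / det, p / det]]

# Assumed values, chosen only to make the arithmetic easy to follow.
A = [[0.5, 0.5], [0.25, 0.75]]               # mixing matrix
S = [[10.0, 20.0, 30.0], [4.0, 4.0, 8.0]]    # one source per row, one sample per column

X = matmul(A, S)        # mixing:   X = AS (what the sensor observes)
W = inverse_2x2(A)      # unmixing matrix: W = A^{-1}
S_hat = matmul(W, X)    # unmixing: S = WX

print(S_hat)
```

Because W is the exact inverse of A here, `S_hat` reproduces `S`; with an estimated W the recovery would only be approximate, typically up to scaling and ordering of the sources.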

Page 3: Hongtao Du

3

Synthesis Structures

• Serial Processing
– Processing pixel-by-pixel in a serial sequence

• Parallel Processing
– Using SIMD structure
– Multiple pixels in, multiple pixels out
– Depending on hardware constraints

• Segment Processing
– Pipeline structure
– Parallel processing
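The three structures listed above can be mimicked in software to show how the data flow differs. This is only a behavioral sketch, not the hardware design from the talk: the function names, the `lanes` parameter, and the register model of the pipeline are all assumptions.

```python
def process_serial(pixels, f):
    """Serial: one pixel per step, in sequence."""
    return [f(p) for p in pixels]

def process_parallel(pixels, f, lanes=4):
    """SIMD-style: `lanes` pixels enter and leave together on each step."""
    out = []
    for start in range(0, len(pixels), lanes):
        group = pixels[start:start + lanes]
        out.extend(f(p) for p in group)   # same operation applied to every lane
    return out

def process_pipeline(pixels, stages):
    """Pipeline: each stage holds one pixel; all stages advance on every tick."""
    regs = [None] * len(stages)                  # value held after each stage
    out = []
    feed = list(pixels) + [None] * len(stages)   # extra ticks to drain the pipeline
    for p in feed:
        if regs[-1] is not None:
            out.append(regs[-1])                 # finished pixel leaves the pipeline
        # Update from the last stage backwards so each value moves one stage per tick.
        for i in range(len(stages) - 1, 0, -1):
            regs[i] = None if regs[i - 1] is None else stages[i](regs[i - 1])
        regs[0] = None if p is None else stages[0](p)
    return out

pixels = [0, 10, 20, 30, 40]
whole = lambda p: 2 * p + 1                      # hypothetical per-pixel operation
split = [lambda p: 2 * p, lambda p: p + 1]       # the same operation as two stages

print(process_serial(pixels, whole))
print(process_parallel(pixels, whole, lanes=2))
print(process_pipeline(pixels, split))
```

All three calls produce the same output list; what changes is how many pixels are in flight at once, which is what drives the I/O and timing trade-offs in hardware.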

Page 4: Hongtao Du

4

Contrast Stretching

r, s: grey levels of the input pixel and the output pixel

$s = T(r) = m r + b$
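A minimal sketch of the linear contrast-stretching transform on this slide, s = m·r + b, applied to one pixel. The function name, the rounding, and the saturation to the unsigned range of the chosen bit width are assumptions for illustration; the slide does not say how out-of-range results are handled.

```python
def contrast_stretch(r, m, b, bits=8):
    """Apply s = m*r + b and saturate to the unsigned range of `bits` (assumed behavior)."""
    max_level = (1 << bits) - 1        # 255 for 8-bit pixels
    s = int(round(m * r + b))
    return max(0, min(max_level, s))

# Hypothetical slope and offset: stretch the input range [50, 177] over [0, 254].
m, b = 2, -100
print([contrast_stretch(r, m, b) for r in (50, 100, 200)])   # [0, 100, 255]
```

The `bits` parameter mirrors the 8-bit and 32-bit variants compared later in the slides.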

Page 5: Hongtao Du

5

Component Contrast

Page 6: Hongtao Du

6

Component Contrast - RTL

Page 7: Hongtao Du

7

Component Contrast - Schematic

Page 8: Hongtao Du

8

Top-level - Schematic

Page 9: Hongtao Du

9

Pre-layout Simulation

Page 10: Hongtao Du

10

Pre-layout Simulation – Small Signal

Page 11: Hongtao Du

11

Pre-layout Simulation – Reset

Page 12: Hongtao Du

12

Pre-layout Simulation – Write Enable

Page 13: Hongtao Du

13

Contrast Stretching (32-bit) – FPGA layout

Page 14: Hongtao Du

14

Contrast Stretching (8-bit) – FPGA layout

Page 15: Hongtao Du

15

Comparison 32-bit vs. 8-bit

• Device utilization summary:
– 32-bit
  – Number of External IOBs: 132 out of 158 (83%)
  – Number of Occupied SLICEs: 605 out of 12288 (4%)
– 8-bit
  – Number of External IOBs: 36 out of 158 (22%)
  – Number of Occupied SLICEs: 53 out of 12288 (1%)

• Clock Report

Constraint (both designs): TS_Clk = PERIOD TIMEGRP "Clk" 100 nS HIGH 50 nS

Design    Requested     Actual       Frequency     Estimated Delay
32-bit    100.000 ns    14.464 ns    69.14 MHz     16.63 ns
8-bit     100.000 ns    7.450 ns     134.23 MHz    11.33 ns
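The Frequency column in the clock report is the reciprocal of the Actual period. A short sketch of that conversion, using the two periods reported on this slide (the function name is an assumption):

```python
def clock_frequency_mhz(period_ns):
    """Convert a clock period in nanoseconds to a frequency in MHz (f = 1/T)."""
    return 1000.0 / period_ns

freq_32bit = clock_frequency_mhz(14.464)   # actual period of the 32-bit design
freq_8bit = clock_frequency_mhz(7.450)     # actual period of the 8-bit design
print(f"32-bit: {freq_32bit:.2f} MHz, 8-bit: {freq_8bit:.2f} MHz")
# 32-bit: 69.14 MHz, 8-bit: 134.23 MHz
```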

Page 16: Hongtao Du

16

Parallel Contrast - Schematic

Page 17: Hongtao Du

17

Pre-layout Simulation

Page 18: Hongtao Du

18

Parallel Contrast Stretching – FPGA layout

Page 19: Hongtao Du

19

Constraint

• Device utilization summary:
– 32-bit
  – Number of External IOBs: 580 out of 158 (367%)
  – Number of Occupied SLICEs: 4838 out of 12288 (39%)
  – Too many required IOBs, exceeding the target FPGA capacity
– 8-bit
  – Number of External IOBs: 148 out of 158 (93%)
  – Number of Occupied SLICEs: 422 out of 12288 (3%)

• Clock Report

Constraint (both designs): TS_Clk = PERIOD TIMEGRP "Clk" 100 nS HIGH 50 nS

Design    Requested     Actual      Frequency     Estimated Delay
32-bit    100.000 ns    /           /             16.63 ns
8-bit     100.000 ns    6.748 ns    148.19 MHz    11.63 ns
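The I/O constraint on this slide comes down to a simple budget check: the design fits only if the required IOBs do not exceed the 158 available. A small sketch using the IOB counts reported above; the function name is an assumption, and truncating to a whole percent is assumed because it reproduces the percentages on this slide.

```python
def utilization(used, available):
    """Return (whole-percent utilization, fits-on-device flag)."""
    percent = 100 * used // available   # truncated to a whole percent
    return percent, used <= available

IOBS_AVAILABLE = 158                    # external IOBs on the target device

iob_32bit = utilization(580, IOBS_AVAILABLE)   # 32-bit parallel design
iob_8bit = utilization(148, IOBS_AVAILABLE)    # 8-bit parallel design
print(iob_32bit, iob_8bit)   # (367, False) (93, True)
```

The 32-bit parallel design fails the check on IOBs alone, even though it would occupy only 39% of the SLICEs.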

Page 20: Hongtao Du

20

Pipeline Contrast - Schematic

Page 21: Hongtao Du

21

Top-level - Schematic

Page 22: Hongtao Du

22

Pre-layout Simulation

Page 23: Hongtao Du

23

Pre-layout Simulation - threshold

Page 24: Hongtao Du

24

Pipeline Contrast Stretching – FPGA layout

Page 25: Hongtao Du

25

Synthesis Performance (8-bit)

Device: Xilinx V1000EHQ-6

• Device utilization summary:
– Number of External IOBs: 156 out of 158 (98%)
– Number of Occupied SLICEs: 586 out of 12288 (4%)
– Total equivalent gate count for design: 13,474

• Clock Report

Constraint: TS_Clk = PERIOD TIMEGRP "Clk" 100 nS HIGH 50 nS

Requested     Actual       Frequency    Estimated Delay
100.000 ns    20.944 ns    47.75 MHz    11.63 ns

Page 26: Hongtao Du

26

Serial vs. Parallel

Structure    Requested     Actual       Frequency     Estimated Delay
Serial       100.000 ns    7.450 ns     134.23 MHz    11.33 ns
Parallel     100.000 ns    6.748 ns     148.19 MHz    11.63 ns
Pipeline     100.000 ns    20.944 ns    47.75 MHz     11.63 ns

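• Serial processing was expected to have the minimum delay, but its actual period (7.450 ns) is longer than that of the parallel structure (6.748 ns).

• Parallel processing is the fastest structure (148.19 MHz).

• Pipeline is the most efficient structure, but it is the slowest (47.75 MHz).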

Page 27: Hongtao Du

27

Serial, Parallel, Pipeline