TKT-2431 SoC design

Introduction to exercises

SoC design / Fall 2011


Exercises

Assistants:

Antti Alhonen [email protected]

Jussi Raasakka [email protected]

(Otto Esko [email protected])

In the project work, a simplified H.263 video encoder is implemented on the Altera DE2 FPGA Development and Education board

The project work consists of a set of exercises

After successfully finishing all of the exercises, one should have a working H.263 video encoder

Exercises: Mon 14-16, Tue 14-16, Wed 16-18 (TC417)

Assistance is not available at any other time

All needed software is installed on the PCs in the classroom and can be used whenever the room is not reserved for other courses


Exercises cont.

Attending the exercise hours is voluntary

The next assignment is introduced

Tools and algorithms are introduced

Hints are given

Questions are answered

Completing each of the exercises is mandatory

The returns have to be on time

The returns have to be accepted

Exercise work is carried out in groups of 1-2 students

Groups of 2 persons are preferred


Exercises cont.

The exercise work consists of several phases and sub-tasks

Receiving and understanding the system requirements

Writing a system specification

Software implementation of the encoder

Functional verification on a PC workstation

Migrating the SW implementation onto the FPGA

Verification and performance profiling for the pure SW implementation

HW/SW partitioning and hardware acceleration

Verification and performance profiling for the accelerated implementation

Documentation


Exercises cont.

Completed exercise work is valid for three successive exams

Points from the exercise work

You can gain points from some of the exercises

See the exercise pages for more details

Bonus point criteria will be explained during the first exercise session

http://www.tkt.cs.tut.fi/kurssit/2431


Exercise 1 / Part 1

Introduction to topic


Topic of the work

A simplified H.263 video encoder on the Altera DE2 FPGA Development and Education board

The system design flow

The requirements for the video encoder are introduced

A functional specification is written

A software implementation of the video encoder algorithm is written in ANSI C and verified on a PC workstation

An initial hardware architecture containing a single Nios II soft-core CPU and the necessary peripherals is synthesized for the FPGA

The software version is migrated to the Nios II processor on the FPGA

The design is partitioned into software and hardware according to the profiling results of the software implementation

The DCT algorithm is accelerated with dedicated logic

The accelerated system is implemented and verified on the FPGA

Performance analysis is carried out for the accelerated system as well

and compared with the pure software implementation
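The partitioning step depends on knowing where the software spends its time. A minimal sketch of a timing harness for the PC version follows, assuming the standard C `clock()` function is an acceptable time source. The names `stage_fn`, `dummy_stage` and `seconds_per_call` are hypothetical and are not part of the course material; on the Nios II a hardware timer or performance counter would probably replace `clock()`.

```c
#include <assert.h>
#include <time.h>

/* Signature of a stage to be timed (hypothetical; the real encoder
 * functions will take their own parameters). */
typedef void (*stage_fn)(void);

/* Placeholder workload standing in for an encoder stage such as the DCT. */
volatile unsigned long dummy_sink = 0;

void dummy_stage(void)
{
    unsigned long i;

    for (i = 0; i < 100000UL; i++)
        dummy_sink += i;
}

/* Average CPU time per call in seconds, measured with the standard clock(). */
double seconds_per_call(stage_fn fn, int repeats)
{
    clock_t start, end;
    int i;

    assert(fn != 0 && repeats > 0);
    start = clock();
    for (i = 0; i < repeats; i++)
        fn();
    end = clock();
    return ((double)(end - start) / (double)CLOCKS_PER_SEC) / (double)repeats;
}
```

Timing each stage the same way, before and after acceleration, gives comparable figures for the performance analysis.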


H.263

The basics of H.263 video encoding are explained during the following exercises

Students are encouraged to get familiar with video encoding algorithms in general before they start the project

H.263 has a lot in common with algorithms like JPEG and MPEG-2

A very simplified version of the H.263 video encoder (resembling Motion JPEG) is used.

Only INTRA coding is used (i.e. no prediction between frames is applied)

The algorithms used are DCT (Discrete Cosine Transform), quantization, RLE (Run-Length Encoding), and VLC (Variable-Length Coding)


Software

Altera Quartus II v7.2

System development front-end

Schematic editing

FPGA synthesis

SOPC builder for building Avalon/Nios based systems

Integrated logic analyzer

Nios II IDE

Software development environment for Nios II processor

Part of Nios II development kit

Mentor Graphics ModelSim

Simulating your own VHDL blocks/designs

ffplay

video player

tmndec

H.263 decoder

nios2-terminal

Terminal software for reading from the JTAG UART


Hardware

Altera DE2 Development and Education Board

Cyclone II 2C35 FPGA

33,216 logic elements

483,840 bits of embedded RAM

35 Embedded multipliers

4 PLLs

475 User I/O pins (at maximum)

External memory devices

4 MB Flash

512 KB SRAM

8 MB SDRAM

RS-232 serial port

Used for communication between PC and Nios II processor

USB blaster port

Used for programming the FPGA (memory contents and HW configuration)

In addition, the board contains the following peripherals (less relevant for the project)

Ethernet MAC/PHY device

4x user push-buttons, 18x toggle switches

18x red user LEDs, 9x green user LEDs

8x dual 7-segment display

2x expansion headers (40 user I/O pins / header)

SD flash connector header

50 MHz and 27 MHz Oscillators


Exercise returns

Exercises are returned as follows:

The return for an exercise has to be made by e-mail before 23:59 on the Sunday of the following week

Return your exercises to [email protected]

All the required documents have to be in either PDF or plain-text format

The subject for the email has the following form:

SOCD_Ex<exercise_number>_G<group_number> where

<exercise_number> is the number of the exercise in question and

<group_number> is the number of your group.


Bonus points

Three main exercise returns are rated

Excellent: 1 bonus point for the exam

The returned document is very good and/or the returned source code works correctly and is well written

Accepted: no bonus

The returned document or code is acceptable

Rejected: no bonus, the return has to be corrected

Use common sense: Do not return rubbish!

All the exercises have to be accepted

Exercise points for the exam can be obtained:

1 point can be obtained from each of the exercises 2, 5, 12

The encoder achieves the given frame rate criteria (2 p if > 75 fps, 1 p if > 50 fps)

The 2 most optimized encoders are awarded extra points

Bonus exercise: Dual Nios II encoder implementation

Up to 3 bonus points can be achieved


Exercise 1, Part 2

Introduction to algorithms


Requirements for Video Transmission

Communication delay

More important in video conferencing applications than in file-based

streaming applications

Should be as low as possible (< 250 ms, even 150 ms)

Should be kept as constant as possible

Avoids a burst of frames followed by a still image

Buffering

Frame rate

Affects the perceived smoothness of motion

Below 10 fps, a video stream is perceived as a “fast slide show”

Image resolution

The data size of a raw image is directly proportional to the resolution

Depends on the application
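To make the link between resolution, frame rate and raw data rate concrete, here is a small sketch. It assumes 8-bit YUV 4:2:0 sampling (one luma byte per pixel plus two chroma planes at a quarter of the resolution), which this slide does not state. The function names are hypothetical.

```c
#include <assert.h>

/* Bytes in one raw frame, assuming 8-bit YUV 4:2:0 sampling:
 * one luma byte per pixel plus two quarter-resolution chroma planes. */
unsigned long raw_frame_bytes(unsigned long width, unsigned long height)
{
    unsigned long luma = width * height;
    unsigned long chroma = 2UL * ((width / 2UL) * (height / 2UL));

    assert(width % 2UL == 0 && height % 2UL == 0); /* 4:2:0 needs even sizes */
    return luma + chroma;
}

/* Raw (uncompressed) bit rate in bits per second. */
unsigned long raw_bit_rate(unsigned long width, unsigned long height,
                           unsigned long fps)
{
    return raw_frame_bytes(width, height) * 8UL * fps;
}
```

For a 176 x 144 frame this gives 38016 bytes per frame, so even a 10 fps stream is about 3 Mbit/s before compression.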


Introduction to H.263 Standard

May 1996, ITU-T recommendation v1

Block-based (macroblock size is 16 pixels by 16 lines)

Motion estimation for temporal redundancy reduction

Same objects are likely to be present in adjacent frames

Half pixel accurate motion vectors

DCT for spatial redundancy reduction

8 x 8 blocks

Adjacent pixel values differ only slightly

Quantization (lossy)

Control of compression ratio

RLE and Huffman as entropy coding algorithms
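The quantization step in the list above is where information is lost and where the compression ratio is controlled. The following is a simplified sketch of a uniform quantizer with step size 2*QP, loosely modelled on the H.263 quantizer parameter QP (range 1 to 31). The exact rounding rules of the standard and of the course's C sources may differ, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Map a DCT coefficient to a quantized level using step size 2*qp.
 * The magnitude is truncated toward zero (simplifying assumption). */
int quantize(int coef, int qp)
{
    int magnitude;

    assert(qp >= 1 && qp <= 31);
    magnitude = abs(coef) / (2 * qp);
    return (coef < 0) ? -magnitude : magnitude;
}

/* Reconstruct an approximate coefficient at the middle of the
 * quantization interval; a zero level stays zero. */
int dequantize(int level, int qp)
{
    int magnitude;

    assert(qp >= 1 && qp <= 31);
    if (level == 0)
        return 0;
    magnitude = qp * (2 * abs(level) + 1);
    return (level < 0) ? -magnitude : magnitude;
}
```

A larger QP maps more coefficients to zero, which shortens the entropy-coded output at the cost of a larger reconstruction error.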


Block Diagram of H.263 Encoder

[Figure: block diagram of the H.263 encoder]

Forward path: pre-processing -> DCT -> Q -> Entropy coding -> bits out (Huffman, VLC)

Reconstruction loop: Q-1 -> IDCT -> adder -> Previous reconstructed pictures (same image as the decoder observes)

Prediction: Mot. Est. and Mot. Comp., 1/2 pixel accurate (interpolation), motion vector v(u,v)

Prediction error computation (subtraction at the encoder input)

In Intra mode, MBs are coded directly

Example of a quantized 8x8 block (no need to send the zeros in the 8x8 block to the decoder):

7 0 4 0 0 0 0 1
1 9 3 0 0 0 0 0
2 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0


Discrete Cosine Transform (DCT)

Assumption: Adjacent pixels differ only a little from each other

Thus, data in the frequency domain is easier to compress

Spatial domain compression

Pixels are grouped into blocks and the blocks are then transformed into the frequency domain

Essential information is then in a more compact form

The important DCT coefficients are in the upper-left corner, that is, at the low frequencies

Compression is achieved by discarding the less important information of the transformed block

Quantization of coefficients

DCT itself is a lossless transform

Limited accuracy with coefficients, however, leads to some loss of

information
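The transform described above can be sketched as a direct (unoptimized) 8x8 two-dimensional DCT-II. This is an illustrative reference version and not the course's provided implementation. The orthonormal scaling, the row-major flat arrays and the names `cos_pi16` and `dct8x8` are assumptions. The cosines come from a small lookup table, so no math library is needed.

```c
#include <assert.h>

/* cos(k*pi/16) for k = 0..8; all other angles follow by symmetry. */
static const double COS_TABLE[9] = {
    1.0, 0.98078528040323, 0.92387953251129, 0.83146961230255,
    0.70710678118655, 0.55557023301960, 0.38268343236509,
    0.19509032201613, 0.0
};

/* cos(k*pi/16) for any non-negative integer k. */
static double cos_pi16(int k)
{
    k %= 32;                /* period of 2*pi */
    if (k > 16)
        k = 32 - k;         /* cos(2*pi - t) = cos(t) */
    return (k > 8) ? -COS_TABLE[16 - k] : COS_TABLE[k]; /* cos(pi - t) = -cos(t) */
}

/* Direct 8x8 DCT-II on row-major 64-element arrays.
 * out[v*8 + u] is the coefficient for horizontal frequency u and
 * vertical frequency v, so out[0] is the DC term (upper-left corner). */
void dct8x8(const double *in, double *out)
{
    int u, v, x, y;

    assert(in != 0 && out != 0 && in != out);
    for (v = 0; v < 8; v++) {
        for (u = 0; u < 8; u++) {
            double cu = (u == 0) ? COS_TABLE[4] : 1.0; /* 1/sqrt(2) for DC */
            double cv = (v == 0) ? COS_TABLE[4] : 1.0;
            double sum = 0.0;

            for (y = 0; y < 8; y++)
                for (x = 0; x < 8; x++)
                    sum += in[y * 8 + x]
                         * cos_pi16((2 * x + 1) * u)
                         * cos_pi16((2 * y + 1) * v);
            out[v * 8 + u] = 0.25 * cu * cv * sum;
        }
    }
}
```

For a flat block, where every pixel has the same value, all the energy ends up in the DC coefficient and every other coefficient is zero. This is the extreme case of adjacent pixels differing only a little.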


Entropy Encoding

After quantization, the quantized coefficients are compressed in a lossless manner using entropy encoding

Run-length coding

o Low-amplitude coefficients are likely to be zero after quantization

o Successive non-zero quantized coefficients are arranged into combinations of (LAST, RUN, LEVEL)

• LAST = whether this is the final non-zero coefficient in the block

• RUN = number of zeros preceding the coefficient

• LEVEL = sign and magnitude of the non-zero coefficient

o Coefficients are processed in zig-zag order

• Zero runs are most likely located at the higher frequencies, so the zig-zag scan groups them together

Huffman coding (variable length coding)

o After RLE, the symbols are encoded based on their statistical characteristics

• Shorter codewords for symbols which occur with high

probability
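The run-length step can be sketched as follows: scan the quantized block in zig-zag order and emit one (LAST, RUN, LEVEL) event per non-zero coefficient. This is a simplified illustration that run-length codes all 64 coefficients of a row-major block. The type `rle_event` and the function `rle_encode` are hypothetical names, and the provided course code may handle the DC coefficient separately.

```c
#include <assert.h>

/* Zig-zag scan order: entry i is the row-major index of the
 * i-th coefficient visited. */
static const int ZIGZAG[64] = {
     0,  1,  8, 16,  9,  2,  3, 10, 17, 24, 32, 25, 18, 11,  4,  5,
    12, 19, 26, 33, 40, 48, 41, 34, 27, 20, 13,  6,  7, 14, 21, 28,
    35, 42, 49, 56, 57, 50, 43, 36, 29, 22, 15, 23, 30, 37, 44, 51,
    58, 59, 52, 45, 38, 31, 39, 46, 53, 60, 61, 54, 47, 55, 62, 63
};

typedef struct {
    int last;   /* 1 if this is the final non-zero coefficient */
    int run;    /* number of zeros preceding the coefficient */
    int level;  /* signed value of the non-zero coefficient */
} rle_event;

/* Convert a quantized 8x8 block into (LAST, RUN, LEVEL) events.
 * Returns the number of events written to out (0 for an all-zero block). */
int rle_encode(const int *block, rle_event *out)
{
    int i, count = 0, run = 0;

    assert(block != 0 && out != 0);
    for (i = 0; i < 64; i++) {
        int level = block[ZIGZAG[i]];

        if (level == 0) {
            run++;              /* extend the current run of zeros */
            continue;
        }
        out[count].last = 0;
        out[count].run = run;
        out[count].level = level;
        count++;
        run = 0;
    }
    if (count > 0)
        out[count - 1].last = 1; /* trailing zeros are never sent */
    return count;
}
```

Because trailing zeros produce no events, a block whose energy sits in the upper-left corner is described by only a handful of events.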


H.263 – Project work

A simplified version of the H.263 video encoder (resembling Motion JPEG) is used.

Only INTRA coding is used (i.e. no prediction between frames is applied)

The algorithms used are DCT (Discrete Cosine Transform), quantization, RLE (Run-Length Encoding), and VLC (Variable-Length Coding).

Image resolution used is QCIF (176 x 144)
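As a quick check of how much work one QCIF frame represents, the sketch below counts 16x16 macroblocks and 8x8 blocks. It assumes 4:2:0 sampling, so each macroblock carries 4 luma blocks and 2 chroma blocks. The function names are hypothetical.

```c
#include <assert.h>

#define MB_SIZE 16  /* macroblock is 16 pixels by 16 lines */

/* Number of macroblocks in a frame whose dimensions are multiples of 16. */
int macroblocks_per_frame(int width, int height)
{
    assert(width > 0 && height > 0);
    assert(width % MB_SIZE == 0 && height % MB_SIZE == 0);
    return (width / MB_SIZE) * (height / MB_SIZE);
}

/* Number of 8x8 blocks per frame, assuming 4:2:0 sampling:
 * 4 luma blocks + 1 Cb block + 1 Cr block per macroblock. */
int blocks_per_frame(int width, int height)
{
    return macroblocks_per_frame(width, height) * 6;
}
```

For QCIF this gives 11 x 9 = 99 macroblocks and 594 blocks, so the DCT runs 594 times per frame under these assumptions.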

Encoder: pre-processing -> DCT -> Q -> Entropy coding -> bitstream (011001011)

Decoder: bitstream (011001011) -> Entropy decoding -> Q-1 -> IDCT -> Reconstructed pictures


Design flow

[Figure: design flow diagram]

Main flow: Requirements -> Specification -> SW Implementation -> HW/SW partitioning -> Final Implementation

Performance analysis appears three times in the diagram

Side labels: Verification, Documentation


Specification

This week, work on the specification of the encoder begins

The required C source code for the encoder is provided

It can be downloaded from the course web pages

You have to write a simple specification for the video encoder system you are going to implement

The specification does not have to be long

It is the quality of the contents that matters

4-7 pages in total (including the chapters introduced next week)

The specification should be written before the implementation

An implementation document will be written later

A diagram of the video encoding flow is required

A control and data flow diagram describing how the provided H.263 functions are used


Specification (2)

1. Introduction

What is being specified

2. Flow of encoding

Present different phases of the encoding

Explain the encoding flow briefly

A flow diagram of encoding is required!

3. Encoder interface

Inputs and outputs of encoder

What kind of data is read in?

What is the output data like?

4. Description of algorithms

Function prototypes

Description of function parameters and return values

Description of function behavior and purpose in this design

At least DCT, quantization, RLE, and VLC have to be covered here

The subsequent sections will be written in exercise 2.
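As an illustration of the level of detail an entry in section 4 might have, here is a sketch of one documented function for the VLC stage: a bit writer that appends variable-length codewords to the output stream. The type `bit_writer` and the function `put_bits` are hypothetical and are not taken from the provided course sources. The actual specification should describe the prototypes of the functions that are really provided.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    unsigned char *data;  /* output buffer, must be zero-initialized */
    size_t capacity;      /* size of the buffer in bytes */
    size_t bit_pos;       /* number of bits written so far */
} bit_writer;

/*
 * put_bits - append one variable-length codeword to the bitstream.
 *
 * Parameters:
 *   bw      bit writer state
 *   code    codeword, right-aligned (least significant bits are used)
 *   length  codeword length in bits, 1..32
 *
 * Return value:
 *   0 on success, -1 if the codeword does not fit in the buffer.
 *
 * Behavior:
 *   Bits are written most significant bit first, so short codewords
 *   for frequent symbols take up correspondingly little space.
 */
int put_bits(bit_writer *bw, unsigned long code, int length)
{
    int i;

    assert(bw != 0 && bw->data != 0 && length >= 1 && length <= 32);
    if (bw->bit_pos + (size_t)length > bw->capacity * 8)
        return -1;
    for (i = length - 1; i >= 0; i--) {
        if ((code >> i) & 1UL)
            bw->data[bw->bit_pos / 8] |=
                (unsigned char)(0x80u >> (bw->bit_pos % 8));
        bw->bit_pos++;
    }
    return 0;
}
```

The comment block covers the three items the slide asks for: parameters, return value, and behavior.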
