Week 6 - Friday

What did we talk about last time? Shading Lambertian Gouraud Phong

Anti-aliasing

One system for representing color is RGB With Red, Green, and Blue components,

you can combine them to make most visible colors

Combining colors is an additive process: With no colors, the background is black Adding colors never makes a darker color Pure Red added to pure Green added to pure

Blue makes White RGB is a good model for computer screens

If the R, G, B values happen to be the same, the color is a shade of gray 255, 255, 255 = White 128, 128, 128 = Gray 0, 0, 0 = Black

To convert a color to a shade of gray, use the following formula: Value = .3R + .59G + .11B

Based on the way the human eye perceives colors as light intensities

We can adjust the brightness of a picture by multiplying each pixel's R,G, and B value by a scalar b b ∈ [0,1) darkens b ∈ (1,∞) brightens

We can adjust the contrast of a picture by multiplying each pixel's R,G, and B value by a scalar c and then adding -128c + 128 to the value c ∈ [0,1) decreases contrast c ∈ (1,∞) increases contrast

After adjustments, values must be clamped to the range [0, 255] (or whatever the range is)

HSV Hue (which color) Saturation (how colorful the color is) Value (how bright the color is)

Hue is represented as an angle between 0° and 360°

Saturation and value are often given between 0 and 1

Saturation in HSV is not the same as in HSL

LoadContent() method Update() method Draw() method Texture2D objects Sprites Drawing primitives Vertex buffers Index buffers

What do we have? Virtual camera (viewpoint) 3D objects Light sources Shading Textures

What do we want? 2D image

For API design, practical top-down problem solving, and hardware design, and efficiency, rendering is described as a pipeline

This pipeline contains three conceptual stages:

Produces material

to be rendered

Application

Decides what,

how, and where to

render

Geometry

Renders the final image

Rasterizer

The application stage is the stage completely controlled by the programmer

As the application develops, many changes of implementation may be done to improve performance

The output of the application stage are rendering primitives Points Lines Triangles

Reading input Managing non-graphical output Texture animation Animation via transforms Collision detection Updating the state of the world in general

The Application Stage also handles a lot of acceleration Most of this acceleration is telling the renderer what NOT to

render Acceleration algorithms Hierarchical view frustum culling BSP trees Quadtrees Octrees

The output of the Application Stage is polygons The Geometry Stage processes these polygons using the

following pipeline:

Model and View

Transform

Vertex Shading Projection Clipping Screen

Mapping

Each 3D model has its own coordinate system called model space

When combining all the models in a scene together, the models must be converted from model space to world space

After that, we still have to account for the position of the camera

We transform the models into camera space or eye spacewith a view transform

Then, the camera will sit at (0,0,0), looking into negative z The z-axis comes out of the screen in the book's examples and

in MonoGame (but not in older DirectX)

Figuring out the effect of light on a material is called shading This involves computing a (sometimes complex) shading

equation at different points on an object Typically, information is computed on a per-vertex basis and

may include: Location Normals Colors

Projection transforms the view volume into a standardized unit cube

Vertices then have a 2D location and a z-value

There are two common forms of projection: Orthographic: Parallel lines stay

parallel, objects do not get smaller in the distance

Perspective: The farther away an object is, the smaller it appears

Clipping process the polygons based on their location relative to the view volume

A polygon completely inside the view volume is unchanged A polygon completely outside the view volume is ignored (not rendered) A polygon partially inside is clipped New vertices on the boundary of the volume are created

Since everything has been transformed into a unit cube, dedicated hardware can do the clipping in exactly the same way, every time

Screen-mapping transforms the x and y coordinates of each polygon from the unit cube to screen coordinates

A few oddities: DirectX has weird coordinate systems for pixels where the location is the center of the

pixel DirectX conforms to the Windows standard of pixel (0,0) being in the upper left of the

screen OpenGL conforms to the Cartesian system with pixel (0,0) in the lower left of the screen

Backface culling removes all polygons that are not facing toward the screen

A simple dot product is all that is needed This step is done in hardware in MonoGame and OpenGL You just have to turn it on Beware: If you screw up your normals, polygons could vanish

The goal of the Rasterizer Stage is to take all the transformed geometric data and set colors for all the pixels in the screen space

Doing so is called: Rasterization Scan Conversion

Note that the word pixel is actually short for "picture element"

As you should expect, the Rasterization Stage is also divided into a pipeline of several functional stages:

Triangle Setup

Triangle Traversal

Pixel Shading Merging

Setup Data for each triangle is computed This could include normals

Traversal Each pixel whose center is overlapped by a triangle must have a fragment

generated for the part of the triangle that overlaps the pixel The properties of this fragment are created by interpolating data from the

vertices These are done with fixed-operation (non-customizable) hardware

This is where the magic happens Given the data from the other stages, per-pixel shading

(coloring) happens here This stage is programmable, allowing for many different

shading effects to be applied Perhaps the most important effect is texturing or texture

mapping

Texturing is gluing a (usually) 2D image onto a polygon To do so, we map texture coordinates onto polygon coordinates Pixels in a texture are called texels This is fully supported in hardware Multiple textures can be applied in some cases

The final screen data containing the colors for each pixel is stored in the color buffer

The merging stage is responsible for merging the colors from each of the fragments from the pixel shading stage into a final color for a pixel

Deeply linked with merging is visibility: The final color of the pixel should be the one corresponding to a visible polygon (and not one behind it)

The Z-buffer is often used for this

To deal with the question of visibility, most modern systems use a Z-buffer or depth buffer

The Z-buffer keeps track of the z-values for each pixel on the screen

As a fragment is rendered, its color is put into the color buffer only if its z value is closer than the current value in the z-buffer (which is then updated)

This is called a depth test

Modern GPU's are generally responsible for the geometry and rasterization stages of the overall rendering pipeline

The following shows color-coded functional stages inside those stages Red is fully programmable Purple is configurable Blue is not programmable at all

Vertex Shader

Geometry Shader Clipping Screen

MappingTriangle

SetupTriangle Traversal

Pixel Shader Merger

Supported in hardware by all modern GPUs For each vertex, it modifies, creates, or ignores: Color Normal Texture coordinates Position

It must also transform vertices from model space to homogeneous clip space

Vertices cannot be created or destroyed, and results cannot be passed from vertex to vertex Massive parallelism is possible

Newest shader added to the family, and optional Comes right after the vertex shader Input is a single primitive Output is zero or more primitives The geometry shader can be used to: Tessellate simple meshes into more complex ones Make limited copies of primitives

Clipping and triangle set up is fixed in function Everything else in determining the final color of the fragment is

done here Because we aren't actually shading a full pixel, just a particular fragment

of a triangle that covers a pixel A lot of the work is based on the lighting model The pixel shader cannot look at neighboring pixels Except that some information about gradient can be given

Multiple render targets means that many different colors for a single fragment can be made and stored in different buffers

Fragment colors are combined into the frame buffer This is where stencil and Z-buffer operations happen It's not fully programmable, but there are a number of

settings that can be used Multiplication Addition Subtraction Min/max

We will be interested in a number of operations on vectors, including: Addition Scalar multiplication Dot product Norm Cross product

The dot product is a form of multiplication between two vectors that produces a scalar

Rvu ∈=⋅ ∑−

=

1

0

n

iiivu

A norm is a way of measuring the magnitude of a vector We are actually only interested in the L2 norm, but there are

many (infinite) norms for any given vector space We'll denote the norm of u as ||u||

𝐮𝐮 = 𝐮𝐮 � 𝐮𝐮 = �𝑖𝑖=0

𝑛𝑛−1

𝑢𝑢𝑖𝑖2

A vector can either be a point in space or an arrow (direction and distance)

The norm of a vector is its distance from the origin (or the length of the arrow)

In R2 and R3, the dot product is: 𝐮𝐮 � 𝐯𝐯 = 𝑢𝑢𝑥𝑥𝑣𝑣𝑥𝑥 + 𝑢𝑢𝑦𝑦𝑣𝑣𝑦𝑦 + 𝑢𝑢𝑧𝑧𝑣𝑣𝑧𝑧 or

𝐮𝐮 � 𝐯𝐯 = 𝐮𝐮 𝐯𝐯 cos𝜑𝜑where ϕ is the smallest angle between u and v

The cross product of two vectors finds a vector that is orthogonal to both

For 3D vectors u and v in an orthonormal basis, the cross product w is:

−−−

=×=

=

xyyx

zxxz

yzzy

z

y

x

vuvu

vuvu

vuvu

w

w

w

vuw

𝐰𝐰 = 𝐮𝐮 × 𝐯𝐯 = 𝐮𝐮 𝐯𝐯 sin θ 𝐮𝐮 × 𝐯𝐯 = −𝐯𝐯 × 𝐮𝐮 𝑎𝑎𝐮𝐮 + 𝑏𝑏𝐯𝐯 × 𝐰𝐰 = 𝑎𝑎 𝐮𝐮 × 𝐰𝐰 + 𝑏𝑏(𝐯𝐯 × 𝐰𝐰) In addition w⊥u and w⊥v u, v, and w form a right-handed system

We will be interested in a number of operations on matrices, including: Addition Scalar multiplication Transpose Trace Matrix-matrix multiplication Determinant Inverse

Multiplication MN is legal only if M is p x q and N is q x r Each row of M and each column of N are combined with a dot

product and put in the corresponding row and column element

𝐌𝐌𝐌𝐌 = 𝐓𝐓 = 𝑡𝑡𝑖𝑖𝑖𝑖 = �𝑘𝑘=0

𝑞𝑞−1𝑚𝑚𝑖𝑖𝑘𝑘𝑛𝑛𝑘𝑘𝑖𝑖

The determinant is a measure of the "magnitude" of a square matrix

We'll focus on determinants for 2 x 2 and 3 x 3 matrices

100111001110

0100)det( mmmmmm

mm−=== MM

===

222120

121110

020100

)det(

mmm

mmm

mmm

MM211200221001201102

211002201201221100

mmmmmmmmm

mmmmmmmmm

−−−++

For a square matrix M where |M| ≠ 0, there is a multiplicative inverse M-1 such that MM-1 = I

For cases up to 4 x 4, we can use the adjoint:

Properties of the inverse: (M-1)T = (MT)-1

(MN)-1 = N-1M-1

)adj(11 MM

M =−

A square matrix is orthogonal if and only if its transpose is its inverse MMT = MTM = I

Lots of special things are true about an orthogonal matrix M |M| = ± 1 M-1 = MT

MT is also orthogonal ||Mu|| = ||u|| Mu ⊥ Mv iff u ⊥ v If M and N are orthogonal, so is MN

An orthogonal matrix is equivalent to an orthonormal basis of vectors lined up together

We add an extra value to our vectors It's a 0 if it’s a direction It's a 1 if it's a point

Now we can do a rotation, scale, or shear with a matrix (with an extra row and column):

=

1000

0

0

0

222120

121110

020100

mmm

mmm

mmm

M

Then, we multiply by a translation matrix (which doesn't affect a vector)

A 3 x 3 matrix cannot translate a vector

=

1000

100

010

001

z

y

x

t

t

t

T

Explicit form (works for 2D and 3D lines) : r(t) = o + td o is a point on the line and d is its direction vector

Implicit form (2D lines only): p is on L if and only if n • p + c = 0

If p and q are both points on L then we can describe L with n • (p – q) = 0 Thus, n is perpendicular to L n = (-(py- qy),(px – qx)) = (a, b)

Once we are in 3D, we have to talk about planes as well The explicit form of a plane is similar to a line: p(u,v) = o + us + vt o is a point on the plane s and t are vectors that span the plane s x t is the normal of the plane

Adding a vector after a linear (3 x 3) transform makes an affine transform

Affine transforms can be stored in a 4 x 4 matrix using homogeneous notation

Affine transforms: Translation Rotation Scaling Reflection Shearing

Notation Name Characteristics

T(t) Translation matrix Moves a point (affine)

R Rotation matrix Rotates points (orthogonal and affine)

S(s) Scaling matrix Scales along x, y, an z axes according to s (affine)

Hij(s) Shear matrix Shears component i by factor s with respect to component j

E(h,p,r) Euler transform Orients by Euler angles head (yaw), pitch, and roll (orthogonal and affine)

Po(s) Orthographicprojection

Parallel projects onto a plane or volume (affine)

Pp(s) Perspective projection Project with perspective onto a plane or a volume

slerp(q,r,t) Slerp transform Interpolates quaternions q and r with parameter t

Move a point from one place to another by vector t = (tx, ty, tz) We can represent this with translation matrix T

==

1000

100

010

001

),,()(z

y

x

zyx t

t

t

tttTtT

Rotation, like translation, is a rigid-body transform (points don't change distance from each other and handedness doesn't change)

An orientation matrix is used to define up and forward (usually for a camera)

We often express rotation in terms of 3 separate x, y, and zrotation matrices

−

=

1000

0cossin0

0sincos0

0001

)(φφ

φφφxR

−=

1000

0cos0sin

0010

0sin0cos

)(φφ

φφ

φyR

−

=

1000

0100

00cossin

00sincos

)(φφ

φφ

φzR

Scaling is easy and can be done for all axes at the same time with matrix S

If sx = sy = sz, the scaling is called uniform or isotropic and nonuniform or anisotropic otherwise

=

1000

000

000

000

)(z

y

x

s

s

s

sS

A shearing transform distorts one dimension in terms of another with parameter s

Thus, there are six shearing matrices Hxy(s), Hxz(s), Hyx(s), Hyz(s), Hzx(s), and Hzy(s)

Here's an example of Hxz(s):

=

1000

0100

0010

001

)(

s

sxzH

A rigid-body transform preserves lengths, angles, and handedness

We can write any rigid-body transform X as a rotation matrix R multiplied by a translation matrix T(t)

==

1000

)(222120

121110

020100

z

y

x

trrr

trrr

trrr

RtTX

Because R is orthogonal, its inverse is its transpose

Because of the nature of a translation, its inverse is just its negative

Thus, the inverse of X is X-1 = (T(t)R)-1 = R-1T(t)-1 = RTT(-t)

The matrix used to transform points will not always work on surface normals

Rotation is fine Uniform scaling can stretch the normal (which should be unit) Non-uniform scaling distorts the normal

Transforming by the transpose of the adjoint always gives the correct answer

In practice, the transpose of the inverse is usually used

For normals and other things, we need to be able to compute inverses The inverse of a rigid body transform X is X-1 = (T(t)R)-1 = R-1T(t)-1 =

RTT(-t) For a concatenation of simple transforms with known parameters,

the inverse can be done by inverting the parameters and reversing the order:▪ If M = T(t)R(ϕ) then M-1 = R(-ϕ)T(-t)

For orthogonal matrices, M-1 = MT

If nothing is known, use the adjoint method

We can describe orientations from some default orientation using the Euler transform

The default is usually looking down the –zaxis with "up" as positive y

The new orientation is: E(h, p, r) = Rz(r)Rx(p)Ry(h) h is head, like shaking your head "no"▪ Also called yaw

p is pitch, like nodding your head back and forth

r is roll… the third dimension

Quaternions are a compact way to represent orientations Pros: Compact (only four values needed) Do not suffer from gimbal lock Are easy to interpolate between

Cons: Are confusing Use three imaginary numbers Have their own set of operations

If we animate by moving rigid bodies around each other, joints won't look natural

To do so, we define bones and skin and have the rigid bone changes dictate blended changes in the skin

Morphing interpolates between two complete 3D models Vertex correspondence

▪ What if there is not a 1 to 1 correspondence between vertices? Interpolation

▪ How do we combine the two models? If there's a 1 to 1 correspondence, we use parameter s∈[0,1] to indicate

where we are between the models and then find the new location mbased on the two locations p0 and p1

Morph targets is another technique that adds in weighted poses to a neutral model

10)1( ppm ss +−=

An orthographic projection maintains the property that parallel lines are still parallel after projection

The most basic orthographic projection matrix simply removes all the z values

This projection is not ideal because zvalues are lost Things behind the camera are in front z-buffer algorithms don't work

=

1000

0000

0010

0001

0P

To maintain relative depths and allow for clipping, we usually set up a canonical view volume based on (l,r,b,t,n,f)

These letters simply refer to the six bounding planes of the cube Left Right Bottom Top Near Far

Here is the (OpenGL) matrix that translates all points and scales them into the canonical view volume

−+

−−

−+

−−

−+

−−

=

1000

200

02

0

002

0

nfnf

nf

btbt

bt

lrlr

lr

P

A perspective projection does not preserve parallel lines Lines that are farther from the camera will appear smaller Thus, a view frustum must be normalized to a canonical view

volume Because points actually move (in x and y) based on their z

distance, there is a distorting term in the w row of the projection matrix

Here is the MonoGame projection matrix It is different from the OpenGL again because it only uses [0,1]

for z

−−

−−−−+

−

−+

−

=

0100

00

02

0

002

0

nffn

nffbtbt

btn

lrlr

lrn

P

Exam 1!

Exam 1 on Monday Finish Assignment 2, due tonight! Keep working on Project 2, due November 1

Week 6 - Friday

Documents

Transcript of Week 6 - Friday