
  • Next-generation virtual and augmented reality exploration

    and standardization

    Prof. Gauthier Lafruit, gauthier.lafruit@ulb.ac.be

    1


  • MPEG: from Single- to Multi-View Compression

    2

    Single View:

    HEVC = High Efficiency Video Coding

    Compression by roughly two orders of magnitude, e.g. a 2-hour movie on DVD or Blu-ray at 10 Mb/s

    Stereo & Multi-View:

    MV-HEVC = Multiview High Efficiency Video Coding, ~100 Mb/s

    3D-HEVC (Feb. 2015) = High Efficiency Video Coding + Depth
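    The "two orders of magnitude" claim can be sanity-checked with back-of-the-envelope arithmetic. The source-format numbers below (1080p, 8-bit 4:2:0, 30 fps) are illustrative assumptions, not taken from the slides:

    ```python
    # Back-of-the-envelope check of the "two orders of magnitude" claim.
    # Assumed source format (not from the slides): 1080p, 8-bit 4:2:0, 30 fps.
    width, height, fps = 1920, 1080, 30
    bits_per_pixel = 12            # 8-bit 4:2:0 sampling: 8 (luma) + 2 + 2 (chroma)
    raw_bps = width * height * bits_per_pixel * fps
    coded_bps = 10e6               # 10 Mb/s, as quoted for DVD/Blu-ray-grade delivery
    ratio = raw_bps / coded_bps
    print(f"raw: {raw_bps/1e6:.0f} Mb/s, compression ratio: {ratio:.0f}x")
    # -> raw: 746 Mb/s, compression ratio: 75x
    ```

    So even at these conservative assumptions the codec removes roughly 99% of the raw bits, i.e. about two orders of magnitude.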

  • MPEG Exploration and Standardization

    3

    Standard = agreeing on the same electrical and data formats

    Standardization pipeline: CfE → EE → Draft CfP → CfP → … → FDIS → IS (typically 2-3 years)

    [Roadmap figure: FN-SMV, PCC, JPEG-PLENO (Point Clouds, Light Fields, Holography), ICME Grand Challenge, 360 VR, 360 parallax VR]

    • MPEG-Systems:
      • OMAF (Omni-Directional Application Format) = basic VR

    • FTV (Free Viewpoint TV):
      • FN (Free Navigation – DIBR) and SMV (Super-MultiView)
      • CfE (Call for Evidence) on FN and SMV finalized
      • EE (Exploration Experiments) on 360 VR

    • JAhG JPEG-PLENO + MPEG-LightFields (VR):
      • Output document N16352: similarities between different formats

    • MPEG-3DG (3D Graphics):
      • Draft CfP (Call for Proposals) on Point Clouds (PCC)

  • Free Navigation: From Multi-Viewpoint …

    4

  • Free Navigation: … to Any Viewpoint

    5

    aka Virtual Reality (VR)

  • Multi-Camera Free Navigation TV

    © NHK

  • Holoportation

    7

    Holoportation © Microsoft HoloLens

  • MPEG-VR roadmap

    8

    3-DoF: Single panoramic texture [ref8]; Left and Right panoramic textures [ref1]

    6-DoF: Light Fields [ref6]

    Short-term standardization vs. longer-term exploration

  • Single Panoramic Texture

    9

  • Panoramic Texture

    10

    Fisheye lens: no stitching errors, relatively low resolution [ref5]

    Multi-camera stitching: high resolution

    3-DoF

  • Warp the images to stitch

    11

    [ref4]

  • Stitching parallax errors

    12

    Proponent’s result

    http://web.cecs.pdx.edu/~fliu/papers/cvpr2014-stitching.pdf

    [ref2]


  • Depth-based Stitching corrections

    13

    Google Jump Assembler [ref7]

    [ref3]

  • Stereoscopic Panoramic Textures

    14

  • Stereoscopic Panoramic Textures

    15

    3-DoF

    Left and Right panoramic textures

    Why not use light fields stored in a single texture?

    [ref1]

  • Omni-Directional Texture

    16

    [ref1]

  • Collect the right light ray for the given view

    17

    ODS = Omni-Directional Stereo

    View interpolation (on the circle, w/o occlusions) corresponds to selecting the corresponding light rays of the light field

    [ref6]

  • Relation with Light Field Cameras

    18 Focus a posteriori in SW

    © Lytro

    Each Elemental Image contains directional light information

    [Figure: Light Field representation. Light rays emanating from the objects cross plane F at (s, t) and plane F' at (u, v); Q and Q' are planes on which to render.]
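    The two-plane (s, t, u, v) parameterization behind this figure can be sketched in a few lines: a ray is indexed by where it crosses two parallel planes. The plane placement (z = 0 for F, z = d for F') and the separation d are illustrative assumptions, not values from the slides:

    ```python
    import numpy as np

    # Minimal sketch of the two-plane (s, t, u, v) light-field parameterization:
    # a light ray is indexed by its intersection (s, t) with plane F (z = 0)
    # and its intersection (u, v) with plane F' (z = d).
    # Plane separation d = 1.0 is an illustrative choice.
    def ray_to_stuv(origin, direction, d=1.0):
        origin = np.asarray(origin, float)
        direction = np.asarray(direction, float)
        # Intersection with plane F at z = 0 gives (s, t)
        t0 = (0.0 - origin[2]) / direction[2]
        s, t = (origin + t0 * direction)[:2]
        # Intersection with plane F' at z = d gives (u, v)
        t1 = (d - origin[2]) / direction[2]
        u, v = (origin + t1 * direction)[:2]
        return s, t, u, v

    # A ray travelling along +z through (0.5, 0.2) crosses both planes
    # at the same (x, y) coordinates:
    print([float(x) for x in ray_to_stuv([0.5, 0.2, -1.0], [0.0, 0.0, 1.0])])
    # -> [0.5, 0.2, 0.5, 0.2]
    ```

    A light-field camera or multi-camera rig effectively samples this 4-D function on a discrete grid of such rays.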

  • Multi-Cameras: Discrete or In-a-Box

    19

    RayTrix

    Discrete cameras

    Microlens array

    Dynamically Reparameterized Light Fields

    https://www.youtube.com/watch?v=p2w1DNkITI8

  • Lumigraph: Depth needed in Sparse Light Field

    (A) From all light rays that come from the object, select the ones (in red) corresponding to the camera view

    (B) Discretization of the light rays (over two parallel planes) requires (depth-based) interpolation: M' is closer to R than to L, hence cannot be the midpoint M

    M = 0.5 L + 0.5 R

    M' = 0.2 L + 0.8 R

    [Figure: two panels (A) and (B) showing the object, cameras L and R, the sampled light rays, and the depth-based correction of the interpolated ray]
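    The blend on this slide (M = 0.5 L + 0.5 R, M' = 0.2 L + 0.8 R) can be sketched as a position-dependent weighted mix of the two nearest views. The `disparity_px` warp below stands in for the depth-based correction and is only a toy horizontal shift, not the actual Lumigraph/DIBR warping:

    ```python
    import numpy as np

    # Sketch of the slide's intermediate-view blend: a virtual view between
    # cameras L and R is the weighted mix M' = (1 - a) * L + a * R, where the
    # weight a follows the virtual camera position. Per the Lumigraph argument,
    # each source view must first be shifted by a depth-derived disparity;
    # here that correction is a toy integer-pixel horizontal roll.
    def interpolate_view(left, right, a, disparity_px=0):
        warped_l = np.roll(left,  int(round(a * disparity_px)),        axis=1)
        warped_r = np.roll(right, -int(round((1 - a) * disparity_px)), axis=1)
        return (1.0 - a) * warped_l + a * warped_r

    L = np.full((4, 4), 100.0)               # flat dummy "left" view
    R = np.full((4, 4), 200.0)               # flat dummy "right" view
    mid    = interpolate_view(L, R, 0.5)     # M  = 0.5 L + 0.5 R
    near_r = interpolate_view(L, R, 0.8)     # M' = 0.2 L + 0.8 R
    print(float(mid[0, 0]), float(near_r[0, 0]))  # close to 150.0 and 180.0
    ```

    Without the depth-based warp, blending alone produces the ghosting the slide warns about whenever the cameras are sparse.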

  • 6-DoF Free Navigation VR

    21

  • 3D model VR

    23

    6-DoF

  • 3D point cloud VR

    24

    6-DoF

    30 LIDAR positions © LISA-ULB

    Laser Time-of-Flight

  • Image-Based VR

    25

    6-DoF

    http://www.tobiasgurdan.de/research/


  • Point Clouds and Image-Based Light Fields are theoretically equivalent

    [Figure: a scene point (x, y, z) with a BRDF / omni-directional colors, described either by many camera views, i.e. light rays parameterized over the (s, t) and (u, v) planes, or by Laser Time-of-Flight Point Cloud acquisition]

    http://mpeg.chiariglione.org/sites/default/files/files/standards/parts/docs/w16352_2016-06-03_Report_JAhG_light-sound_fields.pdf

  • Simulate Point Cloud light transport vs. use directly the Light Field emanating from the points

    Category A: Scene geometry is given and light transport is simulated for rendering

    Category B: Scene geometry is estimated from the captured light

    The renderer colors pixels (px, py) with texture information

    [Figure: scene point (x, y, z) with BRDF / omni-directional colors, light-ray coordinates (s, t) and (u, v), Camera 2 extrinsics (θ, φ)]
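    The renderer step described here, mapping a scene point (x, y, z) to a pixel (px, py) via a camera's extrinsics, is the standard pinhole projection. All numeric values below (the intrinsics K, the identity pose) are illustrative assumptions:

    ```python
    import numpy as np

    # Sketch of how a renderer maps a scene point (x, y, z) to a pixel (px, py):
    # extrinsics (R, t) move the point into the camera frame, intrinsics K
    # apply the perspective projection. All numbers are illustrative.
    K = np.array([[500.0,   0.0, 320.0],
                  [  0.0, 500.0, 240.0],
                  [  0.0,   0.0,   1.0]])   # focal 500 px, principal point (320, 240)
    R = np.eye(3)                            # camera aligned with world axes
    t = np.array([0.0, 0.0, 0.0])            # camera at the world origin

    def project(point_xyz):
        cam = R @ np.asarray(point_xyz, float) + t   # world -> camera frame
        px, py, w = K @ cam                          # homogeneous projection
        return float(px / w), float(py / w)          # perspective divide

    print(project([0.2, -0.1, 2.0]))
    ```

    Category A runs this forward from known geometry; Category B inverts it, estimating (x, y, z) from many such (px, py) observations.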

  • Which format? Multi-View, Point Clouds, …?

    28

    Multi-Cam Input

    Point cloud

    3D Mesh

    3D Object

    © Microsoft

    https://www.youtube.com/watch?v=kZ-XZIV-o8s


  • 3D Graphics Artefacts

    29

    © 8i.com © Microsoft

  • Image-Based vs. 3D Graphics

    34

  • Image-Based vs. 3D graphics Point Cloud

    35

    http://nozon.com/presenz © Nozon

    https://vimeo.com/49921117 © TimeSlice

    Start from 3D gfx and create images vs. start from images and create 3D gfx objects

  • Image-Based vs. 3D graphics Point Cloud

    36

    http://replay-technologies.com/ © Replay Technologies

    https://www.youtube.com/watch?v=sw_LI8J-AlU © Holografika

    Start from images to create point clouds vs. start from a couple of image views and interpolate

  • Light Field displays = VR without goggles

    37

    © Holografika

    © NICT

    © NICT

  • 3D Light Field Interpolation for all-around viewing

    38

    [Figure: Input views and Interpolated views]

    https://vimeo.com/128641902 © ACM Siggraph

  • Challenges

    • Image-Based vs. 3D gfx (Point Clouds, Meshes, etc.)

    • Acquisition/pre-processing cost

    • Rendering cost

    • Transmission cost

    • Performance metric = Quality vs. Bitrate

    • Quality = low latency, no visual artifacts, …

    • Quality ≠ PSNR

    39

    [Plot: MOS vs. bitrate (kbps) for the BBB flowers sequence, Anchor vs. new NICT proposal: 20.5% average BD-rate bitrate reduction]
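    The BD-rate figures quoted on these slides come from the Bjøntegaard-delta procedure: fit log-bitrate as a cubic in quality for anchor and test, integrate both fits over the shared quality range, and convert the mean log-rate gap to a percentage. A simplified sketch of that recipe (the official implementations differ in detail):

    ```python
    import numpy as np

    # Hedged sketch of the Bjøntegaard-delta bitrate (BD-rate) metric:
    # fit log10(bitrate) as a cubic polynomial in quality for each codec,
    # integrate both fits over the overlapping quality interval, and turn
    # the mean log-rate difference into a percentage bitrate change.
    def bd_rate(rates_anchor, qual_anchor, rates_test, qual_test):
        pa = np.polyfit(qual_anchor, np.log10(rates_anchor), 3)
        pt = np.polyfit(qual_test,   np.log10(rates_test),   3)
        lo = max(min(qual_anchor), min(qual_test))   # shared quality range
        hi = min(max(qual_anchor), max(qual_test))
        ia = np.polyval(np.polyint(pa), hi) - np.polyval(np.polyint(pa), lo)
        it = np.polyval(np.polyint(pt), hi) - np.polyval(np.polyint(pt), lo)
        avg_diff = (it - ia) / (hi - lo)             # mean gap in log10(rate)
        return (10 ** avg_diff - 1) * 100.0          # negative = fewer bits needed

    # Synthetic check: a codec reaching the same quality at 80% of the
    # anchor bitrate should come out at -20% BD-rate.
    anchor_r = np.array([1000.0, 2000.0, 4000.0, 8000.0])   # kbps
    anchor_q = np.array([3.0, 5.0, 7.0, 9.0])               # e.g. MOS
    print(round(bd_rate(anchor_r, anchor_q, anchor_r * 0.8, anchor_q), 1))
    # -> -20.0
    ```

    The deck's "20.5% average BD-rate bitrate reduction" is exactly this kind of number, computed with MOS as the quality axis.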

  • MPEG-Light Fields & JPEG-PLENO

    • Light Fields = multi-camera acquisition (discrete and/or in-a-box)

    • Point Clouds

    • JPEG and MPEG will soon have Calls for Proposals (CfP)

    40

    Lytro Immerge

    RayTrix

  • Existing MPEG technology for 6-DoF DIBR/LightField-VR

    MPEG-FTV (=Free Viewpoint TV)

    SMV = Super-MultiView = Light Field displays

    FN = Free Navigation

    41

  • Bitrate reductions for the same quality

    42

    Bjøntegaard delta versus anchors

    PSNR metric (decoded-view PSNR / decoded video bitrate):

    Sequence                 UHasselt*  Poznan    Zhejiang
    Big Buck Bunny flowers   -          -5.21%    -4.62%
    Poznan Blocks            -          -16.25%   -9.77%
    Soccer Arc               -          -1.14%    -11.93%
    Soccer Linear            -31.6%     +2.20%    -0.18%
    Average (nonlinear)      -          -7.53%    -8.77%
    Average (all)            -          -5.10%    -6.62%

    MOS metric (MOS / decoded video bitrate):

    Sequence                 UHasselt   Poznan    Zhejiang
    Big Buck Bunny flowers   -          -43.3%    -19.2%
    Poznan Blocks            -          -36.9%    -4.9%
    Soccer Arc               -          -70.1%    -69.6%
    Soccer Linear 2          -46.51%    -28.5%    +27.0%
    Average (nonlinear)      -          -50.1%    -31.3%
    Average (all)            -          -44.7%    -16.7%

    Vittorio Baroncini, Masayuki Tanimoto, Olgierd Stankiewicz, “Summary of the results of the Call for Evidence on Free-Viewpoint Television: Super-Multiview and Free