The Occult / the Apple GPU
Transcript of The Occult / the Apple GPU
⌘ The Occult / the Apple GPU
Alyssa Rosenzweig
⌘ Introduction
⌘ The wizards
Dougall Johnson
Hector Martin
Sven Peter
Alyssa Rosenzweig
Designed by Asahi in Canada.
Assembled in… also Canada.
⌘ DCP
⌘ Hardware
Diabolical Clusterpuck
…Er, wait.
⌘ Hardware
Display Coprocessor
Manages the display controller
Has its own cursed coprocessor
7 megabytes of firmware
⌘ DCP
⌘ RTKit
Real Time Kit
Secret real-time operating system
Apple firmware (and AirPods)
Shared memory and mailbox
⌘ Firmware
Object-oriented C++
Remote procedure calls
Unstable ABI ⇒ maintenance nightmare
⌘ Linux
Goofy DRM/KMS driver
Atomic KMS → DCP calls
Haunted by IOSurface
⌘ Status
⌘ AGX
⌘ Hardware
Apple Graphics
Tiler
Dual-issue, scalar instruction set
Made for Metal
⌘ Mesa
Gallium3D driver
NIR compiler
⌘ Metal lacks OpenGL features
⌘ “Fun” with AGX
⌘ Divergence styles
Mali: Branches with hardware reconvergence
AMD: Compiler manages execution masks
Apple: Count control flow nesting
⌘ AGX divergence
⌘ AGX control flow
32 threads in a warp
Implicit 32-bit execution mask
Nesting counter in r0l (0 if active)
Warp-static jumps
Structured if, else, do…while
Control flow sets r0l and mask
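A minimal Python sketch of the bookkeeping on this slide, assuming the execution mask is simply derived from the per-thread counter: bit i is set when thread i's r0l is 0. The names `WARP_SIZE` and `exec_mask`, and the list-of-counters representation, are illustrative and not taken from the hardware or from Mesa.

```python
WARP_SIZE = 32  # threads per warp, as stated on the slide


def exec_mask(r0l):
    """Derive the 32-bit execution mask from per-thread nesting counters.

    Assumption: a thread is active exactly when its counter is zero.
    """
    assert len(r0l) == WARP_SIZE
    mask = 0
    for thread, counter in enumerate(r0l):
        if counter == 0:
            mask |= 1 << thread
    return mask


# All threads start active.
r0l = [0] * WARP_SIZE
all_active = exec_mask(r0l)

# Hypothetical divergence: odd-numbered threads sit one nesting level deep.
for thread in range(1, WARP_SIZE, 2):
    r0l[thread] = 1
even_only = exec_mask(r0l)
```

In this model the mask carries no information of its own: it is a cache of "counter equals zero", which is why every control flow instruction has to update both.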
⌘ NIR control flow
If-else
Infinite loop
Break, continue
⌘ Implementing if
if_icmp cond != 0 (n = 1)
...
pop_exec (n = 1)
⌘ Implementing if…else
if_icmp cond != 0 (n = 1)
...
else_icmp cond == 0 (n = 1)
...
pop_exec (n = 1)
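The if and if…else sequences can be modelled on a toy four-thread warp. This is a sketch under assumed semantics reconstructed from the slides, not a description of the real instructions: `if_cmp` pushes failing threads n levels deep, `else_` swaps the two sides of the innermost if (the condition operand shown on the slide is left out for simplicity), and `pop_exec` subtracts n with a floor of zero. All three function names are hypothetical.

```python
def if_cmp(r0l, cond, n=1):
    """Active threads that fail cond go n levels deep; inactive ones nest deeper."""
    return [(0 if c else n) if r == 0 else r + n for r, c in zip(r0l, cond)]


def else_(r0l, n=1):
    """Threads that ran the 'then' side go inactive; threads that failed the if wake up."""
    return [n if r == 0 else (0 if r == n else r) for r in r0l]


def pop_exec(r0l, n=1):
    """Leave n nesting levels; a counter never drops below zero."""
    return [max(0, r - n) for r in r0l]


# Thread 3 was already inactive (one level deep) before the if.
r0l = [0, 0, 0, 1]
cond = [True, False, True, True]

after_if = if_cmp(r0l, cond)      # threads 0 and 2 run the 'then' side
after_else = else_(after_if)      # thread 1 runs the 'else' side
after_pop = pop_exec(after_else)  # the state from before the if is restored
```

Thread 3 never becomes active inside either branch, even though its condition is true, because its counter only moves between 1 and 2.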
⌘ What about loops?
⌘ Implementing loops, take 0
No way to break!
start:
...
jmp_exec_any start
⌘ Implementing loops, take 1
start:
...
do_while true (n = 1)
jmp_exec_any start
Break:
mov r0l, #1
pop_exec (n = 0)
⌘ Implementing loops, take 2
Don’t clobber the execution mask.
push_exec (n = 1)
start:
...
do_while true (n = 1)
jmp_exec_any start
pop_exec (n = 1)
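A rough Python simulation of this sequence on a toy four-thread warp, to show why the push/pop pair preserves threads that were inactive before the loop. The instruction semantics are assumptions inferred from the slides: `push_exec` deepens only inactive threads, `do_while true` reactivates threads whose counter is below n, and break is modelled as writing 1 into r0l. The per-thread `limits` (iterations before a thread breaks) and all function names are invented for illustration.

```python
def push_exec(r0l, n):
    """Inactive threads go n levels deeper; active threads stay at zero."""
    return [r + n if r > 0 else 0 for r in r0l]


def do_while_true(r0l, n):
    """Threads shallower than n run again; threads that broke out stay parked."""
    return [0 if r < n else r for r in r0l]


def pop_exec(r0l, n):
    """Leave n nesting levels; a counter never drops below zero."""
    return [max(0, r - n) for r in r0l]


def run_loop(r0l, limits):
    counts = [0] * len(r0l)
    r0l = push_exec(r0l, 1)
    while True:                            # start:
        for t in range(len(r0l)):
            if r0l[t] == 0:                # only active threads execute the body
                counts[t] += 1
                if counts[t] >= limits[t]:
                    r0l[t] = 1             # break: mov r0l, #1; pop_exec (n = 0)
        r0l = do_while_true(r0l, 1)
        if not any(r == 0 for r in r0l):   # jmp_exec_any start falls through
            break
    return pop_exec(r0l, 1), counts


# Thread 3 was inactive before the loop and must still be inactive afterwards.
final_r0l, counts = run_loop([0, 0, 0, 1], limits=[1, 3, 2, 5])
```

Without the initial `push_exec`, thread 3 would enter the loop with a counter of 1, indistinguishable from a thread that had just broken out.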
⌘ Implementing loops, take 2
Implements loop { ... }
Can break out of multiple loops at once
What about continue?
⌘ Implementing loops, take 3
Quoth the Dougall:
“Continue is a break.”
do {
    do {
        ...
    } while (0);
} while (cond);
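The same rewrite can be checked in ordinary code. Python has no do…while, so this sketch uses a `while` loop for the outer loop and a single-iteration `for` loop in place of `do { ... } while (0)`. The two functions and the even-number filter are made up for the example.

```python
def sum_evens_continue(values):
    """Reference version: skip odd values with continue."""
    total, i = 0, 0
    while i < len(values):
        v = values[i]
        i += 1
        if v % 2:
            continue
        total += v
    return total


def sum_evens_break(values):
    """Same loop with continue rewritten as a break out of an inner one-shot loop."""
    total, i = 0, 0
    while i < len(values):        # outer loop: the real loop
        for _ in range(1):        # inner loop: body runs exactly once
            v = values[i]
            i += 1
            if v % 2:
                break             # leaves only the inner loop, i.e. "continue"
            total += v
    return total


sample = [1, 2, 3, 4, 6]
with_continue = sum_evens_continue(sample)
with_break = sum_evens_break(sample)
```

Both versions visit every element and return the same sum, so a target that only has a multi-level break can still express continue.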
⌘ Implementing loops, take 3
Model two nested loops in general.
push_exec (n = 2)
start:
...
do_while true (n = 2)
jmp_exec_any start
pop_exec (n = 2)
⌘ Implementing loops, take 3
Break:
mov r0l, #2
pop_exec (n = 0)
Continue:
mov r0l, #1
pop_exec (n = 0)
⌘ Implementing loops, take 4
Break:
mov r0l, #(nested_if_count + 2)
pop_exec (n = 0)
Continue:
mov r0l, #(nested_if_count + 1)
pop_exec (n = 0)
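To see why the nested if count has to be added, here is a toy four-thread simulation of a loop whose break and continue each sit inside one if (so nested_if_count = 1). It is a sketch under assumed instruction semantics inferred from the slides, not verified against hardware; the loop body, the `limits` input and every function name are hypothetical.

```python
def push_exec(r0l, n):
    return [r + n if r > 0 else 0 for r in r0l]


def if_cmp(r0l, cond, n=1):
    return [(0 if c else n) if r == 0 else r + n for r, c in zip(r0l, cond)]


def pop_exec(r0l, n):
    return [max(0, r - n) for r in r0l]


def do_while_true(r0l, n):
    return [0 if r < n else r for r in r0l]


def set_active(r0l, value):
    """Models 'mov r0l, #value; pop_exec (n = 0)': only active threads write."""
    return [value if r == 0 else r for r in r0l]


def run(r0l, limits):
    i = [0] * len(r0l)
    work = [0] * len(r0l)
    r0l = push_exec(r0l, 2)
    while True:                                        # start:
        i = [x + 1 if r == 0 else x for x, r in zip(i, r0l)]
        r0l = if_cmp(r0l, [x % 2 == 1 for x in i])     # if (i is odd)
        r0l = set_active(r0l, 1 + 1)                   #   continue
        r0l = pop_exec(r0l, 1)
        r0l = if_cmp(r0l, [x >= l for x, l in zip(i, limits)])  # if (i >= limit)
        r0l = set_active(r0l, 1 + 2)                   #   break
        r0l = pop_exec(r0l, 1)
        work = [w + 1 if r == 0 else w for w, r in zip(work, r0l)]
        r0l = do_while_true(r0l, 2)                    # continued threads wake up
        if not any(r == 0 for r in r0l):               # jmp_exec_any start
            break
    return pop_exec(r0l, 2), work


# Thread 3 is inactive before the loop and must stay that way afterwards.
final_r0l, work = run([0, 0, 0, 1], limits=[2, 4, 6, 4])
```

The `pop_exec` that closes each if subtracts one level, so a thread that wrote nested_if_count + 1 arrives at `do_while` with a counter of 1 (reactivated), while one that wrote nested_if_count + 2 arrives with 2 (still parked until the final pop).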
⌘ Conclusion
⌘ Status
DCP driver downstream
AGX upstream in Mesa
Passing 95% of dEQP-GLES2
AGX kernel driver pending