Safely Optimizing Casts between Pointers and Integers

EuroLLVM’19 SRC

Juneyoung Lee, Chung-Kil Hur (Seoul National Univ.)
Zhengyang Liu, John Regehr (University of Utah)
Ralf Jung (MPI-SWS)
Nuno P. Lopes (Microsoft Research)

Slides: llvm.org/devmtg/2019-04/slides/SRC-Lee-Safely_optimizing_casts_be…

Overview

           Assembly (x86-64, ARM, ..)    LLVM IR
Pointer    [0, 2^64)                     [0, 2^64) + provenance
Integer    [0, 2^64)                     [0, 2^64) + ?

Problem: Pointer as a Pure Integer

We use C syntax for LLVM IR code for readability.

Source program:

    char p[1], q[1] = {0};
    int ip = (int)(p+1);
    int iq = (int)q;
    if (iq == ip) {
      *(p+1) = 10;
      print(q[0]);
    }

After constant prop.:

    char p[1], q[1] = {0};
    int ip = (int)(p+1);
    int iq = (int)q;
    if (iq == ip) {
      *(p+1) = 10;
      print(0);
    }

Execution of the source program when a pointer is a pure integer:

• Memory (starting at 0x0): p is at 0x100 and holds "-" (uninitialized); q is at 0x101 and holds 0
• p = 0x100, q = 0x101
• iq == ip is true
• p+1 = 0x101, so *(p+1) = 10 writes 10 to address 0x101, which is q[0]
• print(q[0]) prints 10, while the constant-propagated program prints 0

Problem with “pointer as a pure integer”: cannot protect accesses from different blocks.

* https://godbolt.org/z/9eNt6w
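
The trace on this slide can be replayed with a toy flat-memory model. The sketch below is only an illustration: the addresses 0x100 and 0x101 come from the slide, while the function name `run_source_program` and the use of a `bytearray` as memory are assumptions made here, not part of LLVM.

```python
# Toy model: memory is a flat byte array and a pointer is just an address.
# Addresses mirror the slide; the helper name is hypothetical.
def run_source_program(memory, p, q):
    """Run the slide's source program; return what print(q[0]) would show."""
    ip = p + 1                 # int ip = (int)(p+1);
    iq = q                     # int iq = (int)q;
    if iq == ip:
        memory[p + 1] = 10     # *(p+1) = 10;  lands on q[0] in this model
        return memory[q]       # print(q[0]);
    return None                # branch not taken, nothing is printed


memory = bytearray(0x200)
P_ADDR, Q_ADDR = 0x100, 0x101  # p[1] and q[1] happen to be adjacent
memory[Q_ADDR] = 0             # q[1] = {0}

printed = run_source_program(memory, P_ADDR, Q_ADDR)
print(printed)
```

In this model the store through p+1 is observable through q, so replacing print(q[0]) with print(0) would change the output.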

LLVM’s Solution: Pointers have Provenance

Source program:

    char p[1], q[1] = {0};
    int ip = (int)(p+1);
    int iq = (int)q;
    if (iq == ip) {
      *(p+1) = 10;
      print(q[0]);
    }

After constant prop.:

    char p[1], q[1] = {0};
    int ip = (int)(p+1);
    int iq = (int)q;
    if (iq == ip) {
      *(p+1) = 10;
      print(0);
    }

Execution of the source program when pointers carry provenance:

• Memory (starting at 0x0): block p is at 0x100 and holds "-"; block q is at 0x101 and holds 0
• p = (p, 0x100), q = (q, 0x101)
• iq == ip is true
• p+1 = (p, 0x101)
• *(p+1) = 10 is Undefined Behavior because p ≠ q: the pointer has provenance p, but address 0x101 belongs to block q

* https://godbolt.org/z/9eNt6w
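
A minimal sketch of the provenance check described on this slide, assuming a pointer is modeled as a (block, address) pair and every access is checked against the bounds of its own block. The class names `Ptr`, `Memory` and `UndefinedBehavior` are made up for this illustration and do not correspond to any LLVM API.

```python
class UndefinedBehavior(Exception):
    """Raised where the abstract machine would hit undefined behavior."""


class Ptr:
    """A pointer is an address plus the block (provenance) it was derived from."""

    def __init__(self, block, addr):
        self.block = block
        self.addr = addr

    def __add__(self, offset):
        # Pointer arithmetic changes the address but keeps the provenance.
        return Ptr(self.block, self.addr + offset)


class Memory:
    def __init__(self):
        self.blocks = {}   # block name -> (base address, size)
        self.cells = {}    # address -> stored value (None = uninitialized)

    def alloc(self, name, base, size, init=None):
        self.blocks[name] = (base, size)
        for addr in range(base, base + size):
            self.cells[addr] = init
        return Ptr(name, base)

    def _check(self, ptr):
        base, size = self.blocks[ptr.block]
        if not (base <= ptr.addr < base + size):
            raise UndefinedBehavior(f"{hex(ptr.addr)} is outside block {ptr.block}")

    def store(self, ptr, value):
        self._check(ptr)
        self.cells[ptr.addr] = value

    def load(self, ptr):
        self._check(ptr)
        return self.cells[ptr.addr]


def run_slide_program():
    mem = Memory()
    p = mem.alloc("p", 0x100, 1)           # char p[1]
    q = mem.alloc("q", 0x101, 1, init=0)   # char q[1] = {0}
    ip = (p + 1).addr                      # int ip = (int)(p+1);
    iq = q.addr                            # int iq = (int)q;
    if iq == ip:
        try:
            mem.store(p + 1, 10)           # *(p+1) = 10;
        except UndefinedBehavior:
            return "UB", mem.load(q)
        return "stored", mem.load(q)
    return "not taken", mem.load(q)


outcome, q0 = run_slide_program()
print(outcome, q0)
```

Because the store is undefined behavior in this model, q[0] is never modified by a defined execution, which is what lets constant propagation fold the load to 0.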

What about Integers?

           Assembly (x86-64, ARM, ..)    LLVM IR
Pointer    [0, 2^64)                     [0, 2^64) + provenance
Integer    [0, 2^64)                     [0, 2^64) + ?

Pointer ↔ Integer: Casting

Miscompilation with PtrIntCast

Original program (prints 10 when the branch is taken):

    char p[1], q[1] = {0};
    int ip = (int)(p+1);
    int iq = (int)q;
    if (iq == ip) {
      *(char*)iq = 10;
      print(q[0]);
    }

1. After int. eq. prop. (iq is replaced with the equal integer (int)(p+1)):

    if (iq == ip) {
      *(char*)(int)(p+1) = 10;
      print(q[0]);
    }

2. After cast elim.:

    if (iq == ip) {
      *(p+1) = 10;
      print(q[0]);
    }

3. After constant prop.:

    if (iq == ip) {
      *(p+1) = 10;
      print(0);
    }

The original program prints 10, but the final program prints 0.

We found this miscompilation bug in both LLVM & GCC.

Which pass is responsible for it?

Problem depends on the model

    char p[1], q[1] = {0};
    int ip = (int)(p+1);
    int iq = (int)q;
    if (iq == ip) { ... }

The body of the if statement changes at each step:

    *(char*)iq = 10;            print(q[0]);
      ↓ int. eq. prop.     (integer with provenance cannot explain)
    *(char*)(int)(p+1) = 10;    print(q[0]);
      ↓ cast elim.         (integer without provenance cannot explain)
    *(p+1) = 10;                print(q[0]);
      ↓ constant prop.
    *(p+1) = 10;                print(0);

Integer-With-Provenance Model

    char p[1], q[1] = {0};
    int ip = (int)(p+1);
    int iq = (int)q;

Before int. eq. prop.:

    if (iq == ip) {
      *(char*)iq = 10;            // iq has provenance q
      print(q[0]);
    }

After int. eq. prop.:

    if (iq == ip) {
      *(char*)(int)(p+1) = 10;    // (int)(p+1) has provenance p
      print(q[0]);
    }

The two integers compare equal but carry different provenance, so replacing one with the other changes which block the store may access.

Integer-Without-Provenance Model

    char p[1], q[1] = {0};
    int ip = (int)(p+1);
    int iq = (int)q;

Before cast elim.:

    if (iq == ip) {
      *(char*)(int)(p+1) = 10;    // provenance p removed
      print(q[0]);
    }

After cast elim.:

    if (iq == ip) {
      *(p+1) = 10;                // provenance p remains
      print(q[0]);
    }

Integer-With-Provenance is Unnatural

• Hard to explain integer equality propagation
• Hard to explain many other transformations as well

    r = (i + j) - k      →    r = i + (j - k)      prov. + prov.? prov. - prov.?
    r = (int)(float)j    →    r = j                provenance in float types?

Our Suggestion [OOPSLA’18]: Integer-Without-Provenance Model

           Assembly (x86-64, ARM, ..)    LLVM IR
Pointer    [0, 2^64)                     [0, 2^64) + provenance
Integer    [0, 2^64)                     [0, 2^64)

Integer-Without-Provenance Model

• Semantics of Casts
• Problematic Optimizations
• How to Recover Performance?

Semantics of Casts [OOPSLA’18]

1. Pointer-to-integer casts remove provenance

   How to regain protection from unknown accesses?
   By exploiting nondeterministic allocation

2. Integer-to-pointer casts gain full provenance

   How to perform in-bounds checking on full-provenance pointers?
   By recording in-bounds offsets at the pointer & checking when dereferenced
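
The two cast rules can be sketched as a small model, assuming a pointer is a (provenance, address) pair and that a full-provenance pointer may access any block containing its address. The names `FULL`, `BLOCKS`, `ptrtoint`, `inttoptr`, `store` and `run` are chosen here for illustration; the sketch deliberately leaves out nondeterministic allocation and the recorded in-bounds offsets mentioned on the slide.

```python
FULL = "full"   # marker for "full provenance" (name chosen for this sketch)


class UndefinedBehavior(Exception):
    """Raised where the abstract machine would hit undefined behavior."""


# Two one-byte blocks laid out as in the running example.
BLOCKS = {"p": (0x100, 1), "q": (0x101, 1)}


def ptrtoint(ptr):
    """Pointer-to-integer cast: keep the address, drop the provenance."""
    _prov, addr = ptr
    return addr


def inttoptr(value):
    """Integer-to-pointer cast: the result gains full provenance."""
    return (FULL, value)


def store(memory, ptr, value):
    prov, addr = ptr
    # A full-provenance pointer may touch any block; otherwise only its own.
    candidates = BLOCKS.values() if prov == FULL else [BLOCKS[prov]]
    if not any(base <= addr < base + size for base, size in candidates):
        raise UndefinedBehavior(hex(addr))
    memory[addr] = value


def run(store_target):
    """Store 10 through store_target, then report q[0] (or "UB")."""
    memory = {0x100: None, 0x101: 0}
    try:
        store(memory, store_target, 10)
    except UndefinedBehavior:
        return "UB"
    return memory[0x101]


p_plus_1 = ("p", 0x101)
q = ("q", 0x101)
iq = ptrtoint(q)
ip = ptrtoint(p_plus_1)

via_iq = run(inttoptr(iq))     # *(char*)iq = 10
via_ip = run(inttoptr(ip))     # *(char*)(int)(p+1) = 10
after_elim = run(p_plus_1)     # *(p+1) = 10, i.e. after cast elimination
print(via_iq, via_ip, after_elim)
```

In this sketch the two stores through casted integers behave identically, which is consistent with integer equality propagation, while the store through p+1 does not, which is why cast elimination cannot be applied unconditionally in this model.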

Optimizations Unsound in Our Model

1. Cast Elimination

    p2 = (char*)(int)p    →    p2 = p
    (full provenance)          (provenance p)

2. Integer Comparison to Pointer Comparison

    c = icmp eq (int)p, (int)q    →    c = icmp eq p, q
    (comparison of integers)           (comparison of pointers)

Performance Issue

• Cast elimination removes a significant portion of casts
  • 13% of ptrtoints, 40% of inttoptrs from C/C++ benchmarks *
• Disabling cast elimination hinders other optimizations
  • ptrtoint makes variables escaped
  • inttoptr is regarded as pointing to an unknown object
• Disabling cast elimination causes slowdown
  • 1% slowdown in perlbench_r, blender_r

* SPEC2017rate + LLVM test-suite, -O3

Our Solution

1. Do not generate Ptr↔Int casts in the first place
  • 86% of Ptr↔Int casts are introduced by LLVM, not by programmers
    - Ptr→Int casts are generated from pointer subtractions
    - Int→Ptr casts are from canonicalizing loads/stores as int types
  • How: by introducing new features

2. Allow the previous optimizations conditionally
  • How: by developing an analyzer to check such conditions

To reduce Ptr→Int Casts: Introduce Pointer Subtraction Operation

Before Fix (Uses ptrtoint):

    ip = ptrtoint p
    iq = ptrtoint q
    i = ip - iq

After Fix (Uses psub):

    i = psub p, q

\[
\mathtt{psub}\ p, q \;\overset{\mathrm{def}}{=}\;
\begin{cases}
p - q & \text{if } prov(p) = prov(q) \,\lor\, prov(p) = \text{full} \,\lor\, prov(q) = \text{full} \\
\text{poison} & \text{otherwise}
\end{cases}
\]
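
The case split in the psub definition can be written out directly. This is a sketch under the assumption that a pointer is a (provenance, address) pair; `FULL` and `POISON` are placeholder values invented for this example rather than LLVM constructs.

```python
FULL = "full"       # full provenance, as gained by an integer-to-pointer cast
POISON = "poison"   # stand-in for LLVM's poison value


def psub(p, q):
    """Pointer subtraction following the two cases of the definition."""
    (prov_p, addr_p), (prov_q, addr_q) = p, q
    if prov_p == prov_q or prov_p == FULL or prov_q == FULL:
        return addr_p - addr_q
    return POISON


same_block = psub(("a", 0x108), ("a", 0x100))
different_blocks = psub(("a", 0x108), ("b", 0x100))
one_full = psub((FULL, 0x108), ("b", 0x100))
print(same_block, different_blocks, one_full)
```

Only the subtraction of pointers into unrelated blocks yields poison; as soon as either operand has full provenance the plain address difference is returned.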

To reduce Int→Ptr Casts: Stop Canonicalizing Loads/Stores as Ints

Code snippets shown on the slide, in order:

    v  = load i64* p
    v2 = load i8** p

    v  = load i64* p
    v2 = inttoptr v

    v  = load i8** p
    v2 = load i8** p

* https://godbolt.org/z/y48Mkt

Use ‘d64’ (data type) instead

           Has Provenance    Supports Integer operations
d64        Yes               No
i64        No                Yes

Unlike cast between int↔ptr, d64↔ptr preserves provenance.

Conditionally Allowing Cast Elimination

    p2 = inttoptr(ptrtoint p)
    c = icmp eq/ne p2, q         →    c = icmp eq/ne p, q

    // p and q have same underlying object
    p2 = inttoptr(ptrtoint p)
    c = psub p2, q               →    c = psub p, q

• More examples & descriptions are listed at https://github.com/aqjune/eurollvm19

Evaluation: the # of Casts

                             Baseline      No Cast    Reduce Cast     Conditionally
                             (LLVM 8.0)    Fold       Introduction    Fold
Before O3   # of ptrtoints   44K           44K        14K             14K
            # of inttoptrs   1.5K          1.5K       1.5K            1.5K
After O3    # of ptrtoints   57K           66K        11K             11K
            # of inttoptrs   29K           45K        5K              4.8K

No Cast Fold: disable unsound opts.
Reduce Cast Introduction: add psub, stop load/store to int
Conditionally Fold: conditionally allow cast elim.

• C/C++ benchmarks of SPEC2017rate + LLVM Nightly Tests used

• 81% of ptrtoints / 83% of inttoptrs removed (compared to baseline)
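The headline percentages can be reproduced from the "After O3" rows by comparing the baseline column with the last column. The sketch below uses the rounded counts from the table (in thousands), so it only confirms the figures to the nearest percent; `percent_removed` and `round_nearest` are hypothetical helper names.

```c
#include <assert.h>

/* Percentage of casts removed relative to the baseline count. */
static double percent_removed(double baseline, double remaining) {
    assert(baseline > 0.0);
    return 100.0 * (baseline - remaining) / baseline;
}

/* Round a non-negative value to the nearest integer. */
static int round_nearest(double x) {
    return (int)(x + 0.5);
}
```

With 57K to 11K ptrtoints and 29K to 4.8K inttoptrs, the two ratios round to 81 and 83.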

Evaluation: Performance Impact

[Chart: SPEC2017rate speedup on i5-6600 and i7-7700; y-axis from -1.00% to 4.00%; positive number means faster]

• LLVM Nightly Tests (C/C++): ~0.1% avg. slowdown (-1% ~ 3.6%)

Conclusion

• Provenance helps compiler do more optimizations on pointers

• Integer with provenance works badly with integer optimizations

• We suggest separating pointers/integers conceptually

• We show how to regain performance after removing invalid optimizations

https://github.com/aqjune/eurollvm19

PROGRAM: Name: ptrintload3
ENTRY:
  v16 = ptrtoint i8* p1 to i16
  p2 = inttoptr i16 v16 to i8*
  v2 = load i8* p2
  v1 = load i8* p1

PRECONDS:
  Instruction "v2 = load i8* p2" has no UB.

CHECK:
  Instruction "v1 = load i8* p1" has no UB?
  v1 === v2?

Result: INCORRECT

We’re updating Alive to support pointer-integer casts!

Supplementary Slides

Constant Propagation and Readonly function

char p[1],q[1] = {0};
if (foo(p, q)) { //readonly
  *(p+i) = 10;
  print(q[0]);
}

→ constant prop. →

char p[1],q[1] = {0};
if (foo(p, q)) { //readonly
  *(p+i) = 10;
  print(0);
}

foo: return (int)(p+1) == (int)q?
i: 1?
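A sketch of why folding `print(q[0])` to `print(0)` is questionable if pointers are treated as pure integers. To keep the C code well-defined, the two one-byte arrays are modeled as adjacent bytes of a single buffer; that layout is an assumption made for illustration, since the placement of separate arrays is up to the compiler. `foo` is given the body the slide asks about, and `observed_q0` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical body for the readonly function: it only compares addresses. */
static int foo(const char *p, const char *q) {
    return (uintptr_t)(p + 1) == (uintptr_t)q;
}

/* Returns the value that print(q[0]) would observe. */
static int observed_q0(void) {
    char buf[2] = {0, 0};
    char *p = &buf[0];   /* stands in for p[1] */
    char *q = &buf[1];   /* stands in for q[1] = {0}, placed right after p */
    int i = 1;
    if (foo(p, q)) {
        *(p + i) = 10;   /* this store hits the byte q points to */
        return q[0];
    }
    return -1;           /* branch not taken */
}
```

In this model the guarded store changes `q[0]`, so replacing the read with the constant 0 would change what is printed.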

Integer Equality Propagation and Performance

Performed by many optimizations

• CVP, Instruction Simplify, GVN, Loop Exit Value Rewrite, …

Reduces code size

• -10% in minisat, -6% in smg2000, -4% in simple_types_constant_folding, …

Boosts performance in small benchmarks

• x2000 speedup in nestedloop
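For reference, a minimal sketch of what integer equality propagation does: inside a region guarded by `x == y`, uses of `y` may be rewritten to `x`. The functions `before` and `after` are hypothetical and not taken from any of the benchmarks named above.

```c
#include <assert.h>

/* Before: the guarded region uses y. */
static int before(int x, int y) {
    if (x == y) {
        return y * 2 + 1;
    }
    return 0;
}

/* After equality propagation: y is replaced by x inside the guard. */
static int after(int x, int y) {
    if (x == y) {
        return x * 2 + 1;
    }
    return 0;
}
```

For plain integers the two versions agree on every input, which is what makes the rewrite safe for `i64` values but not for values that carry provenance.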

Sound Optimizations that are already in LLVM

select (p==null), p, null  →  null    // null=(void*)0

gep(p, -(int)q)  →  (void*)((int)p-(int)q)

Rationale

It is safe to replace p with (void*)(int)p.
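The first rewrite is easy to check in C. This is only a sketch with hypothetical function names: `select_form` spells out the select, and `folded_form` is the result after folding.

```c
#include <assert.h>
#include <stddef.h>

/* select (p == null), p, null */
static void *select_form(void *p) {
    return (p == NULL) ? p : NULL;
}

/* The folded result: always null. */
static void *folded_form(void *p) {
    (void)p;   /* the argument no longer matters */
    return NULL;
}
```

The true arm is only chosen when `p` already equals null, so both functions return null for every input.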

Delayed Inbounds Checking

p = (char*)0x100          // p  = (0x100,*)
p2 = gep p, 1             // p2 = (0x101,*)
p3 = gep inbounds p, 1    // p3 = (0x101,*,{0x100,0x101})
load p3                   // 0x100, 0x101 should be
                          // in-bounds addrs of the
                          // object at 0x101
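One way to read the comments above is as a small data model: a pointer created from an integer carries its address plus a set of addresses that must turn out to be in bounds once the pointer is dereferenced. The sketch below is an assumption-laden toy, not the authors' implementation; `PhysPtr`, `from_int`, `gep`, `gep_inbounds`, `load_ok`, and the fixed capacity `MAX_CHECKS` are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CHECKS 8

/* Address plus the addresses whose in-bounds check is delayed until a load. */
typedef struct {
    uint64_t addr;
    uint64_t checks[MAX_CHECKS];
    size_t   nchecks;
} PhysPtr;

static PhysPtr from_int(uint64_t addr) {
    PhysPtr p = { addr, {0}, 0 };
    return p;
}

/* Plain gep: only the address moves. */
static PhysPtr gep(PhysPtr p, int64_t off) {
    p.addr += (uint64_t)off;
    return p;
}

/* gep inbounds: record both the base and the result for a later check. */
static PhysPtr gep_inbounds(PhysPtr p, int64_t off) {
    assert(p.nchecks + 2 <= MAX_CHECKS);
    p.checks[p.nchecks++] = p.addr;
    p.addr += (uint64_t)off;
    p.checks[p.nchecks++] = p.addr;
    return p;
}

/* A load from object [base, base+size) succeeds only if the address is inside
   it and every recorded address is in bounds (one past the end allowed). */
static bool load_ok(PhysPtr p, uint64_t base, uint64_t size) {
    if (p.addr < base || p.addr >= base + size) return false;
    for (size_t i = 0; i < p.nchecks; i++) {
        if (p.checks[i] < base || p.checks[i] > base + size) return false;
    }
    return true;
}
```

In this toy, `p3` loads successfully from a two-byte object at 0x100 but not from a one-byte object at 0x101, because 0x100 would then lie outside the object.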