2016年10月20日 星期四

CppCon 2016: Michael Spencer “My Little Optimizer: Undefined Behavior is Magic" Note



compiler assume UB can't happen, and optimizing accordingly

//kernel bug
void foo(bool *ok){
 bool k = *ok;   <==dereference NULL pointer is UB, so ok will not be NULL
 if (!ok)
    return;
 blah(k)
}

into

void foo(bool *ok){
 bool k = *ok; 
 blah(k)
}




how is UB represented?

Explicitly:
 unreachable
 undef

implicitly:
 the optimizer knows some things just can't happen



int unreachable(int *out) {
  *out = 42;
  return *((int*)0);
}

==> load from null is UB, so replace it with unreachable


undef: a value which can have any bit pattern at any point in the program

int undef(int *p){
  int a;
  return *p + a;
}



signed math can't overflow or underflow

bool signed_underflow(int a, int b){
  return a - b > -1;
}

since a-b can't underflow, a-b will be greater than -1 if a >= b
bool signed_underflow(int a, int b){
  return a>= b;
}



Type aliasing: pointers to objects of different types can't alias (most of the time).
void check(int *h, long *k){
 *h = 5;
 *k =6;
  printf("%d\n", *h);
}
int main(void)
 long k;
 check((int*)&k, &k);
}

so optimizer will generate

void @check(i32* h, i32* k){
  store i32 5, i32* %h, !tbaa{!"int"}
  store i32 6, i32* %k, !tbaa{!"long"}
  call @printf("%d\"n", i32 5)
  ret void
}

c++ define infinite loop is UB, loop will terminate
N1528: Why undefined behavior for infinite loops?
為了優化

for (p = q; p != 0; p = p -> next) {
    ++count;
}
for (p = q; p != 0; p = p -> next) {
    ++count2;
}
to

for (p = q; p != 0; p = p -> next) {
    ++count;
    ++count2;
}

Infinite loop without side-effects
#include <iostream>
 
int fermat() {
  const int MAX = 1000;
  int a=1,b=1,c=1;
  // Endless loop with no side effects is UB
  while (1) {
    if (((a*a*a) == ((b*b*b)+(c*c*c)))) return 1;
    a++;
    if (a>MAX) { a=1; b++; }
    if (b>MAX) { b=1; c++; }
    if (c>MAX) { c=1;}
  }
  return 0;}
 
int main() {
  if (fermat())
    std::cout << "Fermat's Last Theorem has been disproved.\n";
  else
    std::cout << "Fermat's Last Theorem has not been disproved.\n";}
Possible output:
Fermat's Last Theorem has been disproved.



 Use of an object, or a pointer to an object, outside of its lifetime frequently results in undefined behavior.


optimizer will just replace original value to undef



references to null value
ex. 

bool foo(int &a){
 return &a == nullptr;
}
int main(){
  int *blah= nullptr;
  return foo(*blash);
}

optimizer generate

bool foo(int &a){
 return flase;
}
int main(){
  int *blah= nullptr;
  return foo(*blash);
}

then generate

bool foo(int &a){
 return flase;
}
int main(){
  return 0;
}



"this" will not be null because " calling any nonstatic method of any class through a null-pointer leads to undefined behavior"

Still Comparing "this" Pointer to Null?

ex. v8 bug
struct FieldType{
  FieldType *none(){
     return (FieldType*) (0);
  }
  bool IsNone() {
     return this == none();
  }
};

bool foo (FieldType *F){
   return F->isNone();
}

finally, optimizer inline two times and generate

bool foo (FieldType *F){
   return false;
}



signed {under,over}flow

因為假設overflow不會發生, 所以compiler才可以做optimization

1.
%1 = mul nsw i32 %x, 20
%A = sdiv i32 %1, 2  
==>
%A = mul nsw i32 %x, 10

2.
%2 = mul nsw i32 %x, 21
%B = icmp sgt i32 %2, 0
==>
%B = icmp sgt i32 %x, 0

%4 = sub new i32 0, %x
%5 = sub nsw i32 0, %y
%C = icmp sgt i32 %4, %5
==>
%c = icmp sgt i32 %y, %x


signed {under,over}flow for promotion

int f (int *a, int x, int y){
 return a[x+y];
}

signed 需要多一個指令作promotion, 影片裡面表示unsigned會overflow所以必須要多一個指令拿lower part
但是我自己用gcc 6.0/clang 3.7 結果都是和他說的相反

signed version
addl  %edx, %esi
movslq  %esi, %rax
movl  (%rdi,%rax,4), %eax

unsigned version
addl  %edx, %esi
movl  (%rdi,%rsi,4), %eax



loop induction is signed的話 optimizer可以做佳化
因為保證n+1 > n

template <class T>
T f(T N) {
  T i = 0;
  for(i<N;i+=2);
  return i;
}


signed 可以幫optimizer作induction variable的推測 (remove loop)
unsigned不行


int _signed(int N){
  int i =0;
  for (;i<N;i+=2);
  return i;
}

==>for loop可移掉

unsigned int _unsigned(unsigned int N){
  unsigned int i =0;
  for (;i<N;i+=2);
  return i;
}

==> for loop不能移掉



realloc sample

To see the undefined behavior, you need to read the fine print in the C99 standard, which says:
The realloc function deallocates the old object pointed to by ptr and returns a pointer to a new object that has the size specified by size.
#include <stdio.h>
#include <stdlib.h>
 
int main() {
  int *p = (int*)malloc(sizeof(int));
  int *q = (int*)realloc(p, sizeof(int));
  *p = 1;
  *q = 2;
  if (p == q)
    printf("%d %d\n", *p, *q);
}

O0, print 2, 2
O2, print 1, 2



Atomics optimization

int x;
atomic<int> y;
void foo(){
x= 0;
y.store(0, seq_cst);
if (y.load(seq_cst) == 0)
x= 1;
}

nothing can be observe the difference ==>
int x;
atomic<int> y;
void foo(){
  x= 0;
  y.store(0, seq_cst);
  if (0 == 0)
  x= 1;
}
==> simplify CFG
int x;
atomic<int> y;
void foo(){
  x= 0;
  y.store(0, seq_cst);
  x= 1;
}

==>

int x;
atomic<int> y;
void foo(){
  y.store(0, seq_cst);
  x= 1;
}

DATA RACE detail, see
compilers can optimize non-atomic memory accesses before and after atomic accesses. 





what can you do about it?
  1. -Wall -Wextra -Werror
  2. sanitize  (undefined, address, thread, memory)
  3. static analysis (clang, coverity, pvs-stdio