[PATCH 2/3] binder: do not initialize locals passed to copy_from_user()

Jann Horn jannh at google.com
Tue Feb 25 08:14:08 UTC 2020


On Tue, Feb 25, 2020 at 5:18 AM Kees Cook <keescook at chromium.org> wrote:
> On Mon, Feb 24, 2020 at 04:35:00PM +0100, glider at google.com wrote:
> > Certain copy_from_user() invocations in binder.c are known to
> > unconditionally initialize locals before their first use, like e.g. in
> > the following case:
> >
> >       struct binder_transaction_data tr;
> >       if (copy_from_user(&tr, ptr, sizeof(tr)))
> >               return -EFAULT;
> >
> > In such cases enabling CONFIG_INIT_STACK_ALL leads to insertion of
> > redundant locals initialization that the compiler fails to remove.
> > To work around this problem till Clang can deal with it, we apply
> > __do_not_initialize to local Binder structures.
>
> It should be possible to write a Coccinelle script to identify all these
> cases. (i.e. a single path from struct declaration to copy_from_user())
> and apply the changes automatically. This script could be checked into
> scripts/coccinelle/ to help keep these markings in sync...

I wonder whether it would instead be reasonable to define a helper
macro for "copy object from userspace to the kernel", and then use
this macro. Something like this:

#define copy_object_from_user(objp, src) ({              \
  __attribute__((uninitialized)) typeof(*(objp)) __buf; \
  copy_from_user(&__buf, (src), sizeof(*(objp)));        \
  *(objp) = __buf;                                       \
})

void blah(unsigned long user_addr) {
  struct foo stackobj;
  copy_object_from_user(&stackobj, user_addr);
  do_stuff(&stackobj);
}

Unfortunately, clang runs a MemCpy optimization pass before the Dead
Store Elimination pass, which makes the copy_from_user() go directly
to `stackobj` instead of __buf before DSE has had a chance to get rid
of the first memcpy(). Grrr...

But looking at the list of passes that LLVM runs, we can see that
between the MemCpy optimization and the DSE pass, we have Bit-Tracking
Dead Store Elimination... okay, fine, so we hack together some code
such that it contains a fake branch that is only resolved between
MemCpy and DSE:

unsigned long blub, blab;
unsigned long get_blub(void) { return blub & 5; }
#define copy_object_from_user(objp, src) ({              \
  __attribute__((uninitialized)) typeof(*(objp)) __buf; \
  *(char*)&__buf = 0;                                   \
  copy_from_user(&__buf, (src), sizeof(*(objp)));        \
  int x = get_blub(); int y = blab & 10;\
  if ((x & y) == 0) { \
    *(objp) = __buf;                                       \
  } \
})

But it still doesn't work, even though the IR looks like DSE ought to work:

*** IR Dump After Value Propagation ***
; Function Attrs: nounwind uwtable
define dso_local void @blah(i64) local_unnamed_addr #1 {
  %2 = alloca %struct.foo, align 4
  %3 = alloca %struct.foo, align 4
  %4 = bitcast %struct.foo* %2 to i8*
  call void @llvm.lifetime.start.p0i8(i64 1008, i8* nonnull %4) #4
  call void @llvm.memset.p0i8.i64(i8* nonnull align 4 %4, i8 -86, i64
1008, i1 false)
  %5 = bitcast %struct.foo* %3 to i8*
  call void @llvm.lifetime.start.p0i8(i64 1008, i8* nonnull %5) #4
  store i8 0, i8* %5, align 4, !tbaa !6
  call void @copy_from_user(i8* nonnull %5, i64 %0, i64 1008) #4
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* nonnull align 4 %4, i8*
nonnull align 4 %5, i64 1008, i1 false), !tbaa.struct !7
  call void @llvm.lifetime.end.p0i8(i64 1008, i8* nonnull %5) #4
  call void @do_stuff(%struct.foo* nonnull %2) #4
  call void @llvm.lifetime.end.p0i8(i64 1008, i8* nonnull %4) #4
  ret void
}
*** IR Dump After Dead Store Elimination ***
; Function Attrs: nounwind uwtable
define dso_local void @blah(i64) local_unnamed_addr #1 {
  %2 = alloca %struct.foo, align 4
  %3 = alloca %struct.foo, align 4
  %4 = bitcast %struct.foo* %2 to i8*
  call void @llvm.lifetime.start.p0i8(i64 1008, i8* nonnull %4) #4
  call void @llvm.memset.p0i8.i64(i8* nonnull align 4 %4, i8 -86, i64
1008, i1 false)
  %5 = bitcast %struct.foo* %3 to i8*
  call void @llvm.lifetime.start.p0i8(i64 1008, i8* nonnull %5) #4
  store i8 0, i8* %5, align 4, !tbaa !6
  call void @copy_from_user(i8* nonnull %5, i64 %0, i64 1008) #4
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* nonnull align 4 %4, i8*
nonnull align 4 %5, i64 1008, i1 false), !tbaa.struct !7
  call void @llvm.lifetime.end.p0i8(i64 1008, i8* nonnull %5) #4
  call void @do_stuff(%struct.foo* nonnull %2) #4
  call void @llvm.lifetime.end.p0i8(i64 1008, i8* nonnull %4) #4
  ret void
}

I guess maybe clang can't do DSE past a function call or something?

We also can't trick the DSE pass using an empty "asm volatile" with an
output-only memory operand, because the DSE pass can't optimize inline
asm.

Is there really no way to somehow hack this together inside a macro? :(


More information about the devel mailing list