how to collect information regarding function calls in run time?

Fri May 17 12:19:16 EDT 2019

On Fri, May 17, 2019 at 11:09 AM Valdis Klētnieks
<valdis.kletnieks at vt.edu> wrote:
>
> On Tue, 14 May 2019 16:11:51 -0300, Pedro Terra Delboni said:
>
> > I agree that the question alone seems like a weird one; I just assumed
> > when I wrote my first email that explaining the motivation would
> > only consume the reader's time.
>
> Asking "what problem are you trying to solve" is a standard question, because
> whenever a programmer is saying "I can't get X to do Y", a good 85% of the time
> it turns out that X isn't working because using W to do Z is the
> already-existing API for what they actually wanted to do....
>
> > The subject I'm working on is Control-Flow Integrity, which instruments
> > the code so that each indirect jump (usually a return or an
> > indirect call) verifies that the address it targets is a valid
> > one (so there is a code stub that runs on every function call and
> > return).
>
> > The reason I want to count call instructions execution is because the
> > function return tied to the most executed call instruction will be the
> > one that will cause the greatest increase in execution time, so by
> > inlining that call we'll be exchanging this cost for the cache impact
> > of the code expansion (as the code stub won't exist anymore for this
> > call).
>
> I suspect that the vast majority of functions that are *that* heavily used are
> either (a) already inlined or (b) too large to inline - for instance, kmalloc
> is used heavily, but having separate inlined copies everyplace to avoid the
> return statement is going to bloat the code - and even worse, make almost all
> the inline copies cache-cold instead of one shared cache-hot chunk of 2K.

It will bloat the code, and the copies will hurt the cache. However,
the current implementation is so time-consuming that we believe
accepting the cache impact is a plausible trade-off.
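
The selection step described in the quoted text (count how often each
call instruction executes, then treat the most executed one as the
inlining candidate) can be sketched in plain C as below. This is only
an illustration under assumptions: count_call(), hottest_site() and
MAX_SITES are made-up names, the call-site addresses are passed in by
hand, and a linear table stands in for whatever instrumentation would
actually gather the counts.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SITES 1024 /* arbitrary cap for this sketch */

/* One counter per call instruction, keyed by its (hypothetical) address. */
struct site_count {
    uintptr_t site;
    uint64_t hits;
};

static struct site_count counts[MAX_SITES];
static size_t nsites;

/* Record one execution of the call at `site`; false if the table is full. */
bool count_call(uintptr_t site)
{
    for (size_t i = 0; i < nsites; i++) {
        if (counts[i].site == site) {
            counts[i].hits++;
            return true;
        }
    }
    if (nsites == MAX_SITES)
        return false;
    counts[nsites].site = site;
    counts[nsites].hits = 1;
    nsites++;
    return true;
}

/* Address of the most executed call site, or 0 if nothing was recorded. */
uintptr_t hottest_site(void)
{
    uintptr_t best = 0;
    uint64_t best_hits = 0;

    for (size_t i = 0; i < nsites; i++) {
        if (counts[i].hits > best_hits) {
            best_hits = counts[i].hits;
            best = counts[i].site;
        }
    }
    return best;
}
```

The sketch only shows how a candidate would be picked once the counts
exist; it says nothing about how cheaply they can be collected.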

>
> And the question we *should* be asking is *not* "is the return address a plausible
> one".  It's "is the return address *the one we were called from*".  Checking
> whether kmalloc is about to return to a valid call point doesn't tell you much.
> Finding out that kmalloc is about to return to one of the 193,358 *other* call
> points rather than the one it was actually called from is something big.
>

We relaxed the question from "which address were we called from" to
"which address is plausible" because verifying the exact address we
were called from is a lot harder than verifying that we are returning
to some kmalloc call (whichever it may be).
The implementation that only verifies a plausible address already
added enough latency for us to believe that inlining is a viable option.
The more secure implementation, which verifies that a call returns
to its exact call point, is the one we are trying to make viable.
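
To make the difference between the two checks concrete, here is a
rough sketch in C. It is not the implementation discussed in this
thread: the exact check is modelled with a shadow stack, which is one
common way to do it and an assumption on this sketch's part, and all
names (ret_is_plausible, shadow_push, shadow_pop_matches,
SHADOW_DEPTH) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SHADOW_DEPTH 64 /* arbitrary depth for this sketch */

static uintptr_t shadow[SHADOW_DEPTH];
static size_t shadow_top;

/* Relaxed check: is `ret` any one of the known return sites of the callee? */
bool ret_is_plausible(const uintptr_t *sites, size_t n, uintptr_t ret)
{
    for (size_t i = 0; i < n; i++) {
        if (sites[i] == ret)
            return true;
    }
    return false;
}

/* Exact check, call side: remember where this call must return to. */
void shadow_push(uintptr_t ret)
{
    assert(shadow_top < SHADOW_DEPTH);
    shadow[shadow_top++] = ret;
}

/* Exact check, return side: does `ret` match the remembered address? */
bool shadow_pop_matches(uintptr_t ret)
{
    assert(shadow_top > 0);
    return shadow[--shadow_top] == ret;
}
```

With return sites {0x1000, 0x2000, 0x3000}, a call made from the site
that returns to 0x1000 but hijacked to return to 0x2000 passes
ret_is_plausible() and fails shadow_pop_matches(), which is the gap
Valdis describes above.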

>
>